HITS-AIN / PINK

Parallelized rotation and flipping INvariant Kohonen maps
GNU General Public License v3.0

Mapping produces weird summed euclidean distances #32

Closed RafaelMostert closed 5 years ago

RafaelMostert commented 5 years ago

It has become harder for me to get the SOMs to converge during training, especially since my diagnostics don't work: the mapping procedure produces unexpectedly high numbers. Did something change in the calculation of the summed Euclidean distance between image and neuron (apart from pull request 15)? To see where I go wrong in the mapping process, I made a small test case below:

# Create sample SOM
# Map a list containing a single test image to a sample SOM

import numpy as np
import struct
def create_test_som(som_path, som_data):

    # Show the SOM neuron data
    print('som_data shape:', np.shape(som_data))
    print(som_data)
    # Write SOM binary file
    assert som_path.endswith('.bin')
    with open(som_path,'wb') as file:
        # Header: <file format version> <file type = 1 (SOM)> <data type>
        # <som layout> <neuron layout>, followed by the <data>
        file.write(struct.pack('i' * 11, 2, #<file format version>
                               1, #<file type>
                               0, #<data type>
                               0, #<som layout>
                               2, 2, 2, #<som dimensionality, width and height>
                               0, #<neuron layout>
                               2, np.shape(som_data)[2], np.shape(som_data)[3])) #<neuron dimensionality, width and height>

        file.write(struct.pack('f' * som_data.size, *som_data.flatten()))

def create_test_data(data_path, image_data):

    # Show the single test image
    print('image_data shape:', np.shape(image_data))
    print(image_data)

    # Write data binary file
    assert data_path.endswith('.bin')
    with open(data_path, 'wb') as file:
        # <file format version> <file type = 0 (data)> <data type> <number of entries>
        # <data layout> <data dimensionality, width and height>, followed by the <data>
        file.write(struct.pack('i' * 8, 2, 0, 0, 1, 0, 2,
                               np.shape(image_data)[1], np.shape(image_data)[2]))

        file.write(struct.pack('f' * image_data.size, *image_data.flatten()))

def map_test_data_to_test_som(gpu_id, som_width, som_height, data_path, som_path, map_path):
    print(f"CUDA_VISIBLE_DEVICES={gpu_id} Pink  -p 1 --som-width {som_width} --som-height {som_height}"
          f" --map {data_path} {map_path} {som_path}")

def load_som_mapping(map_path, verbose=True):
    # Create list of indexes to retrieve coordinates of the cut-outs
    cut_out_index = []
    with open(map_path, 'rb') as file:

        # <file format version> 2 <data-type> <number of entries> <som layout> <data>
        version, file_type, data_type, numberOfImages, som_layout, som_dimensionality = struct.unpack(
            'i' * 6, file.read(4 * 6)) 
        som_dimensions = struct.unpack('i' * som_dimensionality, file.read(4 *
             som_dimensionality))
        if verbose:

            print('version:', version)
            print('file_type:', file_type)
            print('data_type:', data_type)
            print('numberOfImages:', numberOfImages)
            print('som_layout:', som_layout)
            print('som dimensionality:', som_dimensionality)
            print('som dimensions:', som_dimensions)

        som_width = som_dimensions[0]
        som_height = som_dimensions[1] if som_dimensionality > 1 else 1
        som_depth = som_dimensions[2] if som_dimensionality > 2 else 1

        maps = []
        assert file_type == 2 # 2 equals mapping

        map_size = som_width * som_height * som_depth

        data_map = np.ones((numberOfImages, map_size))
        for i in range(numberOfImages):
            for t in range(map_size):
                try:
                    data_map[i, t] = struct.unpack_from("f", file.read(4))[0]
                    if t == 0:
                        cut_out_index.append(i)  # add index of successfully read entry
                except struct.error:
                    # Reached the end of the file: leave the remaining entries untouched
                    pass
        data_map = data_map[:len(cut_out_index)]
    print('\nMap result:')
    print(data_map, numberOfImages, som_width, som_height, som_depth,'\n')
    return data_map, numberOfImages, som_width, som_height, som_depth
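
# (Aside, not part of the reproduction: once the map is loaded, the best-matching
#  neuron per image is just the argmin over the flattened SOM positions.
#  A small hypothetical convenience helper; the axis order is an assumption.)
def best_matching_neuron(data_map, som_width, som_height, som_depth):
    # Index of the neuron with the smallest summed (squared) Euclidean distance,
    # unravelled into SOM coordinates, one row per image.
    flat_idx = np.argmin(data_map, axis=1)
    return np.column_stack(np.unravel_index(flat_idx, (som_width, som_height, som_depth)))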

gpu_id = 1 
som_width=2
som_height=2
image_width=3
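# Neuron width used by PINK: large enough to contain any rotation of the image
# (image_width * sqrt(2), rounded up; gives 5 for a 3x3 image)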
neuron_width = int(np.ceil(image_width/np.sqrt(2)*2))
som_path = '/data/single_neuron_som.bin'
data_path = '/data/single_image.bin'
map_path = '/data/map_single_image_to_single_neuron.bin'

create_test_som(som_path, 
                np.array([[np.ones([neuron_width,neuron_width]) 
                           for j in range(som_width)] 
                          for i in range(som_height)]))
create_test_data(data_path, np.array([np.zeros([image_width,image_width])]))
map_test_data_to_test_som(gpu_id, som_width,som_height,data_path, som_path, map_path)

This results in the following output:

som_data shape: (2, 2, 5, 5)
[[[[1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]]

  [[1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]]]

 [[[1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]]

  [[1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]
   [1. 1. 1. 1. 1.]]]]
image_data shape: (1, 3, 3)
[[[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]]
CUDA_VISIBLE_DEVICES=1 Pink  -p 1 --som-width 2 --som-height 2 --map /data/single_image.bin /data/map_single_image_to_single_neuron.bin /data/single_neuron_som.bin

Executing the last line gives the following output, which seems fine (the dimensions look right):

  Data file = /data/single_image.bin
  Result file = /data/map_single_image_to_single_neuron.bin
  SOM file = /data/single_neuron_som.bin
  Number of data entries = 1
  Data dimension = 3 x 3
  SOM dimension (width x height x depth) = 2x2x1
  SOM size = 4
  Number of iterations = 1
  Neuron dimension = 5x5
  Euclidean distance dimension = 2x2
  Number of progress information prints = 1
  Intermediate storage of SOM = off
  Layout = cartesian
  Initialization type = file_init
  SOM initialization file = /data/mostertrij/pink-basics/output_LoTSS_DR1/single_neuron_som.bin
  Interpolation type = bilinear
  Seed = 1234
  Number of rotations = 360
  Use mirrored image = 1
  Number of CPU threads = 40
  Use CUDA = 1
  Distribution function for SOM update = gaussian
  Sigma = 1.1
  Damping factor = 0.2
  Maximum distance for SOM update = -1
  Use periodic boundary conditions = 0
  Store best rotation and flipping parameters = 0

[======================================================================] 100 % 0 s

  Total time (hh:mm:ss): 00:00:02.087     (2 s)
  Successfully finished. Have a nice day.

Now loading the mapping using load_som_mapping(map_path, verbose=True) gives the following (wrong) result:

version: 2
file_type: 2
data_type: 0
numberOfImages: 1
som_layout: 0
som dimensionality: 2
som dimensions: (2, 2)

Map result:
[[0. 0. 0. 0.]] 1 2 2 1 

So the Euclidean distance reported is 0, 0, 0, 0 while it should be 4, 4, 4, 4.

Repeating this process with neurons and a single image all randomly sampled between 0 and 1 gives rise to the way too high numbers:

som_path2 = '/data/single_neuron_som2.bin'
data_path2 = '/data/single_image2.bin'
map_path2 = '/data/map_single_image_to_single_neuron2.bin'

create_test_som(som_path2, np.array([[np.random.uniform(0,1,[neuron_width,neuron_width]) 
                                      for j in range(som_width)] 
                                     for i in range(som_height)]))
create_test_data(data_path2, np.array([np.random.uniform(0,1,[image_width,image_width])]))
map_test_data_to_test_som(gpu_id, som_width, som_height, data_path2, som_path2, map_path2)

Produces:

som_data shape: (2, 2, 5, 5)
[[[[0.99971204 0.9649601  0.13961416 0.5277925  0.69232026]
   [0.30195558 0.95923107 0.33910395 0.79623785 0.41668984]
   [0.2981384  0.37406349 0.1064496  0.81195063 0.89717293]
   [0.57093443 0.45074527 0.9573843  0.7156923  0.19820787]
   [0.19550703 0.21997696 0.85291868 0.37672202 0.5505964 ]]

  [[0.15731968 0.17053242 0.71608575 0.56746897 0.66786446]
   [0.58221214 0.72226106 0.74181827 0.32114466 0.92622553]
   [0.17948228 0.9350424  0.95751021 0.80663197 0.9809329 ]
   [0.7201585  0.76561495 0.00955184 0.50519959 0.96674996]
   [0.25371928 0.71401906 0.49218354 0.11929986 0.95515874]]]

 [[[0.89801856 0.02230665 0.30972996 0.4629857  0.89403677]
   [0.86597969 0.93576617 0.1125303  0.68526227 0.35861138]
   [0.69389897 0.07442639 0.1937746  0.60510161 0.52935543]
   [0.77156129 0.93942363 0.71076085 0.41047543 0.64435329]
   [0.89491456 0.79810935 0.89999227 0.808086   0.59111661]]

  [[0.2528187  0.42403009 0.36152541 0.84878211 0.43277483]
   [0.85728984 0.16706833 0.81121466 0.94590693 0.52366488]
   [0.384645   0.04069272 0.3186065  0.68401763 0.58106253]
   [0.97269966 0.80036697 0.55934164 0.71930493 0.27985202]
   [0.5184996  0.79485326 0.78941677 0.31408703 0.37666354]]]]
image_data shape: (1, 3, 3)
[[[0.28589343 0.06328625 0.6278434 ]
  [0.82636295 0.64509669 0.04756091]
  [0.28352401 0.23569587 0.96175407]]]
CUDA_VISIBLE_DEVICES=1 Pink  -p 1 --som-width 2 --som-height 2 --map /data/single_image2.bin /data/map_single_image_to_single_neuron2.bin /data/single_neuron_som2.bin
version: 2
file_type: 2
data_type: 0
numberOfImages: 1
som_layout: 0
som dimensionality: 2
som dimensions: (2, 2)

Mapping result is:

Map result:
[[ 2001. 11979. 20095. 41844.]] 1 2 2 1 

While it should be four values, each between 0 and 4.

It would be great if you could spot what I do wrong (in creating the binaries, at runtime or while loading the mapping).

BerndDoser commented 5 years ago

Thank you very much for the awesome testing. Let me simplify your first example to be sure that I understand it correctly: if we have a neuron with only 1.0 entries and an image with only 0.0 entries, then the Euclidean distance should be 0.0 and not 4.0 as you expected.

In your second example with random numbers, I would suppose that the high numbers result from the special behavior of using uint8 to speed up the Euclidean norm calculation, where I have to normalize the floats into uint8. I would expect that the numbers are correct, but not scaled back. I will add the back scaling from uint into float in the next days. You can simply check the scaled numbers by using --euclidean-distance-type float.
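
To illustrate what I mean by "not scaled back", here is a rough numpy sketch of the assumed effect (an illustration only, not PINK's actual kernel code): if the [0, 1] floats are quantized to uint8 before the squared differences are summed, the result comes out roughly 255^2 times too large until it is divided back.

import numpy as np

np.random.seed(42)
a = np.random.uniform(0, 1, (2, 2))  # neuron patch (float)
b = np.random.uniform(0, 1, (2, 2))  # image patch (float)

# Summed squared differences in float vs. after uint8 quantization
d_float = np.sum((a - b) ** 2)
a8 = np.round(a * 255).astype(np.int64)
b8 = np.round(b * 255).astype(np.int64)
d_uint8 = np.sum((a8 - b8) ** 2)

print(d_float)               # order 1
print(d_uint8)               # roughly 255**2 times larger
print(d_uint8 / 255.0 ** 2)  # back-scaled, close to d_float again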

RafaelMostert commented 5 years ago

Thanks for looking into it! And good to know about the --euclidean-distance-type float flag.

But I already differ from your result in the first basic example. What goes wrong in the following argument?

Using the PINK paper: given a neuron with weight vector A and an image vector B, the similarity measure is Δ(A,B) = min{ d(A, φ(B)) | φ ∈ Φ }, with Φ = {φ1, ..., φN} a set of image rotations and d(A,B) the Euclidean distance between A and B. Since A is all 1.0s and B is all 0.0s, the rotations have no effect and Δ(A,B) = d(A,B). Now for every weight within the Euclidean distance dimension, d(1,0) = (1-0)^2 = 1 is added to the summed Euclidean distance. In my example above the image was 3x3, so the Euclidean distance dimension is 2x2, and I expect the summed Euclidean distance to each neuron to be 4.
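
A quick numerical check of that argument, independent of Pink (just the summed squared differences over the 2x2 effective area):

import numpy as np

A = np.ones((2, 2))   # neuron weights inside the Euclidean distance dimension
B = np.zeros((2, 2))  # corresponding image region

print(np.sum((A - B) ** 2))  # 4.0, the value I expect for each of the four neurons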

BerndDoser commented 5 years ago

I added --euclidean-distance-type to the usage print, but it will appear only in version 2.3.

Of course, you are right for the first example. The result must be 4. I will check it.

RafaelMostert commented 5 years ago

I just tested the --euclidean-distance-type float flag and it works :) The first example now gives 4, 4, 4, 4 and the second example (with np.random.seed(42)) gives a bunch of floats between 0 and 4: 0.43070784 0.09937809 0.10547281 0.39434844

Disabling rotations with the -n 1 flag in the second example should lead to higher values, and indeed the results are higher: 1.05128109 0.96608406 0.91012907 0.58507311

RafaelMostert commented 5 years ago

Oops, I forgot the square root over the sum for the Euclidean distance: d(1,0) = sqrt((1-0)^2) = 1. In my first example the effective area is A = [1,1,1,1] and B = [0,0,0,0], so d(A,B) = sqrt((1-0)^2+(1-0)^2+(1-0)^2+(1-0)^2) = 2, not 4. So Pink seems wrong in the first example, even with --euclidean-distance-type float.

And testing the example in which the neurons are all zeros and the image all 0.5, the result should be d(A,B) = sqrt((0.5-0)^2+(0.5-0)^2+(0.5-0)^2+(0.5-0)^2) = 1. But Pink reports 0.74999994 0.74999994 0.74999994 0.74999994
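
For reference, the same two hand calculations in numpy (now with the square root over the sum):

import numpy as np

# First example: effective area A all ones, B all zeros
A = np.ones(4)
B = np.zeros(4)
print(np.sqrt(np.sum((A - B) ** 2)))  # 2.0

# Second example: neurons all zeros, image all 0.5
A = np.zeros(4)
B = np.full(4, 0.5)
print(np.sqrt(np.sum((A - B) ** 2)))  # 1.0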

So there might be something wrong still.

RafaelMostert commented 5 years ago

More weird things:

RafaelMostert commented 5 years ago

I did some more testing. It seems as if the problems from my last comment are introduced only with the -n 1 flag, for images of all sizes, big or small. (For -n 1, the Euclidean distance dimension and the neuron dimension are automatically set to the image dimension.) With -n 4 or -n 360, d(A,B) is equal to d(B,A). The --flip-off flag does not affect the results.

Catching -n < 4 would be good.

If -n >= 4 and --euclidean-distance-type float, the reported results are the squares of the Euclidean distance. In those cases, the only bug left to fix is inserting a square root in PINK.

BerndDoser commented 5 years ago

In the meantime I have fixed the scaling issue for euclidean-distance-type uint8/16.

I have also added some checks for d(A,B) == d(B,A). Please find them here:

https://github.com/HITS-AIN/PINK/blob/7a917099c380336506e883d28045ea53f6757be5/test/SelfOrganizingMapTest/Mapper.cpp#L80-L94

I see d(A,B) != d(B,A) only if the euclidean-distance-dim is too large (which should not be the case with the default settings) and -n > 4 is used.

The sqrt for the Euclidean distance was consciously omitted, as it has no direct effect on the mapping (the best-matching neuron is the same with or without it). Therefore, d(0,1) = 4 is correct (without the sqrt).
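
As a small illustration, reusing the raw numbers from your second example: the sqrt is monotonic, so the argmin (and hence the best-matching neuron) does not change; only the reported values do.

import numpy as np

summed_squares = np.array([2001., 11979., 20095., 41844.])
print(np.argmin(summed_squares))           # 0
print(np.argmin(np.sqrt(summed_squares)))  # 0, the same neuron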

For -n 1 (no rotations) the setting euclidean_distance_dim = neuron_dim = image_dim makes sense, because there are no rotational effects, as discussed in #15.

RafaelMostert commented 5 years ago

You are right, the behaviour of -n < 4 is according to pull request 15. It might be nice to note somewhere in the documentation at some point that the sqrt is omitted.

Great! All mapping issues are fixed or sorted out then, thanks a lot :)

BerndDoser commented 5 years ago

You are welcome.

The sqrt was omitted for performance reasons, which is no problem in the training phase, as the Euclidean distance is not used directly there. I think it is better to add the sqrt in the mapping phase to be consistent with the paper. That should be the better way to prevent future confusion :)