HUJI-Deep / simnets-tf

SimNets implementation in TensorFlow
MIT License

Cannot reproduce the cifar10 result in Deep SimNet #23

Open CuriousCat-7 opened 6 years ago

CuriousCat-7 commented 6 years ago

Hello, I am trying to use SimNet in my project, but I ran into problems when trying to reproduce the results reported in Deep SimNet. The structure of my one-layer SimNet is:

 model = Sequential()
 model.add(InputLayer(input_shape=(3, 32, 32)))
 model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same', use_bias=None,
                  data_format='channels_first', strides=(1, 1)))
 model.add(sk.Similarity(32, blocks=[1, 1], strides=[1, 1],
                         similarity_function='L2', normalization_term=True, padding=[1, 1],
                         out_of_bounds_value=np.nan, ignore_nan_input=True))
 model.add(sk.Mex(32,
                  blocks=[1, 3, 3], strides=[32, 2, 2],
                  softmax_mode=False, normalize_offsets=True,
                  use_unshared_regions=True, unshared_offset_region=[2]))
 model.add(sk.Mex(10,
                  blocks=[1, 16, 16], strides=[32, 16, 16],
                  softmax_mode=True, normalize_offsets=True,
                  use_unshared_regions=True, unshared_offset_region=[2]))
 model.add(Flatten(data_format='channels_first'))

which gives the following model summary:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 3, 32, 32)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 32)        2400      
_________________________________________________________________
similarity_1 (Similarity)    (None, 32, 34, 34)        2048      
_________________________________________________________________
mex_1 (Mex)                  (None, 32, 16, 16)        1152      
_________________________________________________________________
mex_2 (Mex)                  (None, 10, 1, 1)          2560      
_________________________________________________________________
flatten_1 (Flatten)          (None, 10)                0         
=================================================================
Total params: 8,160
Trainable params: 8,160
Non-trainable params: 0
_________________________________________________________________
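
As a sanity check on the summary above, the spatial sizes and the parameter total can be recomputed by hand. This is a minimal sketch and not part of the SimNets library: the helper `window_output_size` is a hypothetical name, and it assumes the usual sliding-window rule `(size + 2*padding - block) // stride + 1` along each spatial axis, which happens to agree with the shapes printed in the summary. The per-layer parameter counts for the Similarity and Mex layers are copied from the summary rather than derived.

```python
def window_output_size(size, block, stride, padding=0):
    """Number of sliding-window positions along one spatial axis
    (assumed standard formula, not taken from the SimNets source)."""
    return (size + 2 * padding - block) // stride + 1


# Conv2D with padding='same' and stride 1 keeps the 32x32 spatial size.
conv_out = 32

# Similarity: 1x1 blocks, stride 1, padding 1 on each side.
sim_out = window_output_size(conv_out, block=1, stride=1, padding=1)

# First Mex: 3x3 blocks, stride 2, no padding.
mex1_out = window_output_size(sim_out, block=3, stride=2)

# Second Mex: 16x16 blocks, stride 16, no padding.
mex2_out = window_output_size(mex1_out, block=16, stride=16)

# Conv2D without bias: kernel_h * kernel_w * in_channels * filters.
conv_params = 5 * 5 * 3 * 32

# Similarity and Mex counts are copied from the summary, not derived here.
total_params = conv_params + 2048 + 1152 + 2560

print(sim_out, mex1_out, mex2_out)  # spatial sizes after each SimNet layer
print(conv_params, total_params)    # conv parameters and overall total
```

Under these assumptions the recomputed sizes (34, 16, 1) and the total of 8,160 parameters agree with the summary, so the layer shapes themselves appear to be wired as intended.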

However, the accuracy always stays at 10%, which is chance level on CIFAR-10. (At least it converged to something...) Could you tell me what is wrong with my definition? Could you also share the full set of hyperparameters used in your work?

Thanks, Neo