mil-tokyo / webdnn

The Fastest DNN Running Framework on Web Browser
https://mil-tokyo.github.io/webdnn

Very different (and wrong) answers from SEResNet #670

Open kylemcdonald opened 6 years ago

kylemcdonald commented 6 years ago

I exported a 73-label multilabel network. It was made with Keras using sigmoid outputs. I get very different outputs from each backend, and they are all wrong.

Toolkit   True    False
Keras     0.657   0.008
webgl     0.259   0.322
webasm    0.373   0.387
webgpu    0.185   0.184

I'm using the SEResNet architecture.

I'm setting up the runner with:

runner.getInputViews()[0].set(await WebDNN.Image.getImageArray(roi, {
  order: WebDNN.Image.Order.HWC,   // channels-last, matching the Keras model
  color: WebDNN.Image.Color.COLOR, // RGB input
  scale: [255, 255, 255]           // per-channel scale applied to pixel values
}));

Here's the entire exported backend and test code if it's helpful: https://www.dropbox.com/s/wf830x6ah7rp6bx/lfwa%2B.zip?dl=0
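For reference, here is a sketch of what I believe that call produces numerically. This reflects my understanding of getImageArray's scale option (each channel divided by its scale value, with zero bias) — if the model was trained on differently normalized input, that alone could explain the disagreement:

```python
import numpy as np

# Hypothetical 96x96 RGB region of interest with 8-bit pixel values,
# standing in for the `roi` passed to getImageArray above.
roi = np.full((96, 96, 3), 128, dtype=np.uint8)

# My understanding of order: HWC, color: COLOR, scale: [255, 255, 255]
# with no bias: each channel value is divided by 255, so pixels land in [0, 1].
x = roi.astype(np.float32) / np.float32(255.0)

print(x.shape)                      # (96, 96, 3)
print(round(float(x[0, 0, 0]), 3))  # 0.502
```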

Here's the model.summary():

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_46 (InputLayer)            (None, 96, 96, 3)     0                                            
____________________________________________________________________________________________________
conv2d_1027 (Conv2D)             (None, 48, 48, 64)    9408        input_46[0][0]                   
____________________________________________________________________________________________________
max_pooling2d_46 (MaxPooling2D)  (None, 24, 24, 64)    0           conv2d_1027[0][0]                
____________________________________________________________________________________________________
batch_normalization_855 (BatchNo (None, 24, 24, 64)    256         max_pooling2d_46[0][0]           
____________________________________________________________________________________________________
activation_855 (Activation)      (None, 24, 24, 64)    0           batch_normalization_855[0][0]    
____________________________________________________________________________________________________
conv2d_1029 (Conv2D)             (None, 24, 24, 51)    29376       activation_855[0][0]             
____________________________________________________________________________________________________
batch_normalization_856 (BatchNo (None, 24, 24, 51)    204         conv2d_1029[0][0]                
____________________________________________________________________________________________________
activation_856 (Activation)      (None, 24, 24, 51)    0           batch_normalization_856[0][0]    
____________________________________________________________________________________________________
conv2d_1030 (Conv2D)             (None, 24, 24, 51)    23409       activation_856[0][0]             
____________________________________________________________________________________________________
global_average_pooling2d_362 (Gl (None, 51)            0           conv2d_1030[0][0]                
____________________________________________________________________________________________________
reshape_342 (Reshape)            (None, 1, 1, 51)      0           global_average_pooling2d_362[0][0
____________________________________________________________________________________________________
dense_728 (Dense)                (None, 1, 1, 3)       153         reshape_342[0][0]                
____________________________________________________________________________________________________
dense_729 (Dense)                (None, 1, 1, 51)      153         dense_728[0][0]                  
____________________________________________________________________________________________________
multiply_342 (Multiply)          (None, 24, 24, 51)    0           conv2d_1030[0][0]                
                                                                   dense_729[0][0]                  
____________________________________________________________________________________________________
conv2d_1028 (Conv2D)             (None, 24, 24, 51)    3264        activation_855[0][0]             
____________________________________________________________________________________________________
add_342 (Add)                    (None, 24, 24, 51)    0           multiply_342[0][0]               
                                                                   conv2d_1028[0][0]                
____________________________________________________________________________________________________
batch_normalization_857 (BatchNo (None, 24, 24, 51)    204         add_342[0][0]                    
____________________________________________________________________________________________________
activation_857 (Activation)      (None, 24, 24, 51)    0           batch_normalization_857[0][0]    
____________________________________________________________________________________________________
conv2d_1032 (Conv2D)             (None, 12, 12, 103)   47277       activation_857[0][0]             
____________________________________________________________________________________________________
batch_normalization_858 (BatchNo (None, 12, 12, 103)   412         conv2d_1032[0][0]                
____________________________________________________________________________________________________
activation_858 (Activation)      (None, 12, 12, 103)   0           batch_normalization_858[0][0]    
____________________________________________________________________________________________________
conv2d_1033 (Conv2D)             (None, 12, 12, 103)   95481       activation_858[0][0]             
____________________________________________________________________________________________________
global_average_pooling2d_363 (Gl (None, 103)           0           conv2d_1033[0][0]                
____________________________________________________________________________________________________
reshape_343 (Reshape)            (None, 1, 1, 103)     0           global_average_pooling2d_363[0][0
____________________________________________________________________________________________________
dense_730 (Dense)                (None, 1, 1, 6)       618         reshape_343[0][0]                
____________________________________________________________________________________________________
dense_731 (Dense)                (None, 1, 1, 103)     618         dense_730[0][0]                  
____________________________________________________________________________________________________
multiply_343 (Multiply)          (None, 12, 12, 103)   0           conv2d_1033[0][0]                
                                                                   dense_731[0][0]                  
____________________________________________________________________________________________________
conv2d_1031 (Conv2D)             (None, 12, 12, 103)   5253        activation_857[0][0]             
____________________________________________________________________________________________________
add_343 (Add)                    (None, 12, 12, 103)   0           multiply_343[0][0]               
                                                                   conv2d_1031[0][0]                
____________________________________________________________________________________________________
batch_normalization_859 (BatchNo (None, 12, 12, 103)   412         add_343[0][0]                    
____________________________________________________________________________________________________
activation_859 (Activation)      (None, 12, 12, 103)   0           batch_normalization_859[0][0]    
____________________________________________________________________________________________________
conv2d_1034 (Conv2D)             (None, 12, 12, 103)   95481       activation_859[0][0]             
____________________________________________________________________________________________________
batch_normalization_860 (BatchNo (None, 12, 12, 103)   412         conv2d_1034[0][0]                
____________________________________________________________________________________________________
activation_860 (Activation)      (None, 12, 12, 103)   0           batch_normalization_860[0][0]    
____________________________________________________________________________________________________
conv2d_1035 (Conv2D)             (None, 12, 12, 103)   95481       activation_860[0][0]             
____________________________________________________________________________________________________
global_average_pooling2d_364 (Gl (None, 103)           0           conv2d_1035[0][0]                
____________________________________________________________________________________________________
reshape_344 (Reshape)            (None, 1, 1, 103)     0           global_average_pooling2d_364[0][0
____________________________________________________________________________________________________
dense_732 (Dense)                (None, 1, 1, 6)       618         reshape_344[0][0]                
____________________________________________________________________________________________________
dense_733 (Dense)                (None, 1, 1, 103)     618         dense_732[0][0]                  
____________________________________________________________________________________________________
multiply_344 (Multiply)          (None, 12, 12, 103)   0           conv2d_1035[0][0]                
                                                                   dense_733[0][0]                  
____________________________________________________________________________________________________
add_344 (Add)                    (None, 12, 12, 103)   0           multiply_344[0][0]               
                                                                   add_343[0][0]                    
____________________________________________________________________________________________________
batch_normalization_861 (BatchNo (None, 12, 12, 103)   412         add_344[0][0]                    
____________________________________________________________________________________________________
activation_861 (Activation)      (None, 12, 12, 103)   0           batch_normalization_861[0][0]    
____________________________________________________________________________________________________
conv2d_1037 (Conv2D)             (None, 6, 6, 207)     191889      activation_861[0][0]             
____________________________________________________________________________________________________
batch_normalization_862 (BatchNo (None, 6, 6, 207)     828         conv2d_1037[0][0]                
____________________________________________________________________________________________________
activation_862 (Activation)      (None, 6, 6, 207)     0           batch_normalization_862[0][0]    
____________________________________________________________________________________________________
conv2d_1038 (Conv2D)             (None, 6, 6, 207)     385641      activation_862[0][0]             
____________________________________________________________________________________________________
global_average_pooling2d_365 (Gl (None, 207)           0           conv2d_1038[0][0]                
____________________________________________________________________________________________________
reshape_345 (Reshape)            (None, 1, 1, 207)     0           global_average_pooling2d_365[0][0
____________________________________________________________________________________________________
dense_734 (Dense)                (None, 1, 1, 12)      2484        reshape_345[0][0]                
____________________________________________________________________________________________________
dense_735 (Dense)                (None, 1, 1, 207)     2484        dense_734[0][0]                  
____________________________________________________________________________________________________
multiply_345 (Multiply)          (None, 6, 6, 207)     0           conv2d_1038[0][0]                
                                                                   dense_735[0][0]                  
____________________________________________________________________________________________________
conv2d_1036 (Conv2D)             (None, 6, 6, 207)     21321       activation_861[0][0]             
____________________________________________________________________________________________________
add_345 (Add)                    (None, 6, 6, 207)     0           multiply_345[0][0]               
                                                                   conv2d_1036[0][0]                
____________________________________________________________________________________________________
batch_normalization_863 (BatchNo (None, 6, 6, 207)     828         add_345[0][0]                    
____________________________________________________________________________________________________
activation_863 (Activation)      (None, 6, 6, 207)     0           batch_normalization_863[0][0]    
____________________________________________________________________________________________________
conv2d_1039 (Conv2D)             (None, 6, 6, 207)     385641      activation_863[0][0]             
____________________________________________________________________________________________________
batch_normalization_864 (BatchNo (None, 6, 6, 207)     828         conv2d_1039[0][0]                
____________________________________________________________________________________________________
activation_864 (Activation)      (None, 6, 6, 207)     0           batch_normalization_864[0][0]    
____________________________________________________________________________________________________
conv2d_1040 (Conv2D)             (None, 6, 6, 207)     385641      activation_864[0][0]             
____________________________________________________________________________________________________
global_average_pooling2d_366 (Gl (None, 207)           0           conv2d_1040[0][0]                
____________________________________________________________________________________________________
reshape_346 (Reshape)            (None, 1, 1, 207)     0           global_average_pooling2d_366[0][0
____________________________________________________________________________________________________
dense_736 (Dense)                (None, 1, 1, 12)      2484        reshape_346[0][0]                
____________________________________________________________________________________________________
dense_737 (Dense)                (None, 1, 1, 207)     2484        dense_736[0][0]                  
____________________________________________________________________________________________________
multiply_346 (Multiply)          (None, 6, 6, 207)     0           conv2d_1040[0][0]                
                                                                   dense_737[0][0]                  
____________________________________________________________________________________________________
add_346 (Add)                    (None, 6, 6, 207)     0           multiply_346[0][0]               
                                                                   add_345[0][0]                    
____________________________________________________________________________________________________
batch_normalization_865 (BatchNo (None, 6, 6, 207)     828         add_346[0][0]                    
____________________________________________________________________________________________________
activation_865 (Activation)      (None, 6, 6, 207)     0           batch_normalization_865[0][0]    
____________________________________________________________________________________________________
conv2d_1042 (Conv2D)             (None, 3, 3, 103)     191889      activation_865[0][0]             
____________________________________________________________________________________________________
batch_normalization_866 (BatchNo (None, 3, 3, 103)     412         conv2d_1042[0][0]                
____________________________________________________________________________________________________
activation_866 (Activation)      (None, 3, 3, 103)     0           batch_normalization_866[0][0]    
____________________________________________________________________________________________________
conv2d_1043 (Conv2D)             (None, 3, 3, 103)     95481       activation_866[0][0]             
____________________________________________________________________________________________________
global_average_pooling2d_367 (Gl (None, 103)           0           conv2d_1043[0][0]                
____________________________________________________________________________________________________
reshape_347 (Reshape)            (None, 1, 1, 103)     0           global_average_pooling2d_367[0][0
____________________________________________________________________________________________________
dense_738 (Dense)                (None, 1, 1, 6)       618         reshape_347[0][0]                
____________________________________________________________________________________________________
dense_739 (Dense)                (None, 1, 1, 103)     618         dense_738[0][0]                  
____________________________________________________________________________________________________
multiply_347 (Multiply)          (None, 3, 3, 103)     0           conv2d_1043[0][0]                
                                                                   dense_739[0][0]                  
____________________________________________________________________________________________________
conv2d_1041 (Conv2D)             (None, 3, 3, 103)     21321       activation_865[0][0]             
____________________________________________________________________________________________________
add_347 (Add)                    (None, 3, 3, 103)     0           multiply_347[0][0]               
                                                                   conv2d_1041[0][0]                
____________________________________________________________________________________________________
batch_normalization_867 (BatchNo (None, 3, 3, 103)     412         add_347[0][0]                    
____________________________________________________________________________________________________
activation_867 (Activation)      (None, 3, 3, 103)     0           batch_normalization_867[0][0]    
____________________________________________________________________________________________________
global_max_pooling2d_26 (GlobalM (None, 103)           0           activation_867[0][0]             
____________________________________________________________________________________________________
dense_740 (Dense)                (None, 73)            7519        global_max_pooling2d_26[0][0]    
====================================================================================================
Total params: 2,111,171
Trainable params: 2,107,947
Non-trainable params: 3,224
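For anyone unfamiliar with the architecture: each repeated block above (GlobalAveragePooling → Dense reduce → Dense expand → Multiply) is a squeeze-and-excitation gate. A minimal NumPy sketch with the shapes of the first block (24×24×51 feature map, bottleneck of 3; the weights here are random placeholders, not the trained values, and biases are omitted):

```python
import numpy as np

def se_block(x, w_reduce, w_expand):
    """Squeeze-and-excitation gating on an HWC feature map x."""
    # Squeeze: global average pool over the spatial dims -> (C,)
    s = x.mean(axis=(0, 1))
    # Excite: bottleneck Dense (ReLU), then expand Dense (sigmoid)
    z = np.maximum(s @ w_reduce, 0.0)          # (C,) -> (C_r,)
    g = 1.0 / (1.0 + np.exp(-(z @ w_expand)))  # (C_r,) -> (C,), each in (0, 1)
    # Scale: per-channel multiply, broadcast over H and W
    return x * g

rng = np.random.default_rng(0)
x = rng.standard_normal((24, 24, 51)).astype(np.float32)
w_reduce = rng.standard_normal((51, 3)).astype(np.float32) * 0.1
w_expand = rng.standard_normal((3, 51)).astype(np.float32) * 0.1

y = se_block(x, w_reduce, w_expand)
print(y.shape)  # (24, 24, 51)
```

Because the gate values are sigmoids, each output channel is the corresponding input channel attenuated by a factor in (0, 1).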
Kiikurage commented 6 years ago

I'll investigate. What do True and False mean (where do they come from)?

kylemcdonald commented 6 years ago

Those are the labels of the input image. In this case, I'm predicting 73 attributes from the LFWA+ dataset: "True" means "smiling" and "False" means "not smiling". I provided example images.

Kiikurage commented 6 years ago

Thanks. I'll investigate more.

Kiikurage commented 6 years ago

Could you share more information (model h5 file, train script, etc.) with me if possible?

kylemcdonald commented 6 years ago

Here are the h5 file and the notebook I used for training and export. Sorry it's so messy!

https://www.dropbox.com/s/tdv8jgor4gl9jnm/checkpoint-2.42-04.h5?dl=0 https://www.dropbox.com/s/ggt68cod8qzu1pp/SEResNet-train%2Bexport.zip?dl=0

Kiikurage commented 6 years ago

Sorry, but the h5 file you shared may be a different model; its output shape is (batch_size, 20).

kylemcdonald commented 6 years ago

Sorry, it's my fault! Here is the correct .h5 file:

https://www.dropbox.com/s/lm7k336mt6vqfhz/checkpoint-0.29-04.h5?dl=0

The other one uses the same input data, but has a softmax output across a different distribution.

kylemcdonald commented 6 years ago

I’m still trying to get this to work, but I feel like I must have made some simple mistake... :(