piyaliG opened this issue 7 years ago
I am also using the default model given in this repo and evaluating on the default evaluation dataset, but mine gives an accuracy of 0.711933. I am curious why the evaluation accuracy differs if we are all using the same default model and dataset.
I also get 0.711933 overall accuracy and the following confusion matrix. Is this the right result?
@simonwsw many thanks for publishing the evaluation code. More researchers should do the same! Starred your project!
[eval] accuracy 0.711933
label 01: 58 [ 58 00 07 24 05 00 00 04 00 00 01 00 00 ]
label 02: 96 [ 00 96 00 01 01 00 00 02 01 00 00 00 00 ]
label 03: 62 [ 04 06 62 17 03 00 00 07 01 00 00 00 00 ]
label 04: 73 [ 13 01 06 73 03 00 00 01 02 00 02 00 00 ]
label 05: 80 [ 06 02 03 03 80 00 00 00 05 01 01 00 00 ]
label 06: 43 [ 17 01 09 04 03 43 10 01 01 01 07 00 03 ]
label 07: 98 [ 00 00 00 00 00 00 98 01 00 00 00 00 00 ]
label 08: 91 [ 01 01 01 05 01 00 00 91 01 00 00 00 00 ]
label 09: 93 [ 01 02 00 02 01 00 00 01 93 00 00 00 00 ]
label 10: 64 [ 04 18 03 01 04 00 00 00 03 64 00 00 02 ]
label 11: 25 [ 09 00 01 37 02 00 00 25 00 00 25 00 00 ]
label 12: --
label 13: 00 [ 100 00 00 00 00 00 00 00 00 00 00 00 00 ]
Finished
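For anyone cross-checking these numbers: each bracketed row appears to be a row-normalized percentage breakdown over the 13 predicted labels for one true label, and the value after "label NN:" looks like the diagonal entry, i.e. per-class accuracy. Below is a minimal Lua/Torch sketch of that computation, assuming a 13x13 tensor of raw prediction counts; the name conf and the helper function are illustrative, not the repo's actual code.

-- Recover per-class and overall accuracy from a confusion matrix of
-- raw counts, where conf[i][j] = number of sequences with true label i
-- that were predicted as label j.
require 'torch'

local function accuracyFromConfusion(conf)
   local correct, total = 0, conf:sum()
   for i = 1, conf:size(1) do
      correct = correct + conf[i][i]
      local rowTotal = conf[i]:sum()
      if rowTotal > 0 then
         -- per-class accuracy, printed like the "label NN:" lines above
         print(string.format('label %02d: %.0f', i, 100 * conf[i][i] / rowTotal))
      else
         print(string.format('label %02d: --', i)) -- no test samples, cf. label 12
      end
   end
   return correct / total -- overall accuracy, cf. "[eval] accuracy"
end

Note that the overall figure is weighted by class frequency, so it need not equal the mean of the per-class diagonal values above.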
Hi, I am trying to run the default evaluation of the network. I am getting the following output:
[eval] data with 1364 seq
[net] loading model uni_image_np_50.t7
nn.Sequencer @ nn.Recursor @ nn.MaskZero @ nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> output]
  (1): cudnn.SpatialConvolution(4 -> 32, 3x3, 2,2)
  (2): nn.SpatialBatchNormalization (4D) (32)
  (3): cudnn.ReLU
  (4): cudnn.SpatialConvolution(32 -> 64, 3x3, 2,2)
  (5): nn.SpatialBatchNormalization (4D) (64)
  (6): cudnn.ReLU
  (7): nn.SpatialDropout(0.400000)
  (8): cudnn.SpatialConvolution(64 -> 128, 3x3, 2,2)
  (9): nn.SpatialBatchNormalization (4D) (128)
  (10): cudnn.ReLU
  (11): nn.SpatialDropout(0.400000)
  (12): nn.Reshape(1152)
  (13): nn.Linear(1152 -> 512)
  (14): nn.BatchNormalization (2D) (512)
  (15): cudnn.ReLU
  (16): nn.Dropout(0.5, busy)
  (17): nn.Linear(512 -> 512)
  (18): nn.LSTM(512 -> 512)
  (19): nn.Dropout(0.5, busy)
  (20): nn.Linear(512 -> 13)
  (21): cudnn.LogSoftMax
}
[eval] accuracy 0.502874
label 01: 11 [ 11 39 15 27 06 00 00 02 01 00 00 00 00 ]
label 02: 97 [ 00 97 01 01 00 00 00 01 00 00 00 00 00 ]
label 03: 16 [ 00 67 16 07 02 00 00 01 07 00 00 00 00 ]
label 04: 37 [ 00 13 02 37 01 00 00 04 43 00 00 00 00 ]
label 05: 70 [ 01 08 02 04 70 00 01 02 12 00 00 00 00 ]
label 06: 14 [ 01 03 00 05 00 14 13 06 58 00 00 00 00 ]
label 07: 97 [ 00 00 00 01 00 00 97 02 00 00 00 00 00 ]
label 08: 86 [ 00 03 02 03 01 00 00 86 03 00 00 00 00 ]
label 09: 95 [ 00 02 00 01 00 00 00 01 95 00 00 00 00 ]
label 10: 27 [ 00 41 00 03 01 00 01 01 25 27 00 00 00 ]
label 11: 00 [ 02 33 18 42 01 00 00 03 01 00 00 00 00 ]
label 12: --
label 13: 00 [ 100 00 00 00 00 00 00 00 00 00 00 00 00 ]
Finished
I am curious to know why the evaluation accuracy I receive using the default code base is so far off from the expected value. Any pointers would be very helpful!
Thanks and Regards, Piyali
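One generic pointer, offered as a guess rather than a confirmed diagnosis: in Torch, nn.Dropout, nn.SpatialDropout and the batch normalization layers behave differently in training and evaluation mode, so accuracy can look much worse if the model is scored while still in training mode. A minimal sketch of forcing evaluation mode after loading the checkpoint follows; it assumes the .t7 file deserializes as a standard nn module, and I can't tell from the log whether the repo's eval script already does this.

-- Load the checkpoint and switch every submodule (dropout, batch norm,
-- LSTM) into evaluation mode before computing accuracy.
require 'torch'
require 'nn'
require 'rnn'   -- provides nn.Sequencer / nn.Recursor / nn.LSTM from the printed architecture
require 'cudnn' -- the checkpoint contains cudnn layers

local model = torch.load('uni_image_np_50.t7')
model:evaluate() -- propagates to all submodules: disables dropout,
                 -- makes batch norm use its running statistics
-- ... then run the usual evaluation loop ...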