Closed mesnilgr closed 8 years ago
The synset mapping looks correct (it's the same as Caffe's):
head val.txt
ILSVRC2010_val_00000001.JPEG 65
ILSVRC2010_val_00000002.JPEG 970
ILSVRC2010_val_00000003.JPEG 230
ILSVRC2010_val_00000004.JPEG 809
ILSVRC2010_val_00000005.JPEG 516
ILSVRC2010_val_00000006.JPEG 57
ILSVRC2010_val_00000007.JPEG 334
ILSVRC2010_val_00000008.JPEG 415
ILSVRC2010_val_00000009.JPEG 674
ILSVRC2010_val_00000010.JPEG 332
versus Caffe:
head caffe/data/ilsvrc12/val.txt
ILSVRC2012_val_00000001.JPEG 65
ILSVRC2012_val_00000002.JPEG 970
ILSVRC2012_val_00000003.JPEG 230
ILSVRC2012_val_00000004.JPEG 809
ILSVRC2012_val_00000005.JPEG 516
ILSVRC2012_val_00000006.JPEG 57
ILSVRC2012_val_00000007.JPEG 334
ILSVRC2012_val_00000008.JPEG 415
ILSVRC2012_val_00000009.JPEG 674
ILSVRC2012_val_00000010.JPEG 332
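A quick way to confirm that two label files agree on the label column (a minimal sketch; `load_labels` and the paths are hypothetical helpers, not part of the repo):

```python
def load_labels(path):
    """Return the integer label column of a val.txt-style file
    ('<filename>.JPEG <label>' per line)."""
    with open(path) as f:
        return [int(line.split()[1]) for line in f if line.strip()]

def labels_match(path_a, path_b):
    """True when two label files carry the same labels in the same order,
    regardless of the filename prefix (ILSVRC2010 vs ILSVRC2012)."""
    return load_labels(path_a) == load_labels(path_b)
```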
Hi @mesnilgr, I just ran a test on the validation set with batch size 256. I couldn't reproduce your problem. Here is my output:
validation error 42.604167 %
top 5 validation error 19.863782 %
validation loss 1.842940
Then I tried your method, adding some extra outputs showing the loss, top-1 error, top-5 error, y_pred, and the ground truth of the first hkl file "val_hkl_b256_b_256/0000.hkl". I still couldn't reproduce your problem. Here is my output:
img_mean received
weight loaded: W0_1_65
weight loaded: W1_1_65
weight loaded: b0_1_65
weight loaded: b1_1_65
weight loaded: W_2_65
weight loaded: b_2_65
weight loaded: W0_3_65
weight loaded: W1_3_65
weight loaded: b0_3_65
weight loaded: b1_3_65
weight loaded: W0_4_65
weight loaded: W1_4_65
weight loaded: b0_4_65
weight loaded: b1_4_65
weight loaded: W_5_65
weight loaded: b_5_65
weight loaded: W_6_65
weight loaded: b_6_65
weight loaded: W_7_65
weight loaded: b_7_65
# loss of 0000.hkl
1.80309534073
# top-1 error of 0000.hkl
0.41015625
# top-5 error of 0000.hkl
0.19140625
# prediction of 0000.hkl
[ 50 795 230 809 520 67 334 911 12 850 109 286 370 757 595 147 857 21
526 517 334 209 948 727 23 827 270 166 64 448 324 573 360 879 586 887
114 886 777 321 431 756 129 196 256 613 565 162 468 824 91 29 844 591
358 468 259 994 840 588 490 206 107 317 842 390 101 887 870 837 693 149
21 476 80 424 159 275 175 461 970 160 642 25 817 498 375 123 761 47
270 384 366 484 373 705 331 142 949 336 473 159 872 878 201 971 70 889
632 411 470 951 227 758 161 959 638 646 722 645 476 483 852 397 94 650
352 21 934 283 802 534 276 164 751 363 610 328 969 608 515 328 771 726
977 875 266 535 590 977 918 637 39 115 945 274 277 763 905 646 213 894
647 504 937 687 781 666 583 171 825 212 659 257 436 199 140 248 339 232
239 544 961 445 656 289 867 272 103 543 243 450 449 771 122 396 438 16
548 993 466 790 233 819 605 376 330 606 922 431 284 889 475 701 475 984
16 77 610 197 636 662 587 213 25 427 215 235 35 741 125 812 289 425
973 393 167 121 876 422 532 298 678 819 124 349 13 179 696 894 989 455
647 660 983 533]
# ground truth of 0000.hkl
[ 65 970 230 809 516 57 334 415 674 332 109 286 370 757 595 147 108 23
478 517 334 173 948 727 23 846 270 167 55 858 324 573 150 981 586 887
32 398 777 74 516 756 129 198 256 725 565 167 717 394 92 29 844 591
358 468 259 994 872 588 474 183 107 46 842 390 101 887 870 841 467 149
21 476 80 424 159 275 175 461 970 160 788 58 479 498 369 28 487 50
270 383 366 780 373 705 330 142 949 349 473 159 872 878 201 906 70 486
632 608 122 720 227 686 173 959 638 646 664 645 718 483 852 392 311 457
352 22 934 283 802 553 276 236 751 343 528 328 969 558 163 328 771 726
977 875 265 686 590 975 620 637 39 115 937 272 277 763 789 646 213 493
647 504 937 687 781 666 583 158 825 212 659 257 436 196 140 248 339 230
361 544 935 638 627 289 867 272 103 584 180 703 449 771 118 396 934 16
548 993 704 457 233 401 827 376 146 606 922 516 284 889 475 978 475 984
16 77 610 254 636 662 473 213 25 463 215 173 35 741 125 787 289 425
973 1 167 121 445 702 532 366 678 764 125 349 13 179 522 493 989 720
438 660 983 533]
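For reference, top-1 and top-5 error can be computed from a score matrix roughly like this (a NumPy sketch for illustration, not the repo's actual evaluation code):

```python
import numpy as np

def topk_error(scores, labels, k):
    """Fraction of samples whose true label is not among the k
    highest-scoring classes.
    scores: (n_samples, n_classes) array; labels: (n_samples,) ints."""
    topk = np.argsort(scores, axis=1)[:, -k:]      # indices of the k largest scores
    hit = (topk == labels[:, None]).any(axis=1)    # label found among the top k?
    return 1.0 - hit.mean()
```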
Then I guess the problem might be either your 0000.hkl file or your img_mean.npy file. So I'm printing part of the mean-subtracted numpy array here for you to check:
import hickle as hkl
import numpy as np

img_mean = np.load('img_mean.npy')
data = hkl.load(str(val_filenames[0])) - img_mean  # note that this numpy array is in c01b shape
print data[0, 0, :, 0]
[ 60.81474304 59.98262024 61.70704651 63.47245026 65.19876099
66.94126892 67.70253754 68.4881897 70.23377228 68.05161285
67.82019806 69.6297226 71.41252136 70.22473145 70.99832153
70.79641724 70.55299377 70.407547 72.21337891 72.02416992
71.82472992 70.67632294 70.50945282 69.3549881 70.14443207
68.99860382 71.87391663 68.70227051 70.54374695 69.39862823
70.24663544 72.12097168 70.77048492 70.84523773 71.70626831
70.52833557 72.38629913 71.27055359 72.13372803 75.01039886
71.87496948 69.7983551 69.66268158 68.5298233 71.40841675
71.26996613 73.15507507 72.05388641 69.9105835 69.88194275
69.73725891 67.62825775 68.53401947 68.44831085 69.35622406
67.24156189 67.13644409 68.05505371 66.96246338 70.84378052
71.72956085 69.63776398 68.54032135 67.44572449 69.22187042
67.31957245 69.21909332 67.13356018 69.07128906 68.98921204
69.8862381 67.82107544 64.7012558 67.6654129 67.6029892
68.5490036 63.47123718 65.42095184 68.32015991 66.23458862
69.08278656 71.08676147 68.04005432 68.97113037 68.87669373
68.87887573 68.83561707 68.7844696 66.65924072 65.64706421
70.6214447 68.60018921 67.52011108 68.4826355 71.4407196
68.41194153 69.12646484 67.32783508 67.28132629 68.2490387 70.198349
69.186203 69.13742065 68.11381531 69.05073547 69.03283691
68.99168396 68.95507812 70.88711548 68.87696838 69.85441589
69.85762024 69.80123901 70.83694458 74.81047058 70.78660583
71.76083374 68.73417664 68.72711182 69.69822693 68.71060181
68.73425293 67.69438171 71.72309875 70.69856262 68.70462036
69.68235779 69.71430969 71.51199341 69.71516418 70.72076416
70.73500061 69.7416687 70.74057007 69.73829651 69.77235413
69.75363159 68.80198669 69.80335999 72.81376648 71.80427551
71.82643127 69.85496521 69.87399292 69.83905029 70.90715027
69.94906616 70.94746399 71.95011902 68.99838257 69.02256775
69.07652283 72.0579071 72.08851624 67.11096191 68.16065979
67.19436646 67.24224854 69.30108643 72.34490967 68.13990784
69.42300415 69.46040344 68.50202942 70.54467773 68.58229065
69.61657715 67.69314575 68.69284058 66.78651428 66.83294678
67.88116455 68.9238739 68.00258636 68.06608582 71.12767792
69.15472412 68.27887726 70.36660767 69.45388794 70.47919464
69.53961945 68.61686707 68.69607544 67.74487305 67.85128021
66.90164185 63.97473145 64.03827667 67.15988159 68.24112701
67.32301331 70.26672363 70.49568939 70.57806396 68.64303589
67.73782349 72.82657623 70.9177475 70.00962067 68.12471008
67.22493744 67.33610535 68.41151428 69.51146698 66.65380096
69.74698639 70.84283447 67.89214325 68.06248474 67.18651581
68.29170227 68.38796997 67.51506042 69.64525604 67.76802063
68.8739624 68.02518463 67.13772583 66.23940277 67.36875916
64.49555206 65.62204742 65.76167297 66.65933228 69.03590393
65.16316986 69.3156662 69.4850235 68.64757538 69.79444885
67.94555664 71.08917999 70.28927612 69.46868896 70.62604523
69.79985809 71.98527527 69.17094421 68.38463593 71.49598694
69.76255798 70.95394135 72.15198517 72.33311462 72.52670288
69.7458725 70.94896698 72.11769104 73.38235474 72.59131622
70.80739594 70.96855927 71.25843811 69.4681778 67.69448853]
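One way to check your own batch against the values above (a sketch; `matches_reference` is a hypothetical helper, and the batch is assumed to be loaded and mean-subtracted as in the snippet above):

```python
import numpy as np

# First few of the reference values printed above (channel 0, row 0, image 0).
reference_head = np.array([60.81474304, 59.98262024, 61.70704651,
                           63.47245026, 65.19876099])

def matches_reference(data, ref=reference_head, atol=1e-4):
    """data is a mean-subtracted batch in c01b layout:
    (channels, rows, cols, batch). Compares the first few pixels of
    channel 0, row 0 of the first image against the reference values."""
    return np.allclose(data[0, 0, :len(ref), 0], ref, atol=atol)
```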
Please let me know what you find so we can track down where the bug might be.
OK, I get the same input now for:
data = hkl.load(str(val_filenames[0])) - img_mean
I found that my problem was that I had run python make_hkl.py on the ImageNet validation set from 2010 instead of 2012.
(BTW, I think you rescale the image without respecting the aspect ratio, as opposed to rescaling it so the smallest side is 256 and then taking a center crop.)
No bugs on your side. Thanks for your help!
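The shortest-side-256 plus center-crop scheme mentioned above can be sketched like this (a nearest-neighbor resize in plain NumPy for illustration; this is not what make_hkl.py actually does):

```python
import numpy as np

def resize_shortest_side(img, side=256):
    """Nearest-neighbor resize so the shorter side equals `side`,
    preserving the aspect ratio. img: (H, W, C) array."""
    h, w = img.shape[:2]
    scale = side / min(h, w)
    nh, nw = round(h * scale), round(w * scale)
    rows = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    return img[rows[:, None], cols]

def center_crop(img, crop=256):
    """Take a central crop of size crop x crop."""
    h, w = img.shape[:2]
    top, left = (h - crop) // 2, (w - crop) // 2
    return img[top:top + crop, left:left + crop]
```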
Now I obtain the following when running on the whole validation set:
validation error 44.226763 %
top 5 validation error 20.787260 %
validation loss 1.910074
which is within a few percent of what you obtain.
@mesnilgr no problem. Thanks for trying out our code. For your debugging convenience, I prepared a show_batch tool here: https://github.com/hma02/show_batch.
It can be used to visualize batch images and their corresponding word descriptions.
@gwtaylor Thanks! Very useful.
Hi - thanks a lot for releasing your code. I downloaded the img_mean and the parameters of your model. As a sanity check, I just ran validate_performance with your model, but I can't reproduce the accuracy.
Here is the output of the script:
Any idea of what went wrong here? It stays off for the rest of the dataset too.