erictzeng / adda

Running code out-of-the-box gives low accuracy #4

Open RobRomijnders opened 7 years ago

RobRomijnders commented 7 years ago

Running the code out of the box, I get 0.6523 accuracy. However, the paper notes a much higher accuracy. The current accuracy wouldn't even beat the competing models.

The output is here

emmanuelrouxfr commented 5 years ago

[EDIT 2018-12-18]

Hi,

I ran the code out of the box (SVHN to MNIST) and got different behavior depending on the version of TensorFlow. Below I detail the results I got when running the code three times with v1.0.0 (the one used for the paper, I guess), three times with v1.4.0 (two other runs are currently in progress) and three times with v1.12.0.

v1.0.0 >>> OK
v1.4.0 >>> NOT OK
v1.12.0 >>> NOT OK

@erictzeng, do you have any idea what could cause such different behavior?

With TensorFlow v1.0.0 everything is fine, with 76.5%, 76.0% and 75.7% after domain adaptation (except the source-only performance in the third run, which is a little higher than the error margin given in the paper, 60.0 ± 1.1%). Here are the detailed outputs:

RUN 1 in v1.0.0 environment

source only 0.605366666667

[2018-12-12 11:32:24,870] INFO Class accuracies:
[2018-12-12 11:32:24,870] INFO 0.697 0.988 0.931 0.934 0.924 0.879 0.746 0.656 0.586 0.288
[2018-12-12 11:32:24,870] INFO Overall accuracy:
[2018-12-12 11:32:24,870] INFO 0.764983333333

RUN 2 in v1.0.0 environment

source only 0.604383333333

[2018-12-12 13:51:36,650] INFO Class accuracies:
[2018-12-12 13:51:36,651] INFO 0.542 0.977 0.979 0.950 0.898 0.813 0.696 0.648 0.652 0.416
[2018-12-12 13:51:36,651] INFO Overall accuracy:
[2018-12-12 13:51:36,651] INFO 0.759566666667

RUN 3 in v1.0.0 environment

source only 0.643783333333

[2018-12-12 14:58:26,313] INFO Class accuracies:
[2018-12-12 14:58:26,313] INFO 0.580 0.985 0.956 0.931 0.921 0.879 0.746 0.686 0.573 0.296
[2018-12-12 14:58:26,313] INFO Overall accuracy:
[2018-12-12 14:58:26,313] INFO 0.757466666667

RUN 4 in v1.0.0 environment

source only 0.619366666667

[2018-12-18 11:37:20,316] INFO Class accuracies:
[2018-12-18 11:37:20,317] INFO 0.574 0.981 0.978 0.913 0.940 0.880 0.653 0.641 0.686 0.509
[2018-12-18 11:37:20,317] INFO Overall accuracy:
[2018-12-18 11:37:20,317] INFO 0.777066666667

RUN 5 in v1.0.0 environment

source only 0.617466666667

[2018-12-18 14:05:44,093] INFO Class accuracies:
[2018-12-18 14:05:44,093] INFO 0.604 0.741 0.960 0.954 0.715 0.900 0.630 0.638 0.563 0.554
[2018-12-18 14:05:44,093] INFO Overall accuracy:
[2018-12-18 14:05:44,093] INFO 0.725183333333

RUN 6 in v1.0.0 environment

source only 0.61145

[2018-12-19 11:19:29,713] INFO Class accuracies:
[2018-12-19 11:19:29,713] INFO 0.560 0.538 0.887 0.930 0.836 0.924 0.719 0.660 0.761 0.010
[2018-12-19 11:19:29,713] INFO Overall accuracy:
[2018-12-19 11:19:29,714] INFO 0.678783333333

With TensorFlow v1.4.0 the results are already less stable: I got 65.5%, 69.5% and 69.3% after domain adaptation. Here are the detailed outputs:

RUN 1 in v1.4.0 environment

source only 0.5983666666666667

[2018-12-13 17:10:25,714] INFO Class accuracies:
[2018-12-13 17:10:25,714] INFO 0.289 0.972 0.939 0.927 0.560 0.899 0.569 0.624 0.716 0.027
[2018-12-13 17:10:25,714] INFO Overall accuracy:
[2018-12-13 17:10:25,714] INFO 0.6552333333333333

RUN 2 in v1.4.0 environment

source only 0.6204333333333333

[2018-12-14 15:00:22,930] INFO Class accuracies:
[2018-12-14 15:00:22,930] INFO 0.565 0.989 0.923 0.872 0.704 0.860 0.702 0.598 0.609 0.100
[2018-12-14 15:00:22,930] INFO Overall accuracy:
[2018-12-14 15:00:22,930] INFO 0.6948333333333333

RUN 3 in v1.4.0 environment

source only 0.6344166666666666

[2018-12-14 15:34:51,809] INFO Class accuracies:
[2018-12-14 15:34:51,809] INFO 0.560 0.635 0.949 0.915 0.551 0.918 0.628 0.693 0.603 0.492
[2018-12-14 15:34:51,809] INFO Overall accuracy:
[2018-12-14 15:34:51,809] INFO 0.69285

With TensorFlow v1.12.0 the results are again not stable: I got 63.4%, 68.5% and 70.7% after domain adaptation. Here are the detailed outputs:

RUN 1 in v1.12.0 environment

source only 0.60275

[2018-12-10 16:43:19,156] INFO Class accuracies:
[2018-12-10 16:43:19,156] INFO 0.279 0.675 0.971 0.913 0.599 0.893 0.536 0.647 0.643 0.187
[2018-12-10 16:43:19,156] INFO Overall accuracy:
[2018-12-10 16:43:19,156] INFO 0.6340666666666667

RUN 2 in v1.12.0 environment

source only 0.5934

[2018-12-11 11:15:20,546] INFO Class accuracies:
[2018-12-11 11:15:20,546] INFO 0.592 0.980 0.958 0.916 0.489 0.928 0.673 0.724 0.548 0.011
[2018-12-11 11:15:20,547] INFO Overall accuracy:
[2018-12-11 11:15:20,547] INFO 0.6851833333333334

RUN 3 in v1.12.0 environment

source only 0.6303833333333333

[2018-12-11 12:13:15,759] INFO Class accuracies:
[2018-12-11 12:13:15,759] INFO 0.580 0.988 0.943 0.896 0.586 0.859 0.553 0.674 0.612 0.349
[2018-12-11 12:13:15,759] INFO Overall accuracy:
[2018-12-11 12:13:15,759] INFO 0.7073333333333334
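
To compare the versions at a glance, the overall ADDA accuracies logged above can be reduced to a mean and a sample standard deviation per TensorFlow version. This is only a minimal sketch: the names `OVERALL_ACCURACY`, `summarize` and `SUMMARY` are made up for illustration, the numbers are copied from the "Overall accuracy" lines in this comment, and with three to six runs per version the spread is a rough indication at best.

```python
from statistics import mean, stdev

# Overall ADDA accuracies copied from the "Overall accuracy" log lines above.
# The dictionary name and layout are illustrative, not part of the ADDA repo.
OVERALL_ACCURACY = {
    "1.0.0": [0.764983333333, 0.759566666667, 0.757466666667,
              0.777066666667, 0.725183333333, 0.678783333333],
    "1.4.0": [0.6552333333333333, 0.6948333333333333, 0.69285],
    "1.12.0": [0.6340666666666667, 0.6851833333333334, 0.7073333333333334],
}


def summarize(runs):
    """Return (mean, sample standard deviation) for a list of accuracies."""
    return mean(runs), stdev(runs)


SUMMARY = {version: summarize(runs) for version, runs in OVERALL_ACCURACY.items()}

if __name__ == "__main__":
    for version, (avg, spread) in SUMMARY.items():
        n_runs = len(OVERALL_ACCURACY[version])
        print(f"TF v{version}: mean={avg:.4f} std={spread:.4f} n={n_runs}")
```

Note that runs 5 and 6 of v1.0.0 pull its mean down, so the gap between versions is smaller than the first three runs alone suggest.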

wheatdog commented 5 years ago

This is shocking yet interesting. I wonder what the highest TensorFlow version is on which ADDA still works properly. Maybe we can start investigating from there.

wheatdog commented 5 years ago

This is what I get.

tensorflow 1.0.1

RUN 1

Source only baseline:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
[2018-12-18 09:53:16,621] INFO     Using GPU 0
[2018-12-18 09:53:17,009] INFO     Resizing images to [28, 28]
[2018-12-18 09:53:18,022] INFO     Evaluating snapshot/lenet_svhn/lenet_svhn-10000
[2018-12-18 09:55:44,611] INFO     Class accuracies:
[2018-12-18 09:55:44,611] INFO         0.714  0.861  0.645  0.819  0.742  0.898  0.295  0.757  0.568  0.272
[2018-12-18 09:55:44,611] INFO     Overall accuracy:
[2018-12-18 09:55:44,611] INFO         0.65885
ADDA:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
[2018-12-18 09:55:46,471] INFO     Using GPU 0
[2018-12-18 09:55:46,920] INFO     Resizing images to [28, 28]
[2018-12-18 09:56:39,221] INFO     Evaluating snapshot/adda_lenet_svhn_mnist/adda_lenet_svhn_mnist-10000
[2018-12-18 09:58:04,955] INFO     Class accuracies:
[2018-12-18 09:58:04,955] INFO         0.593  0.571  0.979  0.887  0.721  0.917  0.270  0.700  0.633  0.557
[2018-12-18 09:58:04,955] INFO     Overall accuracy:
[2018-12-18 09:58:04,956] INFO         0.6803333333333333

RUN 2

Source only baseline:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
[2018-12-18 09:55:58,899] INFO     Using GPU 0
[2018-12-18 09:55:59,350] INFO     Resizing images to [28, 28]
[2018-12-18 09:56:35,349] INFO     Evaluating snapshot/lenet_svhn/lenet_svhn-10000
[2018-12-18 09:58:06,222] INFO     Class accuracies:
[2018-12-18 09:58:06,222] INFO         0.722  0.662  0.579  0.809  0.817  0.834  0.424  0.743  0.615  0.243
[2018-12-18 09:58:06,222] INFO     Overall accuracy:
[2018-12-18 09:58:06,222] INFO         0.6443666666666666
ADDA:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
[2018-12-18 09:58:08,205] INFO     Using GPU 0
[2018-12-18 09:58:09,313] INFO     Resizing images to [28, 28]
[2018-12-18 09:58:36,527] INFO     Evaluating snapshot/adda_lenet_svhn_mnist/adda_lenet_svhn_mnist-10000
[2018-12-18 10:01:05,300] INFO     Class accuracies:
[2018-12-18 10:01:05,300] INFO         0.630  0.982  0.942  0.873  0.694  0.925  0.580  0.701  0.670  0.521
[2018-12-18 10:01:05,300] INFO     Overall accuracy:
[2018-12-18 10:01:05,300] INFO         0.7539

RUN 3

Source only baseline:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
[2018-12-18 10:01:06,622] INFO     Using GPU 0
[2018-12-18 10:01:06,982] INFO     Resizing images to [28, 28]
[2018-12-18 10:01:08,043] INFO     Evaluating snapshot/lenet_svhn/lenet_svhn-10000
[2018-12-18 10:03:08,492] INFO     Class accuracies:
[2018-12-18 10:03:08,492] INFO         0.432  0.824  0.636  0.758  0.750  0.623  0.400  0.701  0.596  0.271
[2018-12-18 10:03:08,492] INFO     Overall accuracy:
[2018-12-18 10:03:08,493] INFO         0.6027
ADDA:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
[2018-12-18 10:03:10,439] INFO     Using GPU 0
[2018-12-18 10:03:10,905] INFO     Resizing images to [28, 28]
[2018-12-18 10:04:14,689] INFO     Evaluating snapshot/adda_lenet_svhn_mnist/adda_lenet_svhn_mnist-10000
[2018-12-18 10:06:03,825] INFO     Class accuracies:
[2018-12-18 10:06:03,826] INFO         0.620  0.994  0.962  0.854  0.748  0.910  0.535  0.680  0.652  0.559
[2018-12-18 10:06:03,826] INFO     Overall accuracy:
[2018-12-18 10:06:03,826] INFO         0.7535666666666667

tensorflow 1.2.1

RUN 1

[2018-12-17 23:54:51,367] INFO     Class accuracies:
[2018-12-17 23:54:51,368] INFO         0.591  0.748  0.966  0.942  0.903  0.888  0.795  0.615  0.335  0.519
[2018-12-17 23:54:51,368] INFO     Overall accuracy:
[2018-12-17 23:54:51,368] INFO         0.7296

RUN 2

Source only baseline:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
[2018-12-18 00:28:08,490] INFO     Using GPU 0
[2018-12-18 00:28:08,847] INFO     Resizing images to [28, 28]
[2018-12-18 00:28:09,642] INFO     Evaluating snapshot/lenet_svhn/lenet_svhn-10000
[2018-12-18 00:28:09,643] INFO     Restoring parameters from snapshot/lenet_svhn/lenet_svhn-10000
[2018-12-18 00:29:14,156] INFO     Class accuracies:
[2018-12-18 00:29:14,157] INFO         0.469  0.903  0.600  0.853  0.888  0.735  0.359  0.754  0.327  0.318
[2018-12-18 00:29:14,157] INFO     Overall accuracy:
[2018-12-18 00:29:14,157] INFO         0.6249166666666667
ADDA:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
[2018-12-18 00:29:20,654] INFO     Using GPU 0
[2018-12-18 00:29:21,023] INFO     Resizing images to [28, 28]
[2018-12-18 00:29:21,750] INFO     Evaluating snapshot/adda_lenet_svhn_mnist/adda_lenet_svhn_mnist-10000
[2018-12-18 00:29:21,750] INFO     Restoring parameters from snapshot/adda_lenet_svhn_mnist/adda_lenet_svhn_mnist-10000
[2018-12-18 00:30:32,192] INFO     Class accuracies:
[2018-12-18 00:30:32,192] INFO         0.561  0.986  0.967  0.938  0.690  0.888  0.605  0.670  0.635  0.302
[2018-12-18 00:30:32,193] INFO     Overall accuracy:
[2018-12-18 00:30:32,193] INFO         0.7270666666666666

RUN 3

Source only baseline:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
[2018-12-18 00:56:15,868] INFO     Using GPU 0
[2018-12-18 00:56:16,777] INFO     Resizing images to [28, 28]
[2018-12-18 00:56:17,506] INFO     Evaluating snapshot/lenet_svhn/lenet_svhn-10000
[2018-12-18 00:56:17,506] INFO     Restoring parameters from snapshot/lenet_svhn/lenet_svhn-10000
[2018-12-18 00:57:38,637] INFO     Class accuracies:
[2018-12-18 00:57:38,638] INFO         0.564  0.908  0.574  0.830  0.600  0.818  0.342  0.720  0.546  0.285
[2018-12-18 00:57:38,638] INFO     Overall accuracy:
[2018-12-18 00:57:38,638] INFO         0.6221833333333333
ADDA:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
[2018-12-18 00:57:45,369] INFO     Using GPU 0
[2018-12-18 00:57:46,530] INFO     Resizing images to [28, 28]
[2018-12-18 00:57:47,235] INFO     Evaluating snapshot/adda_lenet_svhn_mnist/adda_lenet_svhn_mnist-10000
[2018-12-18 00:57:47,235] INFO     Restoring parameters from snapshot/adda_lenet_svhn_mnist/adda_lenet_svhn_mnist-10000
[2018-12-18 00:59:11,632] INFO     Class accuracies:
[2018-12-18 00:59:11,633] INFO         0.627  0.981  0.978  0.874  0.879  0.908  0.817  0.629  0.692  0.607
[2018-12-18 00:59:11,633] INFO     Overall accuracy:
[2018-12-18 00:59:11,633] INFO         0.8000166666666667
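
Copying these numbers by hand gets tedious across many runs, so here is a small sketch of how the overall accuracies could be pulled out of saved logs. It assumes the log layout shown in the outputs above (an "Overall accuracy:" line followed by a line carrying a single number); the function name `overall_accuracies` and the `SAMPLE_LOG` constant are hypothetical and not part of the ADDA code.

```python
import re

# Matches a log line that carries only one number, for example
# "[2018-12-18 09:58:04,956] INFO         0.6803333333333333".
_VALUE_LINE = re.compile(r"^\[[^\]]+\]\s+INFO\s+([0-9.]+)\s*$")


def overall_accuracies(log_text):
    """Return every value logged on the line right after 'Overall accuracy:'."""
    values = []
    expect_value = False
    for line in log_text.splitlines():
        if line.rstrip().endswith("Overall accuracy:"):
            expect_value = True
            continue
        if expect_value:
            match = _VALUE_LINE.match(line.strip())
            if match:
                values.append(float(match.group(1)))
            expect_value = False
    return values


# Excerpt of RUN 1 with tensorflow 1.0.1 from this comment.
SAMPLE_LOG = """\
[2018-12-18 09:55:44,611] INFO     Class accuracies:
[2018-12-18 09:55:44,611] INFO         0.714  0.861  0.645  0.819  0.742  0.898  0.295  0.757  0.568  0.272
[2018-12-18 09:55:44,611] INFO     Overall accuracy:
[2018-12-18 09:55:44,611] INFO         0.65885
[2018-12-18 09:58:04,955] INFO     Overall accuracy:
[2018-12-18 09:58:04,956] INFO         0.6803333333333333
"""

if __name__ == "__main__":
    source_only, adda = overall_accuracies(SAMPLE_LOG)
    print(f"source only: {source_only:.4f}, ADDA: {adda:.4f}")
```

In a full log the first value is the source-only baseline and the second the ADDA result, in the order the script prints them.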

emmanuelrouxfr commented 5 years ago

Thanks @wheatdog for your quick tests with those two versions of TensorFlow (v1.0.1, v1.2.1). Apparently there is already a drop in performance in the first run of v1.0.1 (68.0%). But considering that the two other runs of v1.0.1 (75.4% twice) are consistent with the paper's results, it is hard to conclude that the code was broken by the change from v1.0.0 to v1.0.1... I will run v1.0.0 several more times to check whether such a drop happens there too.

In contrast, the tests you provided with v1.2.1 show that something has changed substantially after v1.0.1 (73% and 80%...)!

[EDIT 2018-12-19] Indeed, a drop happened in run 6 of v1.0.0 (here: https://github.com/erictzeng/adda/issues/4#issuecomment-445869923). So the code was probably not broken between v1.0.0 and v1.0.1.

wheatdog commented 5 years ago

Thanks @emmanuelrouxfr. I also want to know whether ADDA actually behaves more consistently in v1.0.0.

BTW, I install different versions of TensorFlow using conda like this.

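#!/bin/bash
# Set up a conda environment for one TensorFlow version and run the
# SVHN -> MNIST experiment in a per-version copy of the repo.

TF_VERSION=1.0.1

# Work in a separate copy of the repo for this version.
cp -r adda "adda-$TF_VERSION"
cd "adda-$TF_VERSION" || exit 1

conda create -n "adda-$TF_VERSION" python=3.6 tensorflow-gpu=$TF_VERSION -y

source activate "adda-$TF_VERSION"

# Install only the last 7 lines of requirements.txt.
pip install -r <(tail -n7 requirements.txt)
export PYTHONPATH="$PWD:$PYTHONPATH"
scripts/svhn-mnist.sh

echo "Using $TF_VERSION"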
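
Since the accuracy varies a lot from run to run, it may help to repeat the experiment several times per environment and keep one log file per run. The following is a rough sketch, not something from the ADDA repo: `run_trials` and the `logs` directory are hypothetical names, and it assumes it is started from the repo root inside the activated conda environment, with `PYTHONPATH` exported as in the script above.

```python
import subprocess
from pathlib import Path


def run_trials(command, n_runs, log_dir):
    """Run `command` n_runs times and save the output of each run to a file."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    log_paths = []
    for run in range(1, n_runs + 1):
        log_path = log_dir / f"run{run}.log"
        with open(log_path, "w") as log_file:
            # Merge stderr into the same file in case the logger writes there.
            subprocess.run(command, stdout=log_file,
                           stderr=subprocess.STDOUT, check=True)
        log_paths.append(log_path)
    return log_paths


# Example call (assumption: executed from the repo root):
# run_trials(["bash", "scripts/svhn-mnist.sh"], n_runs=3, log_dir="logs")
```

`check=True` stops the loop at the first failing run, so a broken environment does not silently produce empty logs.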