pmorerio / minimal-entropy-correlation-alignment

Code for the paper "Minimal-Entropy Correlation Alignment for Unsupervised Deep Domain Adaptation", ICLR 2018
MIT License

Accuracy is lower than the paper. #1

Closed: deep0learning closed this issue 6 years ago

deep0learning commented 6 years ago

Hi! I have re-run your code and got 90% accuracy for both log-d-coral and d-coral. Is there anything missing in the code? I just used the following commands:

for log-d-coral:
```
python main.py --mode='train' --method='log-d-coral' --alpha=1. --device='/gpu:0'
python main.py --mode='test' --method='log-d-coral' --alpha=1. --device='/gpu:0'
```

for d-coral:
```
python main.py --mode='train' --method='d-coral' --alpha=1. --device='/gpu:0'
python main.py --mode='test' --method='d-coral' --alpha=1. --device='/gpu:0'
```

pmorerio commented 6 years ago

Hi. As we argue in the paper, the hyper-parameter alpha must be validated. Since there is no validation set in unsupervised domain adaptation, one cannot simply choose the alpha that gives the best results on the target set. Instead, we show that one can safely choose the alpha which minimizes the entropy. You will notice that entropy is printed together with accuracy in the testing function. Fig. 2 in the paper illustrates this phenomenon and gives a hint on the value of alpha (named lambda in the paper), which should be around 7 for log-D-CORAL.
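For reference, the entropy in question is the average entropy of the network's softmax predictions on the target test set. A minimal NumPy sketch of how such a quantity can be computed (the names are illustrative, not the repository's API):

```python
import numpy as np

def mean_prediction_entropy(probs, eps=1e-8):
    """Average entropy of softmax outputs `probs` with shape (N, num_classes).

    Lower values indicate more confident target predictions; the alpha
    achieving the minimum entropy is the one to select.
    """
    p = np.clip(probs, eps, 1.0)
    return float(np.mean(-np.sum(p * np.log(p), axis=1)))
```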

redhat12345 commented 6 years ago

Thank you so much.

pmorerio commented 6 years ago

@redhat12345 You are welcome. Actually, I just ran the code with --alpha=7. and got test acc [0.969], which is even higher than the result reported in the paper. But of course this is not the way to validate the result: one should run the code for a range of alpha values and pick the one that minimizes the entropy, as sketched below.
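A minimal sketch of such a sweep (the flags are those from the commands above; the alpha grid is only an example):

```python
import subprocess

# Train and test for each candidate alpha; the test run prints both
# accuracy and entropy. Select the alpha with the lowest entropy.
for alpha in ['1.', '3.', '5.', '7.', '9.']:
    for mode in ['train', 'test']:
        subprocess.run(['python', 'main.py', '--mode=%s' % mode,
                        '--method=log-d-coral', '--alpha=%s' % alpha,
                        '--device=/gpu:0'], check=True)
```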

redhat12345 commented 6 years ago

In your code you are using the following architecture. Is it the LeNet architecture? What are the dimensions of fc4 and fc5? Could you explain a bit more, please? Is the dimension of fc3 n x 1024?

```python
# TF 1.x; assumes `import tensorflow as tf` and `slim = tf.contrib.slim`.
def E(self, images, is_training=False, reuse=False):
    # Convert 3-channel inputs to grayscale so source and target match.
    if images.get_shape()[3] == 3:
        images = tf.image.rgb_to_grayscale(images)

    with tf.variable_scope('encoder', reuse=reuse):
        with slim.arg_scope([slim.fully_connected], activation_fn=tf.nn.relu):
            with slim.arg_scope([slim.conv2d], activation_fn=tf.nn.relu, padding='VALID'):
                net = slim.conv2d(images, 64, 5, scope='conv1')
                net = slim.max_pool2d(net, 2, stride=2, scope='pool1')
                net = slim.conv2d(net, 128, 5, scope='conv2')
                net = slim.max_pool2d(net, 2, stride=2, scope='pool2')
                net = tf.contrib.layers.flatten(net)
                net = slim.fully_connected(net, 1024, activation_fn=tf.nn.relu, scope='fc3')
                net = slim.dropout(net, 0.5, is_training=is_training)
                net = slim.fully_connected(net, self.hidden_repr_size,
                                           activation_fn=tf.tanh, scope='fc4')
                # dropout here or not?
                #~ net = slim.dropout(net, 0.5, is_training=is_training)

    return net
```

pmorerio commented 6 years ago

Hi, you can find the details in the paper, Appendix D.2. If you need more details please open a dedicated issue.
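For a quick sanity check, here is how the tensor shapes in the snippet above work out, assuming 32x32 input images (an assumption here; Appendix D.2 specifies the actual setup):

```python
# VALID conv: out = in - kernel + 1; a 2x2 max-pool with stride 2 halves each side.
side = 32
side = side - 5 + 1        # conv1 (5x5, VALID) -> 28x28x64
side = side // 2           # pool1              -> 14x14x64
side = side - 5 + 1        # conv2 (5x5, VALID) -> 10x10x128
side = side // 2           # pool2              -> 5x5x128
flat = side * side * 128   # flatten            -> 3200
# fc3 maps 3200 -> 1024; fc4 maps 1024 -> hidden_repr_size.
```

In particular, fc3 indeed outputs n x 1024, and fc4 outputs n x hidden_repr_size; there is no fc5 in this encoder snippet.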

redhat12345 commented 6 years ago

I got it. Thank you so much again.