SSARCandy / DeepCORAL

🧠 A PyTorch implementation of 'Deep CORAL: Correlation Alignment for Deep Domain Adaptation.', ECCV 2016
https://ssarcandy.tw/2017/10/31/deep-coral/

Test accuracy is lower than the test accuracy reported in the paper #2

Closed redhat12345 closed 6 years ago

redhat12345 commented 6 years ago

Test accuracy is lower than the test accuracy reported in the paper. Can you fix it, please?

SSARCandy commented 6 years ago

That's true, I will try to improve it.

redhat12345 commented 6 years ago

In the original implementation, they used a pre-trained model. Did you use a pre-trained model?

SSARCandy commented 6 years ago

I did use an ImageNet pre-trained model, please see https://github.com/SSARCandy/DeepCORAL/blob/master/main.py#L147
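In case it helps debugging, the loading step is roughly like this (a simplified sketch, not the exact code in the repo; the weight URL and helper name here are only illustrative):

```python
import torch.utils.model_zoo as model_zoo

# torchvision's ImageNet AlexNet weights (URL shown for illustration only)
ALEXNET_URL = 'https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth'

def load_pretrained(model):
    """Copy ImageNet weights into `model`, keeping the randomly initialized
    final classifier (which is re-sized for the 31 Office classes)."""
    pretrained = model_zoo.load_url(ALEXNET_URL)
    state = model.state_dict()
    for name, tensor in pretrained.items():
        # only copy weights whose name and shape match the current model
        if name in state and state[name].shape == tensor.shape:
            state[name] = tensor
    model.load_state_dict(state)
```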

redhat12345 commented 6 years ago

Yes. Sorry for commenting without noticing that. Thank you for your kind reply. I still could not find the reason for the low accuracy.

SSARCandy commented 6 years ago

Although the accuracy is lower than what the paper claims, I think the difference (improvement) between training with and without the CORAL loss is still noticeable.

But the low accuracy is still a problem; I'll try to find out why.
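For reference, the CORAL loss in the paper is just the squared Frobenius distance between the source and target feature covariances, scaled by 1/(4d²). A minimal sketch, assuming `source` and `target` are batch × d activation matrices (simplified, and not necessarily line-for-line what this repo does):

```python
import torch

def coral_loss(source, target):
    # source, target: (batch, d) activations from the same layer
    d = source.size(1)

    def covariance(x):
        n = x.size(0)
        xm = x - x.mean(dim=0, keepdim=True)
        return (xm.t() @ xm) / (n - 1)

    diff = covariance(source) - covariance(target)
    # squared Frobenius norm of the covariance difference, scaled as in the paper
    return (diff ** 2).sum() / (4 * d * d)
```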

redhat12345 commented 6 years ago

Yes...absolutely.

SSARCandy commented 6 years ago

I think we can narrow the accuracy issue down to the task without the CORAL loss, since even without the CORAL loss the test accuracy is not as good as the paper claims (mine: ~47%, paper: ~55%).

I haven't found out why 😞. @redhat12345, could you help with this issue? Thanks!

ZhangJUJU commented 6 years ago

These days I have been trying to find the problem, but I have not fixed it yet. Right now I am trying to write a custom dataloader, since I guess the problem is caused by the PyTorch dataloader. Please wait a few days and I'll see whether I can fix it.

ZhangJUJU commented 6 years ago

Sorry, I made a mistake. Today I tried another method of shuffling the dataset, but it failed. I'll keep trying to fix the problem.

redhat12345 commented 6 years ago

@SSARCandy I think the problem comes from data preprocessing. The way PyTorch processes data is different from Caffe. I am also trying to fix it.
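For example (this is a guess at the cause, not a verified fix): Caffe usually feeds BGR images in the 0-255 range with a per-channel ImageNet mean subtracted, while the default torchvision pipeline gives RGB tensors in 0-1 normalized by the ImageNet mean/std. A Caffe-style transform would look roughly like this:

```python
from torchvision import transforms

# Sketch of Caffe-style preprocessing: RGB -> BGR, rescale to 0-255, subtract BGR mean.
caffe_style = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.CenterCrop(227),                               # AlexNet crop size used in Caffe
    transforms.ToTensor(),                                    # RGB, float in [0, 1]
    transforms.Lambda(lambda x: x[[2, 1, 0], :, :] * 255.0),  # reorder to BGR, rescale to [0, 255]
    transforms.Normalize(mean=[104.0, 117.0, 123.0],          # BGR ImageNet mean
                         std=[1.0, 1.0, 1.0]),
])
```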

ZhangJUJU commented 6 years ago

@redhat12345 you may need to rewrite the dataloader, or find another workaround.

redhat12345 commented 6 years ago

@SSARCandy

I will write a dataloader script and update you.

redhat12345 commented 6 years ago

Did you try it the following way? (http://pytorch.org/docs/master/notes/autograd.html)

```python
import torch.nn as nn
import torchvision

model = torchvision.models.alexnet(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
# AlexNet has no `.fc` attribute; its last fully-connected layer is classifier[6]
model.classifier[6] = nn.Linear(4096, 31)
```

ZhangJUJU commented 6 years ago

@redhat12345 @SSARCandy In my view, changing the network structure from AlexNet to ResNet-18 or another backbone is not a good way to solve the low accuracy, because the mainstream algorithms this one is compared against all use AlexNet as the network structure. In the paper, the author compares his algorithm against the others using both AlexNet and GoogLeNet, and every algorithm performs better on GoogLeNet than on AlexNet. So we should keep looking for the problem in PyTorch's training step or in the dataloader.

SSARCandy commented 6 years ago

@ZhangJUJU I agree.

And the paper uses AlexNet, so a re-implementation of their algorithm should use the same structure.

ZhangJUJU commented 6 years ago

Yes. @redhat12345 suggested you use ResNet-18, and I don't think that is a good idea. Many researchers use AlexNet because it has three fully-connected layers and is easy to train, while ResNet-18 has 18 conv layers and only one fully-connected layer. So even without any DA method, simply fine-tuning a pre-trained ResNet-18 to adapt to the target domain may perform better than some older DA algorithms on AlexNet, e.g. DDC or DAN. I think the key problem is how to make the Deep CORAL method work well with AlexNet.

redhat12345 commented 6 years ago

Yes, I agree with you. @SSARCandy @ZhangJUJU

jdily commented 6 years ago

I am curious whether the low accuracy really comes from the data preprocessing, as @redhat12345 and @ZhangJUJU said. As shown in the original prototxt, it also uses the ImageNet mean, which is also the case in this implementation.

debasmitdas commented 6 years ago

I think the better accuracy is due to an additional mean loss function they used in the Caffe prototxt:

```
layer {
  name: "mean_loss"
  type: "EuclideanLoss"
  bottom: "mean_source"
  bottom: "mean_target"
  top: "mean_loss"
  loss_weight: 0
  include { phase: TRAIN }
}
```
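A rough PyTorch equivalent of that layer would be an L2 distance between the per-feature batch means of the two domains (a sketch of the idea, not code from this repo):

```python
import torch

def mean_loss(source, target):
    # Euclidean distance between the per-feature batch means of the source and
    # target activations, roughly what the EuclideanLoss on
    # "mean_source"/"mean_target" computes in the Caffe prototxt.
    return torch.sum((source.mean(dim=0) - target.mean(dim=0)) ** 2)
```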

debasmitdas commented 6 years ago

Also, the AlexNet used in this repository (the "one weird trick" variant from torchvision, which has no grouped convolutions or local response normalization) is different from the original AlexNet used by CORAL.