Refactoring to reproduce paper results

marcociccone commented 4 years ago

Hi, first of all congrats on your work and thanks for sharing the code :) I've read the paper and checked the code, and I found a few inconsistencies, so I thought it would be good to refactor your codebase to get a better understanding of all the pieces. I am going to ask you a few questions that I hope will help me and other researchers build upon your great work.

I've used the DigitFive dataset you provided. From my understanding, the code requires the images from the svhn and syn domains to be resized with the resize_from32x32to28x28.m script. Unfortunately, I couldn't find the train and test splits for the syn domain in the zip file, so I wasn't able to generate the data. I could use your help here. To be more specific, I'm referring to the files synth_train28x28.mat and synth_test28x28.mat.
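For reference, here is a minimal sketch of the 32x32 to 28x28 resizing step in plain Python. The thread does not show which interpolation resize_from32x32to28x28.m uses, so nearest-neighbour sampling is only a stand-in here, and the helper name `resize_nearest` is hypothetical.

```python
def resize_nearest(image, out_h=28, out_w=28):
    """Resize a 2D image (list of rows) with nearest-neighbour sampling.

    Assumption: the original MATLAB script may use a different
    interpolation; this only illustrates the change of shape.
    """
    in_h = len(image)
    in_w = len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# A synthetic 32x32 "image" whose pixel value encodes its position.
image_32 = [[r * 32 + c for c in range(32)] for r in range(32)]
image_28 = resize_nearest(image_32)
```

Each output pixel copies the source pixel at the proportionally scaled row and column, so the output is a subsampled view of the input rather than a smoothed one.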
I was confused about the roles of C_0, C_1, and C_2 in the code, since the paper only mentions two classifiers. I still don't see why three classifiers are needed. Could you please clarify this point?
Reading the paper, it seems that the indices have the following meanings. Could you confirm this?
- C_0, D_0 --> domain-specific (ds)
- C_1, D_1 --> domain-independent (di)
- C_2, D_2 --> class-independent (ci)
Also, I can't find any mention of the discrepancy losses in the paper. Could you clarify this point too?
Apart from the implementation questions, the loss becomes NaN quite early in training. I will play around with the hyper-parameters. I had to remove the syn dataset for now, so the hyper-parameters you used may need to be adjusted to account for this change.
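One way to pin down when the loss first diverges is to scan the logged loss values for the first non-finite entry. The sketch below is illustrative only: the helper `first_nonfinite_step` and the sample history are hypothetical and not part of the DAL codebase.

```python
import math


def first_nonfinite_step(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


# Made-up loss history, used only to show the call.
history = [2.31, 1.87, float("nan"), 1.50]
bad_step = first_nonfinite_step(history)
```

Stopping at that step and inspecting the individual loss terms there usually narrows down which term is responsible.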

Thank you for your help!

marcociccone commented 4 years ago

Hi @xcpeng, I'm not sure you'd want to merge my PR into your master yet, as I'm still debugging it. I've found a couple of bugs; in particular, the Mutual Information minimization was wrong. Do you have time to check the correctness of the implementation with me?

marcociccone commented 4 years ago

A few more details: I believe the code was minimizing the MI between (di, ci) and (ds, ci) instead of (di, ci) and (di, ds) as described in the paper. After fixing this, I no longer get the NaN loss, but training is still very unstable.
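To make the difference between the two pairings concrete, here is a small sketch. It assumes the index-to-role mapping proposed earlier in this thread (0 = ds, 1 = di, 2 = ci), which the authors have not confirmed. The function `mutual_information_loss` and the `estimate_mi` callable are hypothetical placeholders, not the repository's actual API.

```python
# Index-to-role mapping as proposed in this thread (unconfirmed assumption).
FEATURE_INDEX = {"ds": 0, "di": 1, "ci": 2}

# Pairs the code appeared to use before the fix, and the pairs
# this thread attributes to the paper.
PAIRS_BEFORE_FIX = [("di", "ci"), ("ds", "ci")]
PAIRS_AFTER_FIX = [("di", "ci"), ("di", "ds")]


def mutual_information_loss(features, pairs, estimate_mi):
    """Sum a caller-supplied MI estimate over the named feature pairs.

    `features` is indexed by the 0/1/2 enumeration above and
    `estimate_mi` is any callable taking two feature values.
    """
    return sum(
        estimate_mi(features[FEATURE_INDEX[a]], features[FEATURE_INDEX[b]])
        for a, b in pairs
    )


# Toy check with scalar "features" and a product as a stand-in estimator.
toy_features = [2, 3, 5]  # ds, di, ci


def toy_estimator(x, y):
    return x * y


loss_before = mutual_information_loss(toy_features, PAIRS_BEFORE_FIX, toy_estimator)
loss_after = mutual_information_loss(toy_features, PAIRS_AFTER_FIX, toy_estimator)
```

Keeping the pairs as named data rather than hard-coded indices makes it easier to check them against the paper.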

I've integrated tensorboard so you can easily visualize the loss functions. Let me know what you think!

YiDongOuYang commented 4 years ago

@marcociccone Thank you very much! I don't think the paper is solid work, since the author cannot explain the results or the misleading code. But I do appreciate your efforts.

VisionLearningGroup / DAL

Refactoring to reproduce paper results #4