VisionLearningGroup / DAL

Domain agnostic learning with disentangled representations
143 stars · 28 forks

What do D0, D1, and D2 (C0, C1, and C2) represent here? #2

Open cwpeng-cn opened 5 years ago

cwpeng-cn commented 5 years ago

Thanks for sharing your code! I have a few questions and hope to discuss them with you.

  1. The paper contains only two class predictors. Why are there three class predictors C0, C1, and C2 in the code?

  2. What do D0, D1, and D2 represent? The code maximizes the mutual information between (d0, d2) and (d1, d2), which suggests that d2 is the domain-invariant feature and D2 is the domain-invariant disentangler (paper Section 3). But the code also performs adversarial alignment on D1, which would make D1 the domain-invariant disentangler.

  3. What does this code calculate? I couldn't find an explanation in the paper.

Looking forward to your reply. Thank you!

zorrocai commented 4 years ago

What is more, why are there three disentanglers? Aren't the domain-invariant, domain-specific, and class-irrelevant features supposed to be disentangled by a single disentangler, as illustrated in Figure 1?

buerzlh commented 4 years ago

> What is more, why are there three disentanglers? Aren't the domain-invariant, domain-specific, and class-irrelevant features supposed to be disentangled by a single disentangler, as illustrated in Figure 1?

Sorry to bother you. I am also reading this code. Since the datasets are not included, I ran experiments on my own dataset, but the results were poor. Could you tell me where to get the datasets used by this code?

zorrocai commented 4 years ago

Digit-Five may need to be collected by yourself.
Office-Caltech10: https://people.eecs.berkeley.edu/~jhoffman/domainadapt/
DomainNet: http://ai.bu.edu/M3SDA/

buerzlh commented 4 years ago

> Digit-Five may need to be collected by yourself.
> Office-Caltech10: https://people.eecs.berkeley.edu/~jhoffman/domainadapt/
> DomainNet: http://ai.bu.edu/M3SDA/

Thank you very much for the dataset links. I still have a question: have you read this code carefully? I think it differs from the paper in several places, such as the training of D0, D1, D2 and C0, C1, C2, and the calculation of the ring loss. I don't know whether the code is wrong or I am misunderstanding it, and I'd appreciate your guidance. Thank you.

ne-bo commented 4 years ago

Yes, it looks like the code doesn't match the paper.

marcociccone commented 4 years ago

Hi, if anyone is interested, I am trying to reproduce the results in my fork, starting from the original repo. Any help would be great! So far I'm still struggling; I described the current status in #4. I hope the authors will answer and help us out!

mmderakhshani commented 4 years ago

Hi @marcociccone. Were you able to reproduce the paper results?

marcociccone commented 4 years ago

@mmderakhshani No, unfortunately I gave up, since the paper is unclear and far from reproducible.

baiwenjia commented 4 years ago

I was also a bit confused by several parts of the code. For example, the `adversarial_alignment()` function is supposed to train a GAN in a minimax fashion, i.e. first optimize G and then optimize D. But the code optimizes G and D together (`self.group_opt_step(['FD', 'D_di', 'G'])`), which just pushes the parameters back and forth randomly. There is also a step that minimizes the discrepancy between domain-specific and domain-invariant features (`loss_dis = _discrepancy(...)`). This seems to contradict the paper, which says domain-specific and domain-invariant features should be pushed far apart from each other, not made similar.

```python
def adversarial_alignment(self, img_src, img_trg):
    # FD should guess whether the features f_di = DI(G(im))
    # are from the target or source domain. To win this game and fool FD,
    # DI should extract domain-invariant features.

    # Loss measures the features' ability to fool the discriminator
    src_domain_pred = self.FD(self.D['di'](self.G(img_src)))
    tgt_domain_pred = self.FD(self.D['di'](self.G(img_trg)))
    df_loss_src = self.adv_loss(src_domain_pred, self.src_domain_code)
    df_loss_trg = self.adv_loss(tgt_domain_pred, self.trg_domain_code)
    alignment_loss1 = 0.01 * (df_loss_src + df_loss_trg)
    alignment_loss1.backward()
    self.group_opt_step(['FD', 'D_di', 'G'])

    # Measure the discriminator's ability to tell source from target samples
    src_domain_pred = self.FD(self.D['di'](self.G(img_src)))
    tgt_domain_pred = self.FD(self.D['di'](self.G(img_trg)))
    df_loss_src = self.adv_loss(src_domain_pred, 1 - self.src_domain_code)
    df_loss_trg = self.adv_loss(tgt_domain_pred, 1 - self.trg_domain_code)
    alignment_loss2 = 0.01 * (df_loss_src + df_loss_trg)
    alignment_loss2.backward()
    self.group_opt_step(['FD', 'D_di', 'G'])

    for _ in range(self.num_k):
        loss_dis = _discrepancy(
            self.C['ds'](self.D['ds'](self.G(img_trg))),
            self.C['di'](self.D['di'](self.G(img_trg))))
        loss_dis.backward()
        self.group_opt_step(['G'])
    return alignment_loss1, alignment_loss2, loss_dis
```
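For comparison, standard minimax GAN training alternates the two players: the discriminator step updates only D (with G frozen, e.g. via `detach()` in PyTorch), and the generator step updates only G (with D frozen), rather than stepping everything together as `group_opt_step(['FD', 'D_di', 'G'])` does. A minimal self-contained toy sketch of that alternation (all names and numbers here are illustrative, not from this repo; scalar "players" with analytic gradients stand in for the networks):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def d_step(d, g, real, lr):
    """Discriminator phase: the generator parameter g is FROZEN.
    D(x) = sigmoid(x - d); loss = -log D(real) - log(1 - D(fake))."""
    s_real, s_fake = sigmoid(real - d), sigmoid(g - d)
    grad_d = (1.0 - s_real) - s_fake   # analytic d(loss)/dd
    return d - lr * grad_d             # only d is updated

def g_step(d, g, lr):
    """Generator phase: the discriminator parameter d is FROZEN.
    Non-saturating generator loss = -log D(fake)."""
    s_fake = sigmoid(g - d)
    grad_g = -(1.0 - s_fake)           # analytic d(loss)/dg
    return g - lr * grad_g             # only g is updated

real, d, g, lr = 4.0, 1.0, 0.0, 0.5
for _ in range(5):
    d = d_step(d, g, real, lr)  # G held fixed during the D update
    g = g_step(d, g, lr)        # D held fixed during the G update
```

The key structural point is that each phase returns a new value for exactly one player; updating both with the same gradient, as in the code above, does not implement the minimax game.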
baiwenjia commented 4 years ago

The mutual information neural estimator (MINE) was also a bit confusing. The code implements a multi-layer perceptron that outputs a scalar value, rather than the probability distribution I expected to be needed for calculating mutual information. Maybe I need to read the original MINE paper to understand this.

```python
import torch.nn as nn
import torch.nn.functional as F

class Mine(nn.Module):
    def __init__(self):
        super(Mine, self).__init__()
        self.fc1_x = nn.Linear(2048, 512)
        self.fc1_y = nn.Linear(2048, 512)
        self.fc2 = nn.Linear(512, 1)

    def forward(self, x, y):
        h1 = F.leaky_relu(self.fc1_x(x) + self.fc1_y(y))
        h2 = self.fc2(h1)
        return h2
```
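For what it's worth, the scalar output is consistent with the MINE paper (Belghazi et al., 2018): it estimates mutual information via the Donsker–Varadhan lower bound, I(X;Y) >= E_{p(x,y)}[T(x,y)] - log E_{p(x)p(y)}[exp(T(x,y))], where T is an arbitrary scalar-valued "statistics network", so no probability distribution is required. A minimal sketch of the bound with a hand-written T (names here are illustrative, not from this repo):

```python
import math

def dv_bound(T, xs, ys):
    """Donsker-Varadhan lower bound on I(X;Y):
    E_{p(x,y)}[T(x,y)] - log E_{p(x)p(y)}[exp(T(x,y'))].
    T is any scalar-valued function (the 'statistics network')."""
    joint = sum(T(x, y) for x, y in zip(xs, ys)) / len(xs)
    ys_shuffled = ys[1:] + ys[:1]  # pair x with a mismatched y to mimic p(x)p(y)
    marginal = sum(math.exp(T(x, y)) for x, y in zip(xs, ys_shuffled)) / len(xs)
    return joint - math.log(marginal)

# A constant T gives a bound of 0; a T that scores matched pairs
# higher than mismatched ones gives a positive bound for dependent data.
est = dv_bound(lambda x, y: 1.0 if x == y else -1.0, [0, 1, 0, 1], [0, 1, 0, 1])
```

In MINE the hand-written T is replaced by a network like `Mine` above, trained to maximize this bound.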
baiwenjia commented 4 years ago

Anyway, it is still an interesting paper on domain adaptation, with a lot of inspiring ideas.

kevinbro96 commented 4 years ago

Yes, I ran into the same issue as you. There is no real adversarial training in the code! Given how many differences there are between the code and the paper, and that the authors never seem to respond, I suspect the paper's results are not genuine.

Z27033441 commented 3 years ago

Can I ask if anyone has managed to reproduce the code for this paper?

Z27033441 commented 3 years ago

> What is more, why are there three disentanglers? Aren't the domain-invariant, domain-specific, and class-irrelevant features supposed to be disentangled by a single disentangler, as illustrated in Figure 1?

Because one disentangler cannot disentangle all three codes.