cvlab-columbia / CT4Recognition


questions about implementation details #2

Open yuanhangtangle opened 1 year ago

yuanhangtangle commented 1 year ago

Thank you for your repo. I ran into some problems while reading the code and the paper.

Q1. In fd_waterbird.py, I can't find the code corresponding to $P(R|X)$. It seems that you are simply dropping some elements with torch.distributions.binomial.Binomial rather than actually modeling $P(R|X)$. What, then, is $P(R|X)$ used for?

Q2. The code following the drop mentioned in Q1 is quite confusing:

for jj in range(args.samples - 1):
    if add_n:
        binomial = torch.distributions.binomial.Binomial(
            probs=1 - p)
        fea = feature * binomial.sample(
            feature.size()).cuda() * (1.0 / (1 - p))
    else:
        fea = feature
    logit_compose = logit_compose + classifier(
        fea, Xp[j * bs_m:(j + 1) * bs_m, :, :, :])  # TODO:

It seems that logit_compose = logit_compose + classifier(fea, Xp[j * bs_m:(j + 1) * bs_m, :, :, :]) is run multiple times without any modification.

Q3. I can't find the variables representing N_i and N_j described in the paper (5.3. Experimental Settings: "We set Nj = 256 and Ni = 10 for all experiments and denotes it as Ours"). It seems that the code mentioned in Q1 and Q2 is the key to the implementation, but I don't know how it relates to the front-door formula derived in the paper.

Looking forward to your reply. Thanks in advance.

ChengzhiCU commented 1 year ago

Thank you for your questions.

Q1: P(R|X) is a neural network. It can be a pretrained VAE, in which case R is the VAE's latent space. For ResNet, R is the feature from the second-to-last layer. Since that R is deterministic, we add random noise to it to make it "probabilistic", although ResNet gives no probabilistic guarantee. A VAE is a probabilistic model, so with a VAE R is precise.
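
To illustrate the noise step, here is a minimal framework-free sketch, not the code from fd_waterbird.py. It mimics the Binomial masking in the quoted snippet: each feature element is dropped with probability p and the kept elements are rescaled by 1 / (1 - p), so the noisy feature equals the deterministic one in expectation. The function name `noisy_feature`, the feature values, and p = 0.5 are assumptions made for the example.

```python
import random

def noisy_feature(feature, p, rng):
    """Return one random sample of `feature` with each element dropped with probability p.

    Kept elements are rescaled by 1 / (1 - p), so the expected value of the
    noisy feature equals the original deterministic feature.
    """
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in feature]

rng = random.Random(0)
feature = [2.0, -1.0, 0.5]  # hypothetical deterministic feature R for one input
p = 0.5                     # hypothetical drop probability

# Draw many noisy copies and average them element-wise.
samples = [noisy_feature(feature, p, rng) for _ in range(20000)]
mean = [sum(column) / len(samples) for column in zip(*samples)]
print(mean)  # close to the original feature
```

Each individual sample differs from the original feature, which is what makes R behave like a random variable, while the rescaling keeps the average unchanged.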

Q2, Q3: j is changing, which corresponds to N_j. jj performs the Monte Carlo sampling on P(R|X), which corresponds to N_i. Marginalizing over j (N_j) is important; it is a key factor in the front-door formula.
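
For readers mapping the two loops to the formula: the front-door adjustment is $P(Y|do(X=x)) = \sum_r P(r|x) \sum_{x'} P(Y|r,x') P(x')$, and both sums can be approximated by sampling. The sketch below is a toy, framework-free illustration of that double loop, not the repository's code. `toy_classifier`, `front_door_scores`, and the data are hypothetical, and averaging raw scores (the quoted snippet accumulates logits) is an assumption of the sketch.

```python
import random

def noisy_feature(feature, p, rng):
    """Drop each element with probability p and rescale the rest by 1 / (1 - p)."""
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in feature]

def toy_classifier(r, x_prime):
    """Hypothetical stand-in for classifier(fea, Xp[...]); returns two class scores."""
    s = sum(r)
    c = sum(x_prime)
    return [s + 0.1 * c, -s + 0.1 * c]

def front_door_scores(feature, train_inputs, n_j, n_i, p, rng):
    """Average classifier scores over n_j sampled x' and n_i noisy copies of R."""
    total = [0.0, 0.0]
    for _ in range(n_j):                  # outer loop over x' ~ P(X): N_j
        x_prime = rng.choice(train_inputs)
        for _ in range(n_i):              # inner loop over r ~ P(R|X=x): N_i
            r = noisy_feature(feature, p, rng)
            scores = toy_classifier(r, x_prime)
            total = [t + s for t, s in zip(total, scores)]
    return [t / (n_j * n_i) for t in total]

rng = random.Random(0)
feature = [1.0, 0.5]                                    # hypothetical R for the test input
train_inputs = [[0.0, 1.0], [2.0, -1.0], [1.0, 1.0]]    # hypothetical training inputs
scores = front_door_scores(feature, train_inputs, n_j=256, n_i=10, p=0.5, rng=rng)
prediction = max(range(len(scores)), key=scores.__getitem__)
print(scores, prediction)
```

The values n_j=256 and n_i=10 follow the settings quoted from the paper. The outer loop marginalizes over other inputs x', and the inner loop is the Monte Carlo sampling of R.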

Let me know if you have other questions.