ankanbhunia / PIDM

Person Image Synthesis via Denoising Diffusion Model (CVPR 2023)
https://ankanbhunia.github.io/PIDM
MIT License

Possible contradiction in disentangled CFG between the paper and the training code #44

Open takesukeDS opened 1 year ago

takesukeDS commented 1 year ago

Hello authors, your work is impressive. Thanks for sharing the code base.

I want to ask for clarification about your disentangled CFG. The paper states that both the pose condition and the style condition are omitted with probability 0.1. However, the code (train.py) seems to omit only the style condition. The invocation of the UNet in GaussianDiffusion.training_losses(),

model_output = model(x = torch.cat([x_t, target_pose],1), t = self._scale_timesteps(t), x_cond = img, prob = prob)

passes both target_pose (concatenated into the input x) and img (the style condition), along with prob. Although x_cond is masked with the given probability in the UNet's forward function, BeatGANsAutoencModel.forward(), the argument x is used without any modification, so the pose channels are never dropped.
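
To make my reading concrete, here is a minimal sketch of the masking behaviour as I understand it. The helper itself is hypothetical and does not exist in the repository (only img, prob, and x_cond come from the call above), and the actual operation inside BeatGANsAutoencModel.forward() may differ:

import torch

def mask_style_only(img, prob=0.1):
    # Hypothetical sketch, not code from the repository: the style
    # condition `img` (passed as x_cond) is zeroed out per sample with
    # probability `prob`, while the pose channels concatenated into `x`
    # are never touched.
    keep = (torch.rand(img.shape[0], device=img.device) >= prob).float()
    return img * keep.view(-1, 1, 1, 1)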

Could you clarify how you train your model for disentangled CFG?
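
For comparison, this hypothetical sketch shows what the paper's description suggests to me: pose and style each dropped independently with probability 0.1, so the model also sees pose-only, style-only, and fully unconditional samples, which is what a disentangled guidance rule would need at sampling time. Only the variable names (x_t, target_pose, img) come from the call above; the helper itself is not in the repository:

import torch

def drop_both_conditions(x_t, target_pose, img, p=0.1):
    # Hypothetical sketch, not code from the repository: pose and style
    # are masked independently per sample with probability p, so a batch
    # contains (pose+style), (pose only), (style only), and fully
    # unconditional examples.
    b = x_t.shape[0]
    keep_pose = (torch.rand(b, device=x_t.device) >= p).float().view(-1, 1, 1, 1)
    keep_style = (torch.rand(b, device=img.device) >= p).float().view(-1, 1, 1, 1)
    x = torch.cat([x_t, target_pose * keep_pose], dim=1)
    x_cond = img * keep_style
    return x, x_cond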

Excuse me if I overlooked something. Best regards.

YanzuoLu commented 1 year ago

Same question here.