D3Mlab / diffu-detox


Problems of mixed cond and uncond training #1

Open bansky-cl opened 11 months ago

bansky-cl commented 11 months ago

Hi, thanks for your great work.

I have some questions about the code for training the mixture of unconditional and conditional models.

As I understand it, self.cf_ratio below is the probability of training the unconditional model, and 1 - self.cf_ratio is the probability of training the conditional model. (a) Is 1 - self.cf_ratio the $\varphi = 0.8$ mentioned in the paper?

https://github.com/D3Mlab/diffu-detox/blob/a4eaefca35806fb4ffd369534a089239442596c7/train_util.py#L211-L215
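To make my reading of (a) concrete, here is a minimal sketch of how I understand the mixing. It is not the repository's code: the function name choose_training_mode is hypothetical, and cf_ratio = 0.2 is my assumption, chosen only so that 1 - cf_ratio matches the 0.8 from the paper.

```python
import random


def choose_training_mode(cf_ratio, rng):
    """Return 'uncond' with probability cf_ratio, otherwise 'cond'.

    Hypothetical helper: it mirrors my reading of self.cf_ratio,
    not the actual logic in train_util.py.
    """
    return "uncond" if rng.random() < cf_ratio else "cond"


rng = random.Random(0)
cf_ratio = 0.2  # assumed value, so that 1 - cf_ratio = 0.8

# Draw many steps and measure how often the conditional branch is chosen.
modes = [choose_training_mode(cf_ratio, rng) for _ in range(10_000)]
cond_fraction = modes.count("cond") / len(modes)
print(f"fraction of conditional steps: {cond_fraction:.3f}")
```

If this reading is right, the fraction of conditional steps should come out close to 1 - cf_ratio.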

In the micro-batch, I am confused by the statement in the paper that "the unconditional model is trained using the non-toxic sentences sampled from the ParaDetox dataset and the additional dataset with equal probabilities." (b) The second use of self.cf_ratio confuses me, and I wonder where the code implements the equal probabilities. Please correct me if I'm wrong.

https://github.com/D3Mlab/diffu-detox/blob/a4eaefca35806fb4ffd369534a089239442596c7/train_util.py#L265

https://github.com/D3Mlab/diffu-detox/blob/a4eaefca35806fb4ffd369534a089239442596c7/train_util.py#L297
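For (b), this is roughly what I expected "equal probabilities" to look like. It is only a sketch of my interpretation of the sentence in the paper: the function name sample_uncond_sentence and the toy data are hypothetical and do not come from the repository.

```python
import random


def sample_uncond_sentence(paradetox_nontoxic, additional, rng):
    """Pick one of the two sources with probability 0.5 each,
    then pick a sentence uniformly from the chosen source.

    Hypothetical helper illustrating my reading of the paper.
    """
    if rng.random() < 0.5:
        return "paradetox", rng.choice(paradetox_nontoxic)
    return "additional", rng.choice(additional)


rng = random.Random(0)
# Toy stand-ins for the two datasets; deliberately different sizes.
paradetox_nontoxic = ["p0", "p1", "p2"]
additional = ["a0", "a1", "a2", "a3", "a4", "a5"]

draws = [
    sample_uncond_sentence(paradetox_nontoxic, additional, rng)
    for _ in range(10_000)
]
paradetox_fraction = sum(name == "paradetox" for name, _ in draws) / len(draws)
print(f"fraction drawn from ParaDetox: {paradetox_fraction:.3f}")
```

With this scheme each source is chosen half of the time regardless of its size, which is different from concatenating the two datasets and sampling uniformly from the result.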

(c) The model does not seem to use the input_embs stored in the batch when computing the losses; it only calls net.get_emb(input_ids) to obtain input_embs. So I think the embedding update part can be removed.

https://github.com/D3Mlab/diffu-detox/blob/a4eaefca35806fb4ffd369534a089239442596c7/train_util.py#L272-L276
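A toy illustration of the point in (c), under the assumption that my reading is correct: if the loss always recomputes the embeddings from input_ids, then whatever is stored in the batch under input_embs has no effect. The names toy_loss and embedding_table and the list-based get_emb are hypothetical stand-ins, not the repository's implementation.

```python
def get_emb(embedding_table, input_ids):
    """Look up one embedding vector per token id (stand-in for net.get_emb)."""
    return [embedding_table[i] for i in input_ids]


def toy_loss(batch, embedding_table):
    """Toy loss that ignores batch['input_embs'] and recomputes the
    embeddings from batch['input_ids'], as I believe the training code does.
    """
    embs = get_emb(embedding_table, batch["input_ids"])
    return sum(x * x for vec in embs for x in vec)


embedding_table = [[0.0, 1.0], [2.0, 0.5], [1.5, 1.5]]
# The stored input_embs are deliberately stale.
batch = {"input_ids": [0, 2], "input_embs": [[9.0, 9.0], [9.0, 9.0]]}

loss_before = toy_loss(batch, embedding_table)
# "Update embedding" step: refresh the stored embeddings in the batch.
batch["input_embs"] = get_emb(embedding_table, batch["input_ids"])
loss_after = toy_loss(batch, embedding_table)

print(loss_before == loss_after)
```

In this sketch the refresh step changes the batch but not the loss, which is why I think the update can be dropped if nothing else reads input_embs.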

cecilialeo77 commented 2 weeks ago

Hello, have you successfully reproduced the paper results? The results after my training are far from what was reported in the original paper

bansky-cl commented 1 week ago

> Hello, have you successfully reproduced the paper results? The results after my training are far from what was reported in the original paper

Sorry, I am not working on a detoxification task, so I did not pay much attention to the experimental results (the code does run). I am more interested in how this work achieves controllable generation, but the code is a bit confusing to me (as mentioned above), so in the end I did not adopt this approach.