Open bansky-cl opened 1 year ago
Hello, have you successfully reproduced the paper results? The results after my training are far from what was reported in the original paper
Sorry, I am not working on a detoxification task, so I did not look closely at the experimental results (the code does run). I was more interested in how this work achieves controllable generation, but the code is a bit confusing to me (as described in this issue), so I did not adopt this approach in the end.
Hi, thanks for your great work.
I have some questions about the code that trains the mixture of unconditional and conditional models.
As I understand it, `self.cf_ratio` below is the probability of training the unconditional model, and `1 - self.cf_ratio` is the probability of training the conditional model. (a) Is this `1 - self.cf_ratio` the $\varphi = 0.8$ mentioned in the paper?
https://github.com/D3Mlab/diffu-detox/blob/a4eaefca35806fb4ffd369534a089239442596c7/train_util.py#L211-L215
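To make question (a) concrete, here is a minimal sketch of how I read the branch. It is not the repository's code: `choose_branch` and `conditional_fraction` are hypothetical names, and the value `0.2` is an assumption chosen only so that `1 - cf_ratio` would equal 0.8; whether that matches the actual setting is exactly what I am asking.

```python
import random


def choose_branch(cf_ratio, rng):
    """Pick the training branch for one step (hypothetical helper).

    Returns "unconditional" with probability cf_ratio,
    and "conditional" with probability 1 - cf_ratio.
    """
    return "unconditional" if rng.random() < cf_ratio else "conditional"


def conditional_fraction(cf_ratio, n_steps=100_000, seed=0):
    """Empirical share of steps that fall in the conditional branch."""
    rng = random.Random(seed)
    hits = sum(choose_branch(cf_ratio, rng) == "conditional" for _ in range(n_steps))
    return hits / n_steps


if __name__ == "__main__":
    # Assumed value, not taken from the repository's configuration.
    print(conditional_fraction(0.2))
```

Under this reading, the share of conditional steps approaches `1 - cf_ratio` as the number of steps grows.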
In the micro-batch code, I am confused by the statement in the paper that "the unconditional model is trained using the non-toxic sentences sampled from the ParaDetox dataset and the additional dataset with equal probabilities." (b) The second `self.cf_ratio` confuses me, and I wonder where the code implements the equal probabilities. Please correct me if I'm wrong.
https://github.com/D3Mlab/diffu-detox/blob/a4eaefca35806fb4ffd369534a089239442596c7/train_util.py#L265
https://github.com/D3Mlab/diffu-detox/blob/a4eaefca35806fb4ffd369534a089239442596c7/train_util.py#L297
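For question (b), this is roughly what I expected "equal probabilities" to look like. It is only a sketch of my expectation, not code from the repository: `sample_unconditional_sentence` and both list arguments are hypothetical names, and the fixed `0.5` is my assumption about what the paper's sentence means.

```python
import random


def sample_unconditional_sentence(paradetox_nontoxic, extra_nontoxic, rng):
    """Draw one non-toxic sentence for the unconditional branch (hypothetical).

    The source dataset is chosen with probability 0.5 each,
    independently of cf_ratio, then a sentence is drawn uniformly from it.
    """
    source = paradetox_nontoxic if rng.random() < 0.5 else extra_nontoxic
    return rng.choice(source)


if __name__ == "__main__":
    rng = random.Random(0)
    paradetox = ["a polite sentence from ParaDetox"]      # placeholder data
    extra = ["a polite sentence from the extra dataset"]  # placeholder data
    print(sample_unconditional_sentence(paradetox, extra, rng))
```

In this sketch the choice between the two datasets does not depend on `cf_ratio` at all, which is why the second use of `self.cf_ratio` in the linked lines is unclear to me.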
(c) The model does not seem to use the `input_embs` stored in the batch when computing the losses; it only uses `net.get_emb(input_ids)` to get `input_embs`. So I think the part that updates the embeddings can be removed.
https://github.com/D3Mlab/diffu-detox/blob/a4eaefca35806fb4ffd369534a089239442596c7/train_util.py#L272-L276