dmis-lab / AdvSR

Adversarial Subword Regularization for Robust Neural Machine Translation
MIT License

About the training process of the model #2

Closed Thovenfish closed 3 years ago

Thovenfish commented 3 years ago

I am not very clear about the training process. Could you please clear up my confusion? My understanding is that the adversarial segmentation datasets are first generated as a pre-processing step, and then the NMT model is trained on that dataset. But the paper says: "Our method seeks adversarial segmentations on-the-fly, thus the model chooses the subword candidates that are vulnerable to itself according to the state of the model at each training step." So I am confused about the training process. Thank you~

JJumSSu commented 3 years ago

Sorry for the late reply.

The segmentation candidates for each token (not adversarial ones, just the possible candidates; see https://github.com/dmis-lab/AdvSR/blob/828e8460bf82454654a90a8579bbcbecafc861c1/fairseq/advsr.py#L7) are first created in a pre-processing step,

and then, among these candidates, a single segmentation is chosen on-the-fly during training using gradient signals (see https://github.com/dmis-lab/AdvSR/blob/master/fairseq/tasks/fairseq_task.py#L213-L340).
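To make the two stages concrete, here is a minimal, dependency-free sketch. It is not the code from `advsr.py` or `fairseq_task.py`: every function name below is hypothetical, the gradient is a hand-written vector standing in for a real backward pass, and the scoring rule (a first-order estimate of the loss change, computed on the mean embedding of each candidate) is an assumption made for illustration.

```python
from typing import Dict, List, Set

Vector = List[float]


def candidate_segmentations(word: str, vocab: Set[str]) -> List[List[str]]:
    """Stage 1 (pre-processing): list every split of `word` into vocab pieces."""
    if not word:
        return [[]]
    results: List[List[str]] = []
    for end in range(1, len(word) + 1):
        piece = word[:end]
        if piece in vocab:
            for rest in candidate_segmentations(word[end:], vocab):
                results.append([piece] + rest)
    return results


def mean_embedding(pieces: List[str], emb: Dict[str, Vector]) -> Vector:
    """Average the subword embeddings of one segmentation (assumed aggregation)."""
    dim = len(next(iter(emb.values())))
    total = [0.0] * dim
    for piece in pieces:
        for i, value in enumerate(emb[piece]):
            total[i] += value
    return [t / len(pieces) for t in total]


def pick_adversarial(
    current: List[str],
    candidates: List[List[str]],
    emb: Dict[str, Vector],
    grad: Vector,
) -> List[str]:
    """Stage 2 (each training step): pick the candidate with the largest
    first-order estimated loss increase, grad . (e(candidate) - e(current))."""
    base = mean_embedding(current, emb)

    def score(candidate: List[str]) -> float:
        e = mean_embedding(candidate, emb)
        return sum(g * (a - b) for g, a, b in zip(grad, e, base))

    return max(candidates, key=score)


# Toy data: a three-piece vocabulary with 2-dimensional embeddings.
vocab = {"un", "able", "unable"}
emb = {"un": [1.0, 0.0], "able": [0.0, 1.0], "unable": [-1.0, -1.0]}

# Stage 1 runs once, before training.
candidates = candidate_segmentations("unable", vocab)

# Stage 2 runs at every step; `grad` would come from the model's current state.
chosen = pick_adversarial(["unable"], candidates, emb, grad=[1.0, 1.0])
```

Because the gradient depends on the current model parameters, the same pre-computed candidate list can yield a different chosen segmentation at different training steps, which is what "on-the-fly" refers to here.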

Thank you :)

Thovenfish commented 3 years ago

Thank you very much~ The reply is very detailed and clear.