teslacool / SCA

Soft Contextual Data Augmentation

use of the language model layers during inference #12

Closed nicolabertoldi closed 5 years ago

nicolabertoldi commented 5 years ago

I need a few clarifications.

Please confirm or comment on the following claims about your software:

If any of the previous claims is wrong, please explain the right process to me.

If I am totally right, I have a further question: have you ever tried to infer the translation without the source and target LM layers, i.e. using a standard Transformer? What results did you get? If you have not tried, what is your feeling about such an experiment?

nicolabertoldi commented 5 years ago

@teslacool

I looked at the code more deeply, and I think I was wrong in my second and third claims. In practice, during inference the source and target LM decoders are not used at all; instead, a standard Transformer architecture is used.

Is my new claim about inference correct?

teslacool commented 5 years ago

Yes, during inference `src_tokens_lm=None` and `prev_output_tokens_lm=None`, so this is the same as the standard architecture.
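
For reference, here is a minimal sketch (not the repository's actual code) of how such a fallback can work. Only the argument names `src_tokens_lm` and `prev_output_tokens_lm` come from this thread; the class name, the soft-embedding mixing rule, and the `lm_decoder` attribute are illustrative assumptions based on the Soft Contextual Data Augmentation idea:

```python
import torch
import torch.nn as nn


class SCAEmbeddingSketch(nn.Module):
    """Illustrative sketch: token embeddings augmented with a pretrained
    LM's soft predictions during training, reducing to a standard
    Transformer embedding lookup when the LM input is None."""

    def __init__(self, embed_tokens: nn.Embedding, lm_decoder: nn.Module = None):
        super().__init__()
        self.embed_tokens = embed_tokens  # shared with the translation model
        self.lm_decoder = lm_decoder      # hypothetical frozen pretrained LM

    def forward(self, src_tokens, src_tokens_lm=None):
        # Standard embedding lookup: (batch, seq_len, dim)
        x = self.embed_tokens(src_tokens)

        # Training path: mix in the expected embedding under the LM's
        # predicted distribution over the vocabulary (soft augmentation).
        if src_tokens_lm is not None and self.lm_decoder is not None:
            with torch.no_grad():  # LM parameters stay frozen
                lm_logits = self.lm_decoder(src_tokens_lm)  # (batch, seq_len, vocab)
            lm_probs = torch.softmax(lm_logits, dim=-1)
            x = x + lm_probs @ self.embed_tokens.weight     # (batch, seq_len, dim)

        # Inference path: src_tokens_lm is None, so the branch above is
        # skipped and this is exactly the standard architecture.
        return x
```

When the LM inputs are `None`, the augmentation branch never runs, so the forward pass is a plain embedding lookup, which matches the observation above that inference uses a standard Transformer.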

nicolabertoldi commented 5 years ago

@teslacool

thanks