How to Run DocRepair for Inference?

lena-voita / good-translation-wrong-in-context

This is a repository with the data and code for the ACL 2019 paper "When a Good Translation is Wrong in Context: ..." and the EMNLP 2019 paper "Context-Aware Monolingual Repair for Neural Machine Translation"


How to Run DocRepair for Inference? #10

styfeng closed this issue 4 years ago

styfeng commented 4 years ago

Hi there,

Thanks for the code! We are attempting to use DocRepair for a non-translation problem. After we train DocRepair on our own data, how do we use the trained model for inference? For example, given a file of texts (each of which we wish to "repair"), how do we use DocRepair to generate the repaired version of each text? We have looked through the Jupyter notebooks, and they seem to be for running the translation models (e.g. CADec) rather than DocRepair.

Thanks

lena-voita commented 4 years ago

Hi!

DocRepair is trained to "repair" sentence-level translations and works in the target language. This is how you use it:
1) translate each sentence independently using a sentence-level NMT model;
2) concatenate the translations using a special separator token (the same one you used in training);
3) feed the result to DocRepair as input.

Note that DocRepair works as a standard NMT model: the only thing you have to do is give it the proper input.
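The input preparation described above can be sketched roughly as follows. This is only an illustration, not code from the repository: the function names, the `_eos` separator, and the group size of 4 are assumptions, so substitute whatever separator token and group length you used when training. `translate_sentence` is a hypothetical stand-in for your sentence-level model (for a non-translation task, it would be whatever produces each text independently).

```python
from typing import Callable, List

# Assumption: replace with the exact separator token used in your training data.
SEP = "_eos"


def build_docrepair_inputs(
    sentences: List[str],
    translate_sentence: Callable[[str], str],
    group_size: int = 4,  # assumption: match the group length used in training
    sep: str = SEP,
) -> List[str]:
    """Translate sentences one by one, then join consecutive groups with `sep`."""
    # Step 1: context-agnostic, sentence-level output for every sentence.
    translations = [translate_sentence(s).strip() for s in sentences]
    # Step 2: concatenate consecutive sentences into one input line per group.
    groups = []
    for start in range(0, len(translations), group_size):
        chunk = translations[start:start + group_size]
        groups.append(f" {sep} ".join(chunk))
    # Step 3 (not shown): feed each returned line to the trained model
    # exactly as you would feed a source sentence to a standard NMT model.
    return groups


def split_docrepair_output(line: str, sep: str = SEP) -> List[str]:
    """Split one model output line back into sentences on the separator token."""
    sentences, current = [], []
    for token in line.split():
        if token == sep:
            sentences.append(" ".join(current))
            current = []
        else:
            current.append(token)
    sentences.append(" ".join(current))
    return sentences
```

Writing the returned lines to a file, one group per line, gives an input file that can be decoded with the same procedure used for any other model in the repository. The splitting helper assumes the model reproduces the separator token between sentences in its output.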

For more details, read the paper: https://www.aclweb.org/anthology/D19-1081.pdf