eole-nlp / eole

Open language modeling toolkit based on PyTorch
https://eole-nlp.github.io/eole
MIT License

Add support for XLM-Roberta-XL (and XXL) conversion #41

Closed vince62s closed 4 weeks ago

vince62s commented 1 month ago

This is not full support for xlm-roberta-xl. The goal is to support the encoder part (not the lm_head / masking / classification) in order to add the estimator and build a COMET-like model. As is, the encoder output matches the Hugging Face model/code.

@funboarder13920 this breaks even more #26

vince62s commented 4 weeks ago

Just for reference: as it stands here, https://github.com/eole-nlp/eole/blob/main/eole/inputters/text_utils.py#L86-L102, I have handled only the xlm-roberta-xl case in the context of COMETKIWI.

For training, numericalize takes SRC and TGT and encodes `BOS + TGT + EOS + EOS + SRC + EOS`.

For inference, numericalize takes SRC only and encodes `EOS + SRC + EOS`, which means SRC needs to be the concatenation of `MT HYP + EOS + EOS + SRC`.
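A minimal sketch of the two token layouts described above. The function names and the special-token ids are illustrative assumptions (XLM-R conventionally uses `<s>`=0 and `</s>`=2), not eole's actual API:

```python
# Illustrative special-token ids (XLM-R: <s> = 0, </s> = 2) -- assumption, not eole's API.
BOS, EOS = 0, 2

def numericalize_train(src_ids, tgt_ids):
    # Training layout: BOS + TGT + EOS + EOS + SRC + EOS
    return [BOS] + tgt_ids + [EOS, EOS] + src_ids + [EOS]

def numericalize_infer(src_ids):
    # Inference layout: EOS + SRC + EOS
    # (SRC must already be the concatenation MT_HYP + EOS + EOS + SRC)
    return [EOS] + src_ids + [EOS]

# At inference time the caller pre-concatenates hypothesis and source:
hyp_ids, source_ids = [11, 12], [21, 22]   # dummy token ids
combined = hyp_ids + [EOS, EOS] + source_ids
ids = numericalize_infer(combined)
```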

We may want to generalize this better, but it was easier to test this use case as-is for now.