eole-nlp / eole

Open language modeling toolkit based on PyTorch
https://eole-nlp.github.io/eole
MIT License

Add support for XLM-Roberta-XL (and XXL) conversion #41

Closed vince62s closed 4 weeks ago

vince62s commented 1 month ago

This is not full support for xlm-roberta-xl. The goal is to support the encoder part (not the lm_head / masking / classification) in order to add the estimator and build a COMET-like model. As is, the encoder output matches the Hugging Face model/code.

@funboarder13920 this breaks even more #26

vince62s commented 4 weeks ago

Just for reference: as it stands here, https://github.com/eole-nlp/eole/blob/main/eole/inputters/text_utils.py#L86-L102, I have handled only the xlm-roberta-xl case in the context of COMETKIWI.

For training, numericalize takes SRC and TGT and encodes `BOS + TGT + EOS + EOS + SRC + EOS`.

For inference, numericalize takes SRC only and encodes `EOS + SRC + EOS`, which means SRC needs to be the concatenation of `MT HYP + EOS + EOS + SRC`.
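A minimal sketch of the two token layouts described above. The function names and the special-token ids are illustrative assumptions (XLM-R conventionally uses `<s>`=0 and `</s>`=2), not eole's actual API:

```python
# Illustrative special-token ids (XLM-R: <s> = 0, </s> = 2) -- assumption, not eole's API.
BOS, EOS = 0, 2

def numericalize_train(src_ids, tgt_ids):
    # Training layout: BOS + TGT + EOS + EOS + SRC + EOS
    return [BOS] + tgt_ids + [EOS, EOS] + src_ids + [EOS]

def numericalize_infer(src_ids):
    # Inference layout: EOS + SRC + EOS
    # (SRC must already be the concatenation MT_HYP + EOS + EOS + SRC)
    return [EOS] + src_ids + [EOS]

# At inference time the caller pre-concatenates hypothesis and source:
hyp_ids, source_ids = [11, 12], [21, 22]   # dummy token ids
combined = hyp_ids + [EOS, EOS] + source_ids
ids = numericalize_infer(combined)
```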

We may want to generalize this better, but it was easier to test this use case as-is for now.