eole-nlp / eole

Open language modeling toolkit based on PyTorch
https://eole-nlp.github.io/eole
MIT License

Encoder only work #35

Closed by vince62s 3 weeks ago

vince62s commented 1 month ago

We have already implemented encoder-only models, but it would be great to add a recipe that fits well-known models such as xlm-roberta-xl (and xxl).

xlm-roberta-large uses post-norm, so it would require an additional change to the transformer architecture.
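To make the pre-norm vs post-norm difference concrete, here is a minimal framework-free sketch. It is not eole code: `layer_norm`, `pre_norm_block`, `post_norm_block`, and `toy_sublayer` are hypothetical names, the layer norm has no learned scale or shift, and the sublayer is a stand-in for attention or the feed-forward block.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def add(a, b):
    """Element-wise sum of two equal-length vectors (the residual connection)."""
    return [u + v for u, v in zip(a, b)]


def pre_norm_block(x, sublayer):
    # Pre-norm: normalize the input, run the sublayer, then add the residual.
    return add(x, sublayer(layer_norm(x)))


def post_norm_block(x, sublayer):
    # Post-norm: run the sublayer, add the residual, then normalize the sum.
    return layer_norm(add(x, sublayer(x)))


def toy_sublayer(x):
    # Hypothetical stand-in for attention or the feed-forward network.
    return [2.0 * v for v in x]


x = [1.0, 2.0, 4.0]
pre = pre_norm_block(x, toy_sublayer)
post = post_norm_block(x, toy_sublayer)
```

The only difference is where the normalization sits relative to the residual add. In the post-norm variant the block output is always normalized, while in the pre-norm variant the residual stream is passed through unnormalized, so weights trained under one layout cannot simply be loaded into the other.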

For the two XL/XXL models, one piece of work remains: support for learned position embeddings (see also #17).
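For reference, a learned position embedding is just a trainable lookup table with one row per position, added to the token embeddings. A rough sketch, assuming nothing about how eole would implement it: `LearnedPositionEmbedding` is a hypothetical name, plain Python lists stand in for tensors, and the random init stands in for weights that would normally come from the checkpoint.

```python
import random


class LearnedPositionEmbedding:
    """Toy lookup table: one vector per position (hypothetical, not eole's API)."""

    def __init__(self, max_positions, dim, seed=0):
        rng = random.Random(seed)
        # In a real model these rows are trainable parameters loaded from the
        # checkpoint; here they are randomly initialized for illustration.
        self.table = [
            [rng.gauss(0.0, 0.02) for _ in range(dim)]
            for _ in range(max_positions)
        ]

    def __call__(self, token_embeddings):
        # token_embeddings: list of per-token vectors, in sequence order.
        if len(token_embeddings) > len(self.table):
            raise ValueError("sequence is longer than the position table")
        # Add the row for position `pos` to the token at that position.
        return [
            [t + p for t, p in zip(tok, self.table[pos])]
            for pos, tok in enumerate(token_embeddings)
        ]


pos_emb = LearnedPositionEmbedding(max_positions=4, dim=3)
out = pos_emb([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
```

Because the table has a fixed number of rows, the maximum sequence length is baked into the checkpoint, which is one practical thing a recipe for these models would have to account for.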

Supporting those two models would also make things compatible with COMET models.