facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

XLM model size #1502

Closed NProkoptsev closed 4 years ago

NProkoptsev commented 4 years ago

https://github.com/pytorch/fairseq/tree/master/examples/xlmr

xlm.base.tar.gz weighs 2.4 GB, while xlm.large.tar.gz weighs only 900 MB. It seems that the base model checkpoint contains unnecessary optimizer state.

myleott commented 4 years ago

Nice catch, I replaced it with the stripped version (without optimizer state).
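
For readers hitting the same problem with their own checkpoints, here is a minimal sketch of what "stripping" could look like. It assumes the checkpoint is a plain dictionary and that the optimizer state lives under a top-level key named `last_optimizer_state`; that key name and the helper `strip_optimizer_state` are assumptions for illustration, not something confirmed in this thread. Optimizers such as Adam keep extra per-parameter buffers, which is why a checkpoint that includes optimizer state can be several times larger than the weights alone.

```python
# Keys assumed to hold optimizer state. "last_optimizer_state" is an
# assumption about the checkpoint layout; adjust it to match your file.
OPTIMIZER_KEYS = ("last_optimizer_state",)


def strip_optimizer_state(checkpoint, keys=OPTIMIZER_KEYS):
    """Return a shallow copy of `checkpoint` without the given top-level keys."""
    return {k: v for k, v in checkpoint.items() if k not in keys}


# Toy stand-in for a loaded checkpoint dictionary (not real model data).
toy = {
    "model": {"w": [0.1, 0.2]},
    "last_optimizer_state": {"exp_avg": [0.0, 0.0], "exp_avg_sq": [0.0, 0.0]},
}

stripped = strip_optimizer_state(toy)
print(sorted(stripped))  # the remaining top-level keys
```

With a real PyTorch checkpoint, the dictionary would typically come from `torch.load(path, map_location="cpu")` and the stripped copy would be written back with `torch.save`. Inspect the top-level keys first, since the layout depends on the toolkit version that produced the file.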