Open fym0503 opened 2 years ago
Hi,
It's true, but as long as there are no pretrained weights, the chances are small that the models get added. There are some open-source implementations of MoE available.
Once there are some pretrained weights available somewhere, be sure to let us know!
Hi, I found a very recent implementation of MoE with code and pretrained weights, for your reference: https://github.com/pytorch/fairseq/tree/main/examples/moe_lm
Indeed, hot off the press! Let's add them!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Any progress on this front? Thanks!
cc'ing @patil-suraj here
Hey @cerisara! We plan to add moe_lm to Transformers, but I don't have much bandwidth to work on it. If you or anyone else in the community is interested in adding it, I would be more than happy to help :)
Hi @patil-suraj, thanks, I'm interested in these models and would like to contribute, but I'm afraid my bandwidth is too small as well, at least for now, sorry ;-)
Hi, I find that there are emerging works in NLP on Mixture-of-Experts-based models, such as Switch Transformers from Google. However, I don't see such Mixture-of-Experts models in Hugging Face Transformers. Do you plan to support them? Thanks!
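For anyone landing on this thread: the core idea behind Switch Transformers is top-1 ("switch") routing, where a learned router sends each token to exactly one expert. A rough, dependency-free sketch of that routing step (all names here are illustrative, not the Transformers or fairseq API):

```python
# Minimal sketch of top-1 ("switch") MoE routing.
# Names (switch_route, router_weights, experts) are illustrative only.

def switch_route(token, router_weights, experts):
    """Route one token vector to the expert with the highest router score."""
    # Router score for each expert: dot product of the token with that
    # expert's row of router weights.
    scores = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    best = max(range(len(experts)), key=lambda i: scores[i])
    # Only the chosen expert runs, which is what makes MoE models cheap
    # at inference despite their large total parameter count.
    return experts[best](token), best

# Two toy "experts": simple elementwise transforms standing in for FFN blocks.
experts = [
    lambda t: [2 * x for x in t],   # expert 0: doubles the token
    lambda t: [-x for x in t],      # expert 1: negates the token
]

# Toy router: favours expert 0 for positive tokens, expert 1 for negative ones.
router_weights = [
    [1.0, 1.0],
    [-1.0, -1.0],
]

out, chosen = switch_route([1.0, 2.0], router_weights, experts)
```

In a real Switch Transformer the router is a learned linear layer with a softmax over expert scores, plus a load-balancing auxiliary loss so tokens spread across experts; this sketch only shows the argmax dispatch.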