Grok | Mixture-of-Experts | Model Support

bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

https://petals.dev

MIT License

8.89k stars 490 forks

Grok | Mixture-of-Experts | Model Support #564

Open tibrezus opened 3 months ago

tibrezus commented 3 months ago

The Grok architecture and weights were just released. Does Petals support Grok and MoE models, or are there plans to support them? Having a first-in-class 314B-parameter model running on consumer hardware would be great! Thanks in advance.
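For a sense of scale, here is a rough back-of-the-envelope sketch of what 314B parameters means for weight storage alone. Only the 314B figure comes from this thread. The precisions, the 24 GB per-device memory, and the helper names `weights_gb` and `min_devices` are illustrative assumptions, and the estimate ignores activations, KV cache, and any framework overhead.

```python
import math

PARAMS = 314e9  # parameter count quoted in this thread

# Bytes per parameter at a few common precisions (assumed for illustration).
# Weights only: activations, KV cache, and runtime overhead are ignored.
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}

DEVICE_GB = 24.0  # assumed memory of one consumer GPU, in GB (1e9 bytes)


def weights_gb(params: float, bytes_per_param: float) -> float:
    """Storage needed for the weights, in GB (1e9 bytes)."""
    return params * bytes_per_param / 1e9


def min_devices(params: float, bytes_per_param: float, device_gb: float) -> int:
    """Lower bound on devices needed just to hold the weights."""
    return math.ceil(weights_gb(params, bytes_per_param) / device_gb)


for name, nbytes in BYTES_PER_PARAM.items():
    gb = weights_gb(PARAMS, nbytes)
    n = min_devices(PARAMS, nbytes, DEVICE_GB)
    print(f"{name}: {gb:.0f} GB of weights, at least {n} devices of {DEVICE_GB:.0f} GB")
```

Under these assumptions the weights alone exceed any single consumer GPU at every listed precision, which is why splitting the model across many peers is the interesting case here.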

davidearlyoung commented 3 months ago

I was thinking roughly the same thing. I'm curious to see Grok running on Petals, if that's possible.

davidearlyoung commented 3 months ago

Related issue regarding MoE (mixture-of-experts) models: https://github.com/bigscience-workshop/petals/issues/548

RuslanPeresy commented 3 months ago

It may be possible to implement a distributed model for Grok based on this unofficial transformers implementation: https://huggingface.co/keyfan/grok-1-hf