Open tibrezus opened 3 months ago
I was thinking roughly the same thing. I'm curious to see Grok running on Petals, if possible.
Related, regarding MoE (mixture-of-experts) models: https://github.com/bigscience-workshop/petals/issues/548
It may be possible to implement a distributed model for Grok based on this unofficial transformers implementation: https://huggingface.co/keyfan/grok-1-hf
The Grok architecture and weights were just released. Does Petals support Grok and MoE models, or are there plans to support them? Having a first-in-class 314B-parameter model running on consumer hardware would be great! Thanks in advance.