Yes, we plan to add Google Gemma soon. If you have any specific models you'd like to see added, feel free to make a request.
Regarding developer documentation, we don't have any yet, but it's on our roadmap. In the meantime, here is a quick guide for adding a model:
1. Copy the model file to `mergoo/models`.
2. Identify and replace the linear layers that can be converted to MOE/LoRA-MOE. You can use the `convert_linear_to_moe` function from `mergoo/compose_layers.py` for this purpose. Here is an example:
Original layer:

```python
self.q_proj = nn.Linear(self.hidden_size, self.num_heads * self.head_dim, bias=False)
```

MOE layer:

```python
self.q_proj = convert_linear_to_moe("q_proj", config, layer_idx, self.hidden_size, self.num_heads * self.head_dim, bias=False)
```
`name` and `layer_idx` are identifiers used to determine whether the layer should be replaced with an MOE layer; refer to the example config for more details. The layer names listed in `router_layers` are replaced by MOE layers if the layer index falls within `router_layers_index`.
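To make those two keys concrete, here is an illustrative sketch of the relevant part of such a config. The key names `router_layers` and `router_layers_index` come from the explanation above, but the values are placeholders, and I'm assuming `router_layers_index` holds a list of transformer-block indices; check the example config for the real format.

```python
# Illustrative sketch only -- values are placeholders.
config = {
    # linear-layer names eligible for MOE conversion
    "router_layers": ["q_proj", "k_proj", "v_proj"],
    # assumed: convert the layers above only in blocks 0..3
    "router_layers_index": [0, 1, 2, 3],
    # ...plus the expert/base-model keys shown in the example config
}
```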
If you are interested, you can find the implementations of MOE and LORA MOE layers here.
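If it helps to build intuition before reading that code, here is a minimal, self-contained sketch of what a dense MOE linear layer does conceptually: a small router produces per-token weights over several expert `nn.Linear` layers, and the output is the weighted mix. The class and all names here are hypothetical, not mergoo's actual implementation.

```python
import torch
import torch.nn as nn

class MoELinearSketch(nn.Module):
    """Hypothetical dense MOE linear layer, for intuition only."""

    def __init__(self, in_features, out_features, num_experts=4, bias=False):
        super().__init__()
        # router scores each token against every expert
        self.router = nn.Linear(in_features, num_experts, bias=False)
        # one independent linear "expert" per slot
        self.experts = nn.ModuleList(
            nn.Linear(in_features, out_features, bias=bias)
            for _ in range(num_experts)
        )

    def forward(self, x):
        # (batch, seq, num_experts): soft routing weights per token
        weights = torch.softmax(self.router(x), dim=-1)
        # (batch, seq, num_experts, out_features): every expert's output
        outputs = torch.stack([expert(x) for expert in self.experts], dim=-2)
        # weighted sum over the expert dimension
        return (weights.unsqueeze(-1) * outputs).sum(dim=-2)

# usage: y = MoELinearSketch(768, 768)(torch.randn(2, 16, 768))
```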
@abhinav-kashyap-asus, thank you for your interest in mergoo! We're looking forward to your pull request! 🙃
Thanks for the reply :) I'll try adding a model other than Gemma. Are there any other models in your pipeline that I could try to add?
Here is a superset of the models supported by Hugging Face; feel free to try any of these: https://github.com/huggingface/transformers/tree/main/src/transformers/models
Hi everyone. Thanks for the wonderful work here. I haven't tried it out yet, but I'm excited about it.
Right now I see that certain models are supported. Are there plans to add more? And are there any developer docs one can follow to make adding new models easier? That would be helpful.
Thank you