Yes, we plan to add Google Gemma soon. If you have any specific models you'd like to see added, feel free to make a request.
Regarding developer documentation, we don't have any yet, but it's on our roadmap. In the meantime, here is a quick guide for adding a model:
1. Copy the model file to `mergoo/models`.
2. Identify and replace the linear layers that can be converted to MOE/LoRA-MOE. You can use the `convert_linear_to_moe` function from `mergoo/compose_layers.py` for this purpose. Here is an example:
Original layer:

```python
self.q_proj = nn.Linear(self.hidden_size, self.num_heads * self.head_dim, bias=False)
```

MOE layer:

```python
self.q_proj = convert_linear_to_moe("q_proj", config, layer_idx, self.hidden_size, self.num_heads * self.head_dim, bias=False)
```
`name` and `layer_idx` are identifiers used to determine whether the layer should be replaced with an MOE layer; refer to the example config for more details. The layer names listed in `router_layers` are replaced by MOE layers if the layer index falls within `router_layers_index`.
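To make those two keys concrete, here is an illustrative sketch of the relevant part of such a config. The key names `router_layers` and `router_layers_index` come from the explanation above, but the values are placeholders, and I'm assuming `router_layers_index` holds a list of transformer-block indices; check the example config for the real format.

```python
# Illustrative sketch only -- values are placeholders.
config = {
    # linear-layer names eligible for MOE conversion
    "router_layers": ["q_proj", "k_proj", "v_proj"],
    # assumed: convert the layers above only in blocks 0..3
    "router_layers_index": [0, 1, 2, 3],
    # ...plus the expert/base-model keys shown in the example config
}
```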
If you are interested, you can find the implementations of MOE and LORA MOE layers here.
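If it helps to build intuition before reading that code, here is a minimal, self-contained sketch of what a dense MOE linear layer does conceptually: a small router produces per-token weights over several expert `nn.Linear` layers, and the output is the weighted mix. The class and all names here are hypothetical, not mergoo's actual implementation.

```python
import torch
import torch.nn as nn

class MoELinearSketch(nn.Module):
    """Hypothetical dense MOE linear layer, for intuition only."""

    def __init__(self, in_features, out_features, num_experts=4, bias=False):
        super().__init__()
        # router scores each token against every expert
        self.router = nn.Linear(in_features, num_experts, bias=False)
        # one independent linear "expert" per slot
        self.experts = nn.ModuleList(
            nn.Linear(in_features, out_features, bias=bias)
            for _ in range(num_experts)
        )

    def forward(self, x):
        # (batch, seq, num_experts): soft routing weights per token
        weights = torch.softmax(self.router(x), dim=-1)
        # (batch, seq, num_experts, out_features): every expert's output
        outputs = torch.stack([expert(x) for expert in self.experts], dim=-2)
        # weighted sum over the expert dimension
        return (weights.unsqueeze(-1) * outputs).sum(dim=-2)

# usage: y = MoELinearSketch(768, 768)(torch.randn(2, 16, 768))
```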
@abhinav-kashyap-asus, thank you for your interest in mergoo! We're looking forward to your pull request! 🙃
Thanks for the reply :) I'll try adding a model other than Gemma. Are there any other models in your pipeline that I could try to add?
Here is a superset of the models supported by Hugging Face; feel free to try any of these: https://github.com/huggingface/transformers/tree/main/src/transformers/models
Hi everyone. Thanks for the wonderful work here. I haven't tried it out yet, but I'm excited about it.
Right now I see that certain models are supported. Are there plans to add more? And are there any developer docs one can follow to make adding new models easier? That would be helpful.
Thank you