Open damodharanj opened 10 months ago
The model is pretty amazing, and thanks a lot for open-sourcing it. Is there a way to size it down and run it on hardware like Apple Silicon using ggml? Would this improve the inference times? On my Apple M2 it takes 12 seconds to translate a single sentence. If you can guide me on how to do this, I would be willing to help!

Hi @damodharanj,

Yes, this should improve the inference time. However, it would require you to write the model definitions in C++ similar to llama.cpp and convert the weights to ggml.

Currently, we don't have the bandwidth, experience, or hardware resources to help you port the models to ggml. Please let us know if there is any progress on this thread.

Thanks a lot for the response! Sure, I will post an update here once I take this up.
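As a rough illustration of what "writing the model definitions in C++ and converting to ggml" involves, below is a minimal sketch of a single linear layer (y = Wx + b) evaluated with ggml's CPU graph API. This is not the actual port, just an assumed starting point: it uses a recent ggml revision (function names such as `ggml_new_graph` and `ggml_graph_compute_with_ctx` have shifted between versions), and a real port would define every layer of the translation model this way and load converted (f16/quantized) weights instead of filling tensors with constants.

```cpp
#include "ggml.h"
#include <cstdio>

int main() {
    // Reserve a small memory pool for tensors and the compute graph.
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16 * 1024 * 1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    const int n_in  = 4;
    const int n_out = 3;

    // Model parameters; a real port would load these from a converted checkpoint.
    struct ggml_tensor * W = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, n_in, n_out);
    struct ggml_tensor * b = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, n_out);
    struct ggml_tensor * x = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, n_in);

    // Placeholder values, just so the example runs end to end.
    ggml_set_f32(W, 0.1f);
    ggml_set_f32(b, 1.0f);
    ggml_set_f32(x, 2.0f);

    // One "layer" of the model definition: y = W x + b.
    struct ggml_tensor * y = ggml_add(ctx, ggml_mul_mat(ctx, W, x), b);

    // Build the compute graph and run it on the CPU.
    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, y);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/4);

    for (int i = 0; i < n_out; ++i) {
        printf("y[%d] = %f\n", i, ggml_get_f32_1d(y, i));
    }

    ggml_free(ctx);
    return 0;
}
```

The sketch assumes you build against the ggml library (github.com/ggerganov/ggml); llama.cpp follows the same pattern at a much larger scale, with the weight conversion handled by a separate Python script that writes the tensors into a ggml-readable file.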