Open sadik-abd opened 1 year ago
There's bert.cpp if you want to run BERT models. But every model has architecture-specific details, so you have to wait for someone to write a layer/inference implementation for it. It's actually quite complicated and, unfortunately, undocumented.
I want to implement my custom models with ggml so that I can use them in C++. How can I do that? Is there any documentation for ggml, and how can I use it?
No, there is no "documentation"; the closest thing is the GGML examples, and they are not really simple.
I don't have access to GPT-4, but if someone who has access to the 32k version (or maybe 8k) could try to make the model infer the common steps in converting an architecture, and abstract them so that a person only has to check which layers to convert, that would be amazing.
I don't mind converting and supporting models like the upcoming falcon.cpp. What I do mind, however, is that this library is an important step toward democratizing access for everyone, and yet it doesn't have proper documentation. This project is getting bigger; @ggerganov, we need your help here. I'm not a C++ expert, but I'm willing to help if we can somehow manage to make this more "general".
Is there any way or guide to convert models like LayoutLM, RoBERTa, T5, etc., as well as my own torch models, to ggml?
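Not an official guide, but the architecture-independent part of a conversion can be sketched roughly like this: walk over the named weights and dump each one as a small header (number of dimensions, name length, shape, name) followed by the raw float32 data. Everything here is an assumption made up for illustration: the `DEMO` magic, the header layout, and the `write_tensors` / `read_tensors` helpers are hypothetical and are NOT the real ggml or GGUF file format. With a real torch model you would presumably build the input dict from `model.state_dict()`; the architecture-specific part (rebuilding the layers on the C++ side) is not covered.

```python
import io
import struct

MAGIC = b"DEMO"  # hypothetical magic bytes, not the real ggml/GGUF magic


def _numel(shape):
    """Number of elements implied by a shape tuple."""
    n = 1
    for d in shape:
        n *= d
    return n


def write_tensors(out, tensors):
    """Write {name: (shape, flat_values)} to a binary stream as float32.

    Hypothetical layout (little-endian):
      magic, tensor count, then per tensor:
      n_dims, name_len, shape[n_dims], name bytes, float32 data.
    """
    out.write(MAGIC)
    out.write(struct.pack("<I", len(tensors)))
    for name, (shape, values) in tensors.items():
        n = _numel(shape)
        if n != len(values):
            raise ValueError(f"{name}: shape {shape} does not match {len(values)} values")
        encoded = name.encode("utf-8")
        out.write(struct.pack("<II", len(shape), len(encoded)))
        out.write(struct.pack(f"<{len(shape)}I", *shape))
        out.write(encoded)
        out.write(struct.pack(f"<{n}f", *values))


def read_tensors(inp):
    """Read back a stream produced by write_tensors (what a loader would do)."""
    if inp.read(4) != MAGIC:
        raise ValueError("bad magic")
    (count,) = struct.unpack("<I", inp.read(4))
    tensors = {}
    for _ in range(count):
        n_dims, name_len = struct.unpack("<II", inp.read(8))
        shape = struct.unpack(f"<{n_dims}I", inp.read(4 * n_dims))
        name = inp.read(name_len).decode("utf-8")
        n = _numel(shape)
        values = list(struct.unpack(f"<{n}f", inp.read(4 * n)))
        tensors[name] = (shape, values)
    return tensors


if __name__ == "__main__":
    # Toy weights standing in for a real model's parameters.
    weights = {
        "fc.weight": ((2, 2), [0.5, -1.25, 2.0, 0.0]),
        "fc.bias": ((2,), [1.0, -1.0]),
    }
    buf = io.BytesIO()
    write_tensors(buf, weights)
    buf.seek(0)
    for name, (shape, values) in read_tensors(buf).items():
        print(name, shape, values)
```

The actual converters in the GGML examples follow the same general idea but use their own headers, hyperparameter blocks, and data types, so treat those scripts as the reference for the real layout.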