laurids-reichardt opened this issue 1 year ago
No, it isn't yet. We would need to port the bigcode example over from the ggml repo. Currently we are working on GPU support for all models, which will cause some structural changes, so we should probably port the model after GPU support is implemented.
Does anyone know if this works yet? The Hugging Face model page says llm is supported, but I'm not sure after reading this thread. My other option is to use candle, but it requires a lot of boilerplate code at the moment.
@jondot Theoretically you should be able to run it with the gpt2 architecture, but I haven't tested that yet. If you want, give it a try and let me know if it works. Also make sure you are using the external tokenizer, as the integrated one will probably produce gibberish.
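A minimal invocation sketch of what "gpt2 architecture plus external tokenizer" could look like. The model filename is hypothetical, and the exact flag names (`-a`, `-m`, `-p`, `--tokenizer-repository`) are assumptions about the current `llm` CLI; check `llm infer --help` for the real option names before relying on this:

```shell
# Sketch only: flag names and the model filename are assumptions,
# verify against `llm infer --help` in your installed version.
#
# -a gpt2                  force the gpt2 architecture for the GGML file
# -m <file>                path to the local GGML model (hypothetical name)
# --tokenizer-repository   pull the external Hugging Face tokenizer instead
#                          of the one embedded in the GGML file
llm infer \
  -a gpt2 \
  -m starcoder-ggml-q4_0.bin \
  --tokenizer-repository bigcode/starcoder \
  -p "fn fibonacci(n: u64) -> u64 {"
```

If the integrated tokenizer produces gibberish as described above, the external-tokenizer option is the first thing to check.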
Running TheBloke/starchat-beta-GGML with `llm-cli` does not seem to produce any useful output.

Contents of `starchat_prompt_template.txt`:

Commit: 4b6366c3641217c96e7304edd5478b9e6743997b
OS: macOS Ventura
Processor: M1 Max 32GiB