juncongmoo / chatllama

ChatLLaMA 📢 Open source implementation for LLaMA-based ChatGPT runnable in a single GPU. 15x faster training process than ChatGPT

Support Huggingface Transformer? #3

Closed Starlento closed 1 year ago

Starlento commented 1 year ago

A few days ago, the model was published on Hugging Face, which means there is no longer any need to submit a form: https://huggingface.co/decapoda-research/llama-65b-hf And a Hugging Face Transformers implementation already exists at https://github.com/zphang/transformers/tree/llama_push so maybe it will be formally included in the package in the near future. I just wonder whether it would be time-consuming to support, or switch to, this pipeline...

Another thing is quantization. I found this repo and ran the benchmarks there, and the results are quite interesting. I am not an expert, so this is just for reference: https://github.com/qwopqwop200/GPTQ-for-LLaMa
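For a rough sense of why quantization matters for running on a single GPU, here is a back-of-the-envelope sketch of weight memory at different bit widths. It assumes that "65b" in the checkpoint name means roughly 65e9 parameters, and it counts weights only (no activations, KV cache, or quantization metadata such as scales). The helper name `weight_memory_gb` is made up for illustration and does not come from either repo.

```python
# Rough weight-only memory estimate. This ignores activations, the KV cache,
# and any per-group scales/zero points that a real quantization scheme stores.

def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Return the memory needed for the weights alone, in GB (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

# Assumption: "65b" in the checkpoint name means about 65e9 parameters.
NUM_PARAMS = 65e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights alone go from about 130 GB at 16 bits to about 32.5 GB at 4 bits. Actual usage will be higher, and whether accuracy holds up at lower bit widths is exactly what the benchmarks in that repo are meant to show.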

vo2021 commented 1 year ago

Thanks for the info, but those repos are not stable. I tried them, and none of them really works.