huggingface / huggingface-llama-recipes

Issue #46 - Quantize a Llama model and serve it using llama.cpp on google colab #58

Closed by AnirudhJM24 1 month ago

AnirudhJM24 commented 1 month ago

Described the quantization process in the README.

Provided a notebook that serves the quantized model using llama.cpp on Google Colab.

AnirudhJM24 commented 1 month ago

@ariG23498 I've created a PR for #46

AnirudhJM24 commented 1 month ago

Thanks for the feedback, @Vaibhavs10!

Regarding llama-cpp-python: I will make a new notebook for it.

AnirudhJM24 commented 1 month ago

Hi @Vaibhavs10

I have added a notebook for llama-cpp-python. Please let me know if I need to make any changes or add any other functionality.

AnirudhJM24 commented 1 month ago

Hi @Vaibhavs10

I have made the requested changes

ariG23498 commented 1 month ago

Thanks for the great notebook!