Closed: Sharan1712 closed this issue 18 hours ago
Hi @Sharan1712, is this a base model with further finetuning?
Hi @terryyz, yes. I took https://huggingface.co/meta-llama/Llama-2-7b-hf, quantized it, added LoRA layers, and finetuned the LoRA layers on the hh-rlhf dataset. I am working on a small project on quantization & PEFT methods.
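For anyone following along, the adapter setup described here (a frozen, quantized base weight plus small trainable LoRA matrices) can be sketched with plain numpy. This is an illustrative sketch of the LoRA math only, not the actual training script; the function name and dimensions are made up for the example.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha, r):
    """Forward pass with a LoRA adapter: the frozen base weight W is
    augmented by a low-rank update A @ B scaled by alpha / r."""
    return x @ W + (alpha / r) * (x @ A @ B)

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 8, 2
W = rng.standard_normal((d_in, d_out))        # frozen (quantized) base weight
A = rng.standard_normal((d_in, r)) * 0.01     # trainable down-projection
B = np.zeros((r, d_out))                      # trainable up-projection, zero-init
x = rng.standard_normal((1, d_in))

# With B initialized to zero, the adapter starts as a no-op,
# so training begins from the base model's behavior.
assert np.allclose(lora_forward(x, W, A, B, alpha=16, r=r), x @ W)
```

During QLoRA finetuning only `A` and `B` receive gradients, which is why the quantized base weights can stay frozen in 4-bit.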
Thanks for the clarification! So this is an instruction-tuned model, since you used hh-rlhf. However, I noticed you didn't include a chat template in the tokenizer config, which means the evaluation will fall back to direct completion. Missing the chat template also means the model can't be evaluated on BigCodeBench-Instruct.
Do you want me to add your results to the leaderboard? Or do you just want to do the eval? If it's for the eval only, you can also do it by following the steps documented in the README :)
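To make the chat-template point concrete: the fix is to set the `chat_template` field in `tokenizer_config.json` (a Jinja template that `tokenizer.apply_chat_template` renders). The sketch below is a hypothetical pure-Python stand-in showing roughly what a Llama-2-style template renders for a list of messages; it is not the template transformers ships.

```python
def render_llama2_style(messages):
    """Illustrative Llama-2-style chat formatting: user turns are wrapped
    in [INST] ... [/INST], assistant turns are appended as completions."""
    out = ""
    for msg in messages:
        if msg["role"] == "user":
            out += f"<s>[INST] {msg['content']} [/INST]"
        elif msg["role"] == "assistant":
            out += f" {msg['content']} </s>"
    return out

prompt = render_llama2_style(
    [{"role": "user", "content": "Write a function that adds two numbers."}]
)
# Without a chat_template in the tokenizer config, the evaluator never
# builds a prompt like this and instead feeds the raw task as a completion.
```

Once the template is in the tokenizer config, instruct-style benchmarks can format prompts the way the model was finetuned to see them.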
Closed the issue for now. Feel free to reopen it if you have further questions :)
Model introduction
This model was created as an experiment for my thesis. I am testing different quantization and PEFT combinations.
Model URL
https://huggingface.co/Sharan1712/llama2_7B_hhrlhf_qlora_4bit_1d
Additional instructions (Optional)
The model was quantized, and the LoRA adapters were merged back into the base weights after finetuning.
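In peft this merge step is typically done with `merge_and_unload()` (which requires dequantizing the base weights first when they are stored in 4-bit). The numpy sketch below shows only the underlying arithmetic of folding a trained adapter into the base weight; names and shapes are illustrative.

```python
import numpy as np

def merge_lora(W, A, B, alpha, r):
    """Fold a trained LoRA adapter into the base weight:
    W_merged = W + (alpha / r) * A @ B."""
    return W + (alpha / r) * (A @ B)

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 8))   # dequantized base weight
A = rng.standard_normal((8, 2))   # trained down-projection
B = rng.standard_normal((2, 8))   # trained up-projection
W_merged = merge_lora(W, A, B, alpha=16, r=2)

x = rng.standard_normal((1, 8))
# After merging, a single matmul reproduces base + adapter output,
# so the model can be served without the extra LoRA branches.
assert np.allclose(x @ W_merged, x @ W + (16 / 2) * (x @ A @ B))
```

After merging, the checkpoint is a plain dense model, which is why the uploaded repo no longer carries separate adapter files.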
Author
Yes
Security
Integrity