bigcode-project / bigcodebench

BigCodeBench: The Next Generation of HumanEval
https://bigcode-bench.github.io/
Apache License 2.0

🤗 [REQUEST] - LLaMa2 7B HHRLHF QLoRA #5

Closed Sharan1712 closed 18 hours ago

Sharan1712 commented 1 week ago

Model introduction

I (Sharan) created this model as an experiment for my thesis. I am testing different combinations of quantization and PEFT methods.

Model URL

https://huggingface.co/Sharan1712/llama2_7B_hhrlhf_qlora_4bit_1d

Additional instructions (Optional)

The model is quantized and merged with LoRA after finetuning.

Author

Yes

Security

Integrity

terryyz commented 3 days ago

Hi @Sharan1712, is this a base model but with further finetuning?

Sharan1712 commented 3 days ago

Hi @terryyz, yes. I took https://huggingface.co/meta-llama/Llama-2-7b-hf, quantized it, added LoRA layers, and finetuned those LoRA layers on the HH-RLHF dataset. I am working on a small project on quantization and PEFT methods.
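For readers unfamiliar with the "merge" step mentioned above, here is a minimal sketch of the arithmetic in the standard LoRA formulation, where the merged weight is W + (alpha / r) * B A. It is an illustration only, not the script used for this model: the helper names (`matmul`, `merge_lora`) and the toy matrices are hypothetical, and quantization is left out entirely.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Fold a LoRA update into a base weight: W + (alpha / rank) * B @ A.

    weight: out x in, lora_a: rank x in, lora_b: out x rank.
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # out x in low-rank update
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy values (hypothetical), chosen so the result is easy to check by hand.
weight = [[1.0, 0.0], [0.0, 1.0]]  # 2 x 2 base weight
lora_a = [[1.0, 2.0]]              # rank 1 x in 2
lora_b = [[0.5], [0.0]]            # out 2 x rank 1
merged = merge_lora(weight, lora_a, lora_b, alpha=2, rank=1)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time, which is why a merged checkpoint can be loaded like an ordinary model.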

terryyz commented 3 days ago

Thanks for the clarification! So I guess this is an instruction-tuned model, since you used hh-rlhf. However, I saw that you didn't include a chat template in the tokenizer config, so the evaluation would fall back to direct completion. In addition, without a chat template you can't evaluate the model on BigCodeBench-Instruct.
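As a rough illustration of the point above, the sketch below checks a tokenizer config for a chat template and picks an evaluation mode accordingly. It assumes the config has already been loaded into a dict and that the template lives under a `chat_template` key, as Hugging Face tokenizer configs commonly store it. The function names and example configs are hypothetical, and this is not BigCodeBench's actual logic.

```python
def has_chat_template(tokenizer_config: dict) -> bool:
    """Return True if the config carries a non-empty `chat_template` entry."""
    return bool(tokenizer_config.get("chat_template"))


def evaluation_mode(tokenizer_config: dict) -> str:
    """Pick a prompt style: chat if a template exists, else direct completion."""
    return "chat" if has_chat_template(tokenizer_config) else "direct completion"


# Hypothetical config contents, for illustration only.
without_template = {"bos_token": "<s>", "eos_token": "</s>"}
with_template = {
    "bos_token": "<s>",
    "eos_token": "</s>",
    "chat_template": "{% for m in messages %}{{ m['content'] }}{% endfor %}",
}

print(evaluation_mode(without_template))
print(evaluation_mode(with_template))
```

In practice the dict would come from reading the model repository's `tokenizer_config.json` with `json.load`.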

Do you want me to add your results to the leaderboard? Or do you just want to do the eval? If it's for the eval only, you can also do it by following the steps documented in the README :)

terryyz commented 18 hours ago

Closed the issue for now. Feel free to reopen it if you have further questions :)