Closed: Sharan1712 closed this issue 18 hours ago
Hi @Sharan1712, is this a base model with further finetuning?
Hi @terryyz, yes. I took https://huggingface.co/meta-llama/Llama-2-7b-hf, quantized it, added LoRA layers, and finetuned the LoRA layers on the hh-rlhf dataset. I am working on a small project on quantization & PEFT methods.
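For anyone following along, the adapter setup described here (a frozen, quantized base weight plus small trainable LoRA matrices) can be sketched with plain numpy. This is an illustrative sketch of the LoRA math only, not the actual training script; the function name and dimensions are made up for the example.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha, r):
    """Forward pass with a LoRA adapter: the frozen base weight W is
    augmented by a low-rank update A @ B scaled by alpha / r."""
    return x @ W + (alpha / r) * (x @ A @ B)

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 8, 2
W = rng.standard_normal((d_in, d_out))        # frozen (quantized) base weight
A = rng.standard_normal((d_in, r)) * 0.01     # trainable down-projection
B = np.zeros((r, d_out))                      # trainable up-projection, zero-init
x = rng.standard_normal((1, d_in))

# With B initialized to zero, the adapter starts as a no-op,
# so training begins from the base model's behavior.
assert np.allclose(lora_forward(x, W, A, B, alpha=16, r=r), x @ W)
```

During QLoRA finetuning only `A` and `B` receive gradients, which is why the quantized base weights can stay frozen in 4-bit.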
Thanks for the clarification! So this is an instruction-tuned model, since you used hh-rlhf. However, I noticed you didn't include a chat template in the tokenizer config, which means the evaluation will fall back to direct completion. Missing the chat template also means the model can't be evaluated on BigCodeBench-Instruct.
Do you want me to add your results to the leaderboard? Or do you just want to do the eval? If it's for the eval only, you can also do it by following the steps documented in the README :)
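To make the chat-template point concrete: the fix is to set the `chat_template` field in `tokenizer_config.json` (a Jinja template that `tokenizer.apply_chat_template` renders). The sketch below is a hypothetical pure-Python stand-in showing roughly what a Llama-2-style template renders for a list of messages; it is not the template transformers ships.

```python
def render_llama2_style(messages):
    """Illustrative Llama-2-style chat formatting: user turns are wrapped
    in [INST] ... [/INST], assistant turns are appended as completions."""
    out = ""
    for msg in messages:
        if msg["role"] == "user":
            out += f"<s>[INST] {msg['content']} [/INST]"
        elif msg["role"] == "assistant":
            out += f" {msg['content']} </s>"
    return out

prompt = render_llama2_style(
    [{"role": "user", "content": "Write a function that adds two numbers."}]
)
# Without a chat_template in the tokenizer config, the evaluator never
# builds a prompt like this and instead feeds the raw task as a completion.
```

Once the template is in the tokenizer config, instruct-style benchmarks can format prompts the way the model was finetuned to see them.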
Closed the issue for now. Feel free to reopen it if you have further questions :)
Model introduction
This model was created as an experiment for my thesis. I am testing different quantization and PEFT combinations.
Model URL
https://huggingface.co/Sharan1712/llama2_7B_hhrlhf_qlora_4bit_1d
Additional instructions (Optional)
The model was quantized, and the LoRA adapters were merged back into the base weights after finetuning.
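In peft this merge step is typically done with `merge_and_unload()` (which requires dequantizing the base weights first when they are stored in 4-bit). The numpy sketch below shows only the underlying arithmetic of folding a trained adapter into the base weight; names and shapes are illustrative.

```python
import numpy as np

def merge_lora(W, A, B, alpha, r):
    """Fold a trained LoRA adapter into the base weight:
    W_merged = W + (alpha / r) * A @ B."""
    return W + (alpha / r) * (A @ B)

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 8))   # dequantized base weight
A = rng.standard_normal((8, 2))   # trained down-projection
B = rng.standard_normal((2, 8))   # trained up-projection
W_merged = merge_lora(W, A, B, alpha=16, r=2)

x = rng.standard_normal((1, 8))
# After merging, a single matmul reproduces base + adapter output,
# so the model can be served without the extra LoRA branches.
assert np.allclose(x @ W_merged, x @ W + (16 / 2) * (x @ A @ B))
```

After merging, the checkpoint is a plain dense model, which is why the uploaded repo no longer carries separate adapter files.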
Author
Yes
Security
Integrity