HumzaSami00 opened this issue 10 months ago · Open
@HumzaSami00 I suggest giving single-GPU training a try as well. Also, per the QLoRA paper, adapting all linear layers instead of only the attention linear layers should give better quality.
python llama_finetuning.py --use_peft --peft_method lora --quantization --model_name /patht_of_model_folder/7B --output_dir Path/to/save/PEFT/model
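If you want LoRA on all linear layers, one way is to widen target_modules in the PEFT config. A minimal sketch, assuming the standard HF Llama/CodeLlama module names (the r/alpha values are just illustrative):

```python
from peft import LoraConfig

# LoRA over every linear projection in each transformer block,
# not only the attention projections, as suggested by the QLoRA paper.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
)
```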
Also, to use the HF trainer, make sure to set AutoTokenizer in the code above, since CodeLlama does not use LlamaTokenizer. Make sure you also have transformers installed from source: pip install git+https://github.com/huggingface/transformers
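For illustration, a minimal sketch of the tokenizer change (the hub id below is just one possibility; a local converted checkpoint path also works):

```python
from transformers import AutoTokenizer

# AutoTokenizer resolves the right tokenizer class for CodeLlama,
# which, as noted above, is not LlamaTokenizer.
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")
tokenizer.pad_token = tokenizer.eos_token  # common choice if no pad token is set
```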
@HamidShojanazeri, thanks for your response. I have a few questions.
1. How can I pass my custom dataset to llama_finetuning.py? There is no argument for a custom dataset in this script. Do I have to edit the script manually?
2. In llama_finetuning.py, LlamaTokenizer is used as the tokenizer — should I replace it with AutoTokenizer as you suggested?

Edit: I tried the following command and got this error. According to llama_finetuning.py it seems to accept a Hugging Face model, but according to the README we can also pass a path to the downloaded model.
Input:
python llama-recipes/llama_finetuning.py --use_peft --peft_method lora --quantization --model_name ./CodeLlama-7b/ --output_dir result
Output:
OSError: ./CodeLlama-7b-Instruct/ does not appear to have a file named config.json. Checkout 'https://huggingface.co/./CodeLlama-7b-Instruct//main' for available files.
My downloaded model has these files in the folder:
I am not using an HF model. I downloaded the model with the download.sh script from the GitHub repo.
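The missing config.json typically means the script is being pointed at the raw Meta-format download rather than a Hugging Face-format checkpoint. One possible workaround, assuming the conversion script that ships with transformers (the paths below are placeholders), is to convert the download first and pass the converted folder as --model_name:

python src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir ./CodeLlama-7b-Instruct --model_size 7B --output_dir ./CodeLlama-7b-Instruct-hf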
Hi! Please read this document on how to fine-tune Llama using custom data. Let me know if you have more questions!
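In case it helps, a minimal sketch of the custom-dataset hook that document describes: the recipes expect a Python file exposing get_custom_dataset(dataset_config, tokenizer, split). Everything else below (file names, fields, max length) is hypothetical and should be adapted to your chat data:

```python
# my_chat_dataset.py -- hypothetical custom dataset module for llama-recipes.
import datasets

def get_custom_dataset(dataset_config, tokenizer, split):
    # "chat_data.json" is a placeholder for your own 1000-example chat file.
    dataset = datasets.load_dataset("json", data_files="chat_data.json", split="train")

    def tokenize(sample):
        # Flatten one chat example into a single training string;
        # adapt the field names to your dataset's actual structure.
        text = sample["prompt"] + sample["response"] + tokenizer.eos_token
        return tokenizer(text, truncation=True, max_length=1024)

    return dataset.map(tokenize, remove_columns=dataset.column_names)
```

It would then be passed on the command line with something like --dataset custom_dataset --custom_dataset.file "my_chat_dataset.py".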
I hope this message finds you well. I recently had the opportunity to experiment with the CodeLlama-7b-Instruct model from the GitHub repository and was pleased to observe its promising performance. Encouraged by these initial results, I would like to fine-tune this model on my proprietary code chat dataset. I have a single 3090 with 24 GB of VRAM.
To provide you with more context, my dataset has the following structure:
I have a total of 1000 such chat examples in my dataset.
Could you kindly guide me through the recommended pipeline or steps to effectively fine-tune the Codellama-7b-Instruct model on my specific chat dataset? I look forward to your guidance.
EDIT
I followed this pipeline, but it's giving me the following error:
ERROR