Open · kk2491 opened this issue 3 months ago
@kk2491 can you please let me know which notebook you ran?
Hi @gericdong, I am using the notebook below:
model_garden_pytorch_llama2_peft_finetuning.ipynb
Thank you,
KK
@genquan9: can you please assist with this? Thank you.
If you train from a Hugging Face dataset, you can pass a name such as timdettmers/openassistant-guanaco directly.
However, if you use a JSON dataset stored in GCS, each line should use the following format:
{"input_text":"TRANSCRIPT: \nREASON FOR EVALUATION:,\n\n LABEL:","output_text":"Chiropractic"}
The team is verifying the notebook with pipelines again.
@genquan9 Thanks for the response.
I am not using a dataset from a GCP bucket.
I have created my own dataset on Hugging Face following the format of timdettmers/openassistant-guanaco; you can find the dataset here.
Thank you,
KK
@genquan9 @gericdong Sorry to bother you. Did you get a chance to look into the above issue?
Thank you, KK
Hi @kk2491, I was able to reproduce the issue. Please try again but set the evaluation_limit to 100.
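In the notebook, that would mean setting something like the following (a sketch; the exact cell layout is an assumption on my part):

# Caps how many eval-set examples are scored during tuning.
evaluation_limit = 100  # @param {type:"integer"}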
@jismailyan-google Thanks for the suggestion. Just out of curiosity, did you also try with my dataset (from here)?
Thank you,
KK
@jismailyan-google It looks like the notebook for the Vertex AI pipeline has been removed.
However, I did try fine-tuning with evaluation_limit set to 100; the error remains the same.
@genquan9 @gericdong Did you get a chance to look into the above issue?
Thank you, KK
Hi @kk2491,
I was able to get the tuning completed with your dataset. You can try this out: just replace PIPELINE_ROOT_BUCKET with your GCS bucket and SERVICE_ACCOUNT with your own. Also, please note the updated COMPILED_PIPELINE_PATH.
from google.cloud import aiplatform

# Prebuilt PEFT LLM tuner pipeline template in Artifact Registry.
COMPILED_PIPELINE_PATH = "https://us-kfp.pkg.dev/ml-pipeline/google-cloud-registry/oss-peft-llm-tuner/sha256:2e723d2eccb84d28652dd73324e0bf5dc7179f2ddb4230853cb95b0428438eb0"

pipeline_parameters = {
    "base_model_name": "Llama-2-7b",
    "dataset_name": "kk2491/test",
}

# Define and launch the Pipeline Job.
job = aiplatform.PipelineJob(
    display_name="llama2-tuner-04042024",
    template_path=COMPILED_PIPELINE_PATH,
    pipeline_root=PIPELINE_ROOT_BUCKET,
    parameter_values=pipeline_parameters,
)
job.submit(service_account=SERVICE_ACCOUNT)
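For completeness, the snippet assumes the following are defined beforehand, something like (all values are placeholders):

from google.cloud import aiplatform

PROJECT_ID = "your-project-id"                   # placeholder
REGION = "us-central1"                           # placeholder
PIPELINE_ROOT_BUCKET = "gs://your-pipeline-root" # your GCS bucket
SERVICE_ACCOUNT = "your-sa@your-project-id.iam.gserviceaccount.com"  # your service account

# Initialize the Vertex AI SDK before creating the PipelineJob.
aiplatform.init(project=PROJECT_ID, location=REGION)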
Let me know if this works.
@jismailyan-google I tried again, this time with the Vertex AI GUI (it looks like the notebook for fine-tuning with Vertex AI has been removed).
As per your comments, I don't have to change any parameters except BUCKET and SERVICE_ACCOUNT, so I tried with all default values; however, the results remain the same.
Now I am 100% sure that I am making some silly mistake here.. !!!
I am running into the same error when trying to specify a custom dataset:
# Hugging Face dataset name or gs:// URI to a custom JSONL dataset.
dataset_name = "gs://llama-fine-tuning/training_data.jsonl" # @param {type:"string"}
# Name of the dataset column containing training text input.
instruct_column_in_dataset = "text" # @param {type:"string"}
# Optional. Template name or gs:// URI to a custom template.
template = "" # @param {type:"string"}
I haven't looked, but I suspect that the image running the instruct-lora task is trying to load the gs:// URI as a Hugging Face dataset. Something like this: https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/community-content/vertex_model_garden/model_oss/peft/instruct_lora.py#L27.
I saw the following comment by @genquan9:
If you train from a Hugging Face dataset, you can pass a name such as timdettmers/openassistant-guanaco directly.
However, if you use a JSON dataset stored in GCS, each line should use the following format:
{"input_text":"TRANSCRIPT: \nREASON FOR EVALUATION:,\n\n LABEL:","output_text":"Chiropractic"}
I haven't tried this yet, but it seems that the instruct_lora task needs to account for gs:// URIs somehow. Does it?
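To make the distinction concrete, the branching I would expect looks roughly like this (a sketch using the Hugging Face datasets library; the function is mine, not the actual trainer code):

from datasets import load_dataset

def load_train_dataset(dataset_name: str):
    # Load either a Hugging Face hub dataset or a JSONL file in GCS.
    if dataset_name.startswith("gs://"):
        # Reading gs:// paths requires gcsfs; the file must be JSONL with
        # one {"input_text": ..., "output_text": ...} object per line.
        return load_dataset("json", data_files=dataset_name, split="train")
    # Otherwise treat the value as a hub dataset name,
    # e.g. "timdettmers/openassistant-guanaco".
    return load_dataset(dataset_name, split="train")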
@Joshwani-broadcom Here is how I was able to fix the error. (Worth giving a try, if you haven't already.)
Make sure every sample in your dataset follows the expected template and contains both the Human and Assistant turns of the conversation. It looks like all of your samples are getting dropped due to one of the above reasons.
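For reference, here is a quick way to check a sample (a sketch; the marker strings follow the timdettmers/openassistant-guanaco format, and the check itself is mine, not the trainer's):

# A guanaco-style sample keeps the whole conversation in a single "text" field.
sample = {"text": "### Human: What is PEFT?### Assistant: Parameter-efficient fine-tuning."}

def has_both_turns(sample: dict) -> bool:
    # Samples missing either marker would presumably be dropped.
    return "### Human:" in sample["text"] and "### Assistant:" in sample["text"]

assert has_both_turns(sample)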
You can also find more details here. By following this, I was able to fix the error and fine-tune the Llama 2 model successfully.
Kindly let me know if you face any other issues.
Thank you,
KK
Thank you @kk2491 - is it true that you are using a Hugging Face dataset? Did you ever find success using a gs:// URI in the notebook, like this?
dataset_name = "gs://llama-fine-tuning/training_data.jsonl"
Yes, initially I tried with a Hugging Face dataset and got it working. Later I migrated the same dataset to a Google Cloud Storage bucket, and it worked as expected.
Thank you,
Kk
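In case it helps, the migration was essentially an upload plus a parameter change; a minimal sketch (bucket and file names are placeholders):

from google.cloud import storage

# Upload the local JSONL file to a GCS bucket.
client = storage.Client()
bucket = client.bucket("llama-fine-tuning")
bucket.blob("training_data.jsonl").upload_from_filename("training_data.jsonl")

# Then point the notebook parameter at the uploaded object.
dataset_name = "gs://llama-fine-tuning/training_data.jsonl"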
Expected Behavior
The fine-tuning of the foundation model should complete without any issues.
Actual Behavior
The fine-tuning step gets terminated. The details are provided below:
Training framework - Google Colab
Model used - Llama2-7B
Fine-tuning method - PEFT
Number of samples in the training set - 100
Number of samples in the eval set - 20
Format of the training data - JSONL. An example sample is given below:
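{"text": "### Human: What does this transcript describe?### Assistant: A chiropractic evaluation."}
(This line is a guanaco-style placeholder for illustration, not the exact sample from the run.)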
Vertex pipeline parameters:
When I execute the training process, I get the below error:
Can you please help me understand what is going wrong here?
Steps to Reproduce the Problem
Specifications