meta-llama / PurpleLlama

Set of tools to assess and improve LLM security.

Fine tuning for additional policies LlamaGuard #16

harindermashiana closed this issue 7 months ago

harindermashiana commented 8 months ago

Can you please share some details about fine-tuning LlamaGuard for additional categories? Specifically, each time we add a category to the existing model, do we need to fine-tune on the old dataset (old categories) plus the new dataset (new category), or only on the new dataset?

Based on your answer, can you also share a few bullet points or steps for the fine-tuning, and which files to reference?

Thanks,

ujjwalkarn commented 7 months ago

Hi, you can fine-tune on just the new dataset with a low learning rate (2e-6 or even lower). For fine-tuning, any Llama fine-tuning recipe (for example, from Hugging Face) should work. Please let us know if you have additional questions!
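
As a rough illustration of the advice above, here is a minimal sketch of how the low learning rate could be wired into a Hugging Face recipe. Only the learning rate (2e-6) comes from this thread. The helper names `build_training_kwargs` and `make_training_args` are hypothetical, and every other hyperparameter is a placeholder assumption to tune for your own data, not a recommended setting.

```python
# Learning rate suggested in the reply above (2e-6 or even lower).
LEARNING_RATE = 2e-6


def build_training_kwargs(output_dir, learning_rate=LEARNING_RATE):
    """Return keyword arguments intended for transformers.TrainingArguments.

    Only learning_rate is taken from the thread. The remaining values are
    illustrative assumptions and should be tuned for the new-category dataset.
    """
    return {
        "output_dir": output_dir,
        "learning_rate": learning_rate,
        "num_train_epochs": 1,              # assumption
        "per_device_train_batch_size": 1,   # assumption
        "gradient_accumulation_steps": 8,   # assumption
        "logging_steps": 10,                # assumption
    }


def make_training_args(output_dir):
    """Build a TrainingArguments object (hypothetical helper)."""
    # Imported here so the rest of the sketch runs without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(**build_training_kwargs(output_dir))


if __name__ == "__main__":
    print(build_training_kwargs("llamaguard-new-category"))
```

The resulting `TrainingArguments` would then be passed to whichever trainer your chosen recipe uses, together with a dataset containing only the new-category examples.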