huggingface/cookbook: Open-source AI Cookbook
https://huggingface.co/learn/cookbook
Apache License 2.0

Comprehensive Cookbook for Fine-tuning Gemma Model on Mental Health Assistant Dataset #43

Status: Open · sitamgithub-MSIT opened this issue 6 months ago

sitamgithub-MSIT commented 6 months ago

The cookbook aims to provide a comprehensive guide for researchers and practitioners interested in fine-tuning the Gemma model from Google on a mental health assistant dataset.

Key components of the cookbook include an introduction to the Gemma model, a description of the dataset, preprocessing steps, fine-tuning Gemma with the Hugging Face Transformers library, training procedures, and usage examples. The cookbook will also provide guidance on choosing training hyperparameters. This will help expand the available cookbook resources for leveraging Hugging Face models across different domains.

Usage Examples: illustrations of how to use the fine-tuned Gemma model for various mental health support tasks, such as response generation.
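For instance, response generation with the fine-tuned model could look roughly like the following minimal sketch; the checkpoint path `./gemma-finetuned` is a hypothetical placeholder for wherever the fine-tuned weights end up being saved:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical path to the fine-tuned checkpoint produced by the cookbook.
checkpoint = "./gemma-finetuned"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "Instruction:\nSuggest a short grounding exercise for a stressful day.\n\nResponse:\n"
inputs = tokenizer(prompt, return_tensors="pt")

# Sampling parameters here are illustrative, not tuned values.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```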

MKhalusova commented 6 months ago

Thank you for suggesting a new topic for the cookbook! A fine-tuning example for a large model can be an interesting technical notebook; however, the mental health use case is a problematic example due to the inherent ethical challenges of AI in sensitive, safety-critical domains such as medicine. I would suggest considering a different kind of data to fine-tune the model on.

sitamgithub-MSIT commented 6 months ago


Thanks for your suggestion! Sure, I will change the dataset. I am now thinking of fine-tuning the base Gemma 2B model on the Databricks Dolly dataset, the same as what the Google team showed here. Of course, that example uses KerasNLP, so I will implement the same workflow using the Hugging Face offerings. Is that okay?
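For reference, data preparation with the datasets library could look roughly like this minimal sketch; the prompt template is an assumption modeled on the instruction/response formatting of the KerasNLP example, not a copy of it:

```python
from datasets import load_dataset

# Dolly 15k instruction-tuning dataset on the Hugging Face Hub.
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")

def format_example(example):
    # Simple instruction/response template; the exact wording is an assumption.
    if example["context"]:
        text = (
            f"Instruction:\n{example['instruction']}\n\n"
            f"Context:\n{example['context']}\n\n"
            f"Response:\n{example['response']}"
        )
    else:
        text = (
            f"Instruction:\n{example['instruction']}\n\n"
            f"Response:\n{example['response']}"
        )
    return {"text": text}

dataset = dataset.map(format_example, remove_columns=dataset.column_names)
print(dataset[0]["text"][:200])
```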

MKhalusova commented 6 months ago

@sitamgithub-MSIT sounds good to me! You can do this with transformers and peft. If you'd like, you can also add quantization into the mix.
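A minimal sketch of what that setup could look like, assuming 4-bit quantization via bitsandbytes and LoRA adapters via peft (the hyperparameters are illustrative, not recommendations):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "google/gemma-2b"

# Quantize the base model to 4-bit on load (requires the bitsandbytes package).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach LoRA adapters so only a small fraction of parameters is trained.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

From there, the formatted Dolly examples can be tokenized and passed to the usual Trainer (or TRL's SFTTrainer) training loop.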

sitamgithub-MSIT commented 6 months ago


Yeah, thanks for the suggestions! I will start working on this soon.