Closed peremartra closed 3 months ago
This looks interesting. Could you 1) follow the steps here, and
2) come up with another inference demo? The current one gives a mathematically wrong answer, even though it follows the instruction.
Thanks @windmaple.
The model's behavior changes after alignment: in the response from the unaligned model, the instruction to return only numbers is ignored. I will adapt the inference example to use an operation that returns the correct value, or look for another example.
On the other hand, I'm not sure whether to keep the code that publishes the model on Hugging Face as an example, or if it would be better to remove it and focus the notebook solely on the DPO process.
Yes, pls keep this part of the code. We want to make it as easy as possible to publish any finetuned model.
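For readers following along, the publishing step mentioned above can be sketched as a small helper. This is a hypothetical wrapper, not the notebook's actual code: the function name and the `repo_id` value are illustrative, and it assumes you have already authenticated with `huggingface-cli login`. It relies only on the standard `push_to_hub` method that `transformers` models and tokenizers provide.

```python
def publish_finetuned_model(model, tokenizer, repo_id: str) -> str:
    """Push a fine-tuned model and its tokenizer to the Hugging Face Hub.

    Hypothetical helper: `repo_id` is something like
    "your-username/gemma-2b-it-dpo". Assumes prior `huggingface-cli login`.
    """
    model.push_to_hub(repo_id)      # uploads the weights and config
    tokenizer.push_to_hub(repo_id)  # uploads the tokenizer files
    return f"https://huggingface.co/{repo_id}"
```

Keeping both `push_to_hub` calls together ensures the published repo is usable directly with `from_pretrained`.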
Thank you for your contribution👍👍! We will try to get it featured on Google's social account soon.
Thanks to you! A pleasure!
Description of the feature request:
I want to create a new notebook for the cookbook showing how to align a Gemma model using DPO with a public dataset from Hugging Face.
I'm going to use this notebook as a base: Aligning_DPO_open_gemma-2b-it.ipynb
but with more explanations and adapting the format to meet the requirements of the cookbook.
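To ground what the notebook optimizes, here is a minimal sketch of the per-pair DPO loss in plain Python. In practice the notebook would use `trl`'s `DPOTrainer`, which computes this over token-level log-probabilities; the function below is illustrative (the name, argument names, and the default `beta` are assumptions, not code from the notebook).

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for a single preference pair (illustrative sketch).

    loss = -log sigmoid(beta * ((log pi(y_w) - log pi_ref(y_w))
                                - (log pi(y_l) - log pi_ref(y_l))))
    where y_w is the chosen response and y_l the rejected one.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) = log(1 + exp(-margin)), in a numerically stable form
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss drops as the policy assigns relatively more probability to the chosen response than the reference model does, which is exactly the preference signal the public dataset supplies.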
What problem are you trying to solve with this feature?
No response
Any other information you'd like to share?
No response