google-gemini / generative-ai-python

The official Python library for the Google Gemini API
https://pypi.org/project/google-generativeai/
Apache License 2.0

Gemini 1.5 Flash Supervised Fine Tuning Updates #528

Open bgrove-s7 opened 2 months ago

bgrove-s7 commented 2 months ago

Description of the feature request:

Increase the character limit on tuning examples so that tuning jobs can take advantage of Gemini 1.5 Flash's 1,000,000-token context window.

What problem are you trying to solve with this feature?

Gemini 1.5 Flash has a very large context window, which potentially makes it ideal for extracting needles from haystacks of text. We would like to fine-tune Gemini 1.5 Flash to perform this task for us. Tuning is appropriate because the content we typically examine runs to hundreds of thousands of tokens, which leaves no room for multi-shot prompting techniques.

Any other information you'd like to share?

No response

MarkDaoust commented 2 months ago

What is the current limit?

bgrove-s7 commented 2 months ago

It appears to be 40,000 characters, per the following error:

CreateTunedModelRequest.tuned_model.tuning_task.training_data.examples.examples[9].text_input: text_input is too long. The maximum character count accepted is 40000.
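
Until the limit changes, one way to catch this before submitting a tuning job is to check every example locally. The snippet below is a minimal sketch, not part of the library: it assumes the training data is a list of dicts with `text_input` and `output` keys, takes the 40,000 figure from the error message above, and uses a hypothetical helper name, `find_oversized_examples`. It counts Python string length, which may not match exactly how the service counts characters.

```python
# Limit taken from the error message above; the service may change it.
MAX_TEXT_INPUT_CHARS = 40_000


def find_oversized_examples(examples, limit=MAX_TEXT_INPUT_CHARS):
    """Return (index, length) pairs for examples whose text_input exceeds limit."""
    oversized = []
    for index, example in enumerate(examples):
        length = len(example["text_input"])
        if length > limit:
            oversized.append((index, length))
    return oversized


# Illustrative data only: one example within the limit, one over it.
training_data = [
    {"text_input": "short prompt", "output": "short answer"},
    {"text_input": "x" * 40_001, "output": "too long"},
]

for index, length in find_oversized_examples(training_data):
    print(
        f"examples[{index}].text_input has {length} characters "
        f"(limit {MAX_TEXT_INPUT_CHARS})"
    )
```

This only reports which examples would be rejected. It does not work around the limit, so long-document examples would still need the limit raised on the service side.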