Open RomneyDa opened 2 weeks ago
Relevant docs: https://platform.openai.com/docs/guides/vision#low-or-high-fidelity-image-understanding https://docs.anthropic.com/en/docs/build-with-claude/vision#evaluate-image-size https://ai.google.dev/gemini-api/docs/vision?lang=python#prompting-images
You could set the `detail` value for OpenAI models to `auto` instead of `low`, and allow the image scaling to be determined by a model's `completionOptions` config.
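As a rough sketch of the `detail` suggestion: OpenAI's chat completions API accepts `low`, `high`, or `auto` for `detail` on an `image_url` content part. The `imageDetail` option name and the `toOpenAIImagePart` helper below are hypothetical, made up only to illustrate reading the value from per-model options and defaulting to `auto`.

```typescript
type ImageDetail = "low" | "high" | "auto";

// Hypothetical option: `imageDetail` is NOT an existing completionOptions field.
interface ImageDetailOptions {
  imageDetail?: ImageDetail;
}

// Shape of an image content part in an OpenAI chat completions request.
interface ImageUrlPart {
  type: "image_url";
  image_url: { url: string; detail: ImageDetail };
}

// Builds the image part, falling back to "auto" when nothing is configured.
function toOpenAIImagePart(
  dataUrl: string,
  options: ImageDetailOptions = {},
): ImageUrlPart {
  return {
    type: "image_url",
    image_url: { url: dataUrl, detail: options.imageDetail ?? "auto" },
  };
}
```

With this shape, a model that should stay on the cheaper path could still set `imageDetail: "low"` explicitly.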
The capabilities of different models vary a great deal. I don't think you'd want to make the config overly complex; maybe something like `imageMaxSize` with a value that represents megapixels, e.g. 1.0 would scale images down to at most 1,000,000 pixels (1000 x 1000 for a square image, preserving aspect ratio), and `imageQuality`, which controls the quality argument of `toDataURL` (0.0 - 1.0).
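A minimal sketch of how the megapixel limit could be turned into target dimensions, assuming `imageMaxSize` means a total pixel budget. The `ImageSizeOptions` interface and `fitToMegapixels` function are hypothetical names, not existing code.

```typescript
// Hypothetical option names taken from the suggestion above; not an existing schema.
interface ImageSizeOptions {
  imageMaxSize?: number; // pixel budget in megapixels, e.g. 1.0 = 1,000,000 pixels
  imageQuality?: number; // 0.0 - 1.0, intended for canvas.toDataURL's quality argument
}

interface Size {
  width: number;
  height: number;
}

// Returns the size to draw at: unchanged if the image already fits the budget,
// otherwise scaled down uniformly so the aspect ratio is preserved.
function fitToMegapixels(original: Size, options: ImageSizeOptions = {}): Size {
  const limit = options.imageMaxSize;
  if (limit === undefined || limit <= 0) {
    return original; // no limit configured
  }
  const maxPixels = limit * 1e6;
  const pixels = original.width * original.height;
  if (pixels <= maxPixels) {
    return original;
  }
  // Area grows with the square of the linear scale factor, hence the square root.
  const scale = Math.sqrt(maxPixels / pixels);
  return {
    width: Math.max(1, Math.floor(original.width * scale)),
    height: Math.max(1, Math.floor(original.height * scale)),
  };
}
```

The returned size would be applied to the canvas before drawing, and `imageQuality` passed as the second argument to `toDataURL` (it only affects lossy formats such as `image/jpeg` and `image/webp`).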
Problem
All images are scaled down, with no provider-specific handling of image resolution and no way to allow full resolution
@FallDown on Discord
Solution
Allow defining provider and model image resolution capabilities
- If provider image capabilities are known, could either NOT scale down by default and add a `completionOptions` boolean for `scaleImagesDown` or similar, OR scale down by default and add an `allowFullImageResolution` option or similar
- Otherwise, use the default (scale down or not)
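One way the second variant (scale down by default, opt in to full resolution) could look, as a sketch only. `ImageCapabilities`, `supportsFullResolution`, and `shouldScaleDown` are hypothetical names; only `allowFullImageResolution` comes from the proposal above.

```typescript
// Hypothetical description of what a provider/model can accept.
interface ImageCapabilities {
  supportsFullResolution: boolean;
}

// Hypothetical slice of completionOptions, using the proposed option name.
interface ImageScalingOptions {
  allowFullImageResolution?: boolean;
}

// Decides whether an image should be scaled down before it is sent.
function shouldScaleDown(
  capabilities: ImageCapabilities | undefined,
  options: ImageScalingOptions = {},
  defaultScaleDown: boolean = true,
): boolean {
  if (capabilities === undefined) {
    return defaultScaleDown; // capabilities unknown: fall back to the default
  }
  if (!capabilities.supportsFullResolution) {
    return true; // provider cannot take full resolution, so always scale
  }
  // Provider can take full resolution: skip scaling only on explicit opt-in.
  return options.allowFullImageResolution !== true;
}
```

The first variant (`scaleImagesDown`) would be the same function with the default and the option's polarity flipped.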