zou-group / textgrad

TextGrad: Automatic ''Differentiation'' via Text -- using large language models to backpropagate textual gradients.
http://textgrad.com/
MIT License
1.45k stars 114 forks source link

Can I use other models except claude and gpt to do the multimodal work? #60

Open DylanDDeng opened 1 month ago

DylanDDeng commented 1 month ago

Hi, I have a question for the multimodal work. When I set the engine as Yi-vision from 01-ai, it shows the error as follows

ValueError: The engine provided is not multimodal. Please provide a multimodal engine, one of the following: ['gpt-4-turbo', 'gpt-4o', 'claude-3-5-sonnet-20240620', 'claude-3-opus-20240229', 'claude-3-sonnet-20240229', 'claude-3-haiku-20240307', 'gpt-4-turbo-2024-04-09']

I think the model Yi- vision is multimodal but I am not sure the multimodalLLMCall support custom model?

mertyg commented 1 month ago

We do not yet support Yi (or other models besides claude/gpt-4 for multimodal) -- but this should be fairly easy to implement! It should basically mimic the structure of e.g., ChatAnthropic class. If someone implements this it'd be great! If not we'd get to this hopefully soon.