Open kdziedzic68 opened 1 month ago
I believe "FewShotExample type should be extended with optional field representing list of input images" is not necessary. FewShotExample
object already contains the input model object, which already contains images (for prompts with images).
So we only need to make sure that the existing API for providing few show examples works as expected:
prompt.add_few_shot(SongData(name="Alice", age=30, theme="pop", cover_image=image_data), "It's a really catchy tune.")
Currently cover_image
will be ignored. It should be used and added to the conversation as an image (provided that is present in image_input_fields
)
usage of list_few_shots should be moved to LLM method: _format_chat_for_llm because of deciding whether the given model supports vision
This is complicated. I think it would be good to discuss during grooming what kind of data should Prompt's chat()
method return:
LLM
s role to change this to a format needed by the LLM model and to decide which elements to use. LLM
s role to obtain other elements (like images) by calling prompt's methods separately and try to integrate them with the textual conversation.At the beginning of the project we discussed between 1 and 2 and decided to go with 1. Adding images showed some disadvantages of 1 (prompt alone cannot know what the particular LLM model can handle).
Currently (with the latest PR adding images to prompt and with how this ticket is written) we seem to be going the route of 3. I'm not convinced it's the best route - it seems quite wobbly (for example: knowing which image to add to which element of the conversation). I'm not convinced it's the best route - I think it would be worth revisiting our previous discussion as the team.
@mhordynski I believe you wanted to read through the comments here and decide on one of the options
Feature description
Action items:
FewShotExample
type should be extended with optional field representing list of input imageslist_few_shots
of classPrompt
should be able to recognize whether the given few shot entry contains imageslist_few_shots
should be moved toLLM
method:_format_chat_for_llm
because of deciding whether the given model supports visionMotivation
Users would need to create few shot learning systems for image processing - eg. classification etc.
Additional context
No response