Open nickthecook opened 8 months ago
I'll take a look at this. The first step is probably to allow uploading a single .jpg file, possibly compressing it in the chunking step, and then including it in chat requests.
This issue may be asking for something slightly different, though: you say the images should be searched by 'descriptions or surrounding text,' which sounds like you wouldn't be sending the image to the LLM. In your case the similarity search would run on the descriptions, and the matching images would be returned in the response. Any preference on which to do first?
Would be helpful to be able to treat images as separate documents, and search them based on descriptions or surrounding text from the PDF. These could be presented to the user along with the LLM response.
https://github.com/yob/pdf-reader/blob/main/examples/extract_images.rb
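
To make the "search images by description" idea concrete, here is a rough sketch of what treating images as separate documents could look like. Everything in it is an assumption for illustration: `ImageDoc`, `term_counts`, `cosine`, and `search_images` are hypothetical names, not part of this project or of pdf-reader, and plain word-count cosine similarity stands in for whatever embedding-based similarity search the app actually uses. It needs Ruby 2.7+ for `tally`.

```ruby
# Hypothetical record: an image extracted from a PDF, plus the description
# or surrounding text it should be searchable by.
ImageDoc = Struct.new(:path, :description)

# Lowercased word counts for a piece of text.
def term_counts(text)
  text.downcase.scan(/[a-z0-9]+/).tally
end

# Cosine similarity between two word-count hashes (0.0 if either is empty).
def cosine(a, b)
  dot = a.sum { |term, count| count * b.fetch(term, 0) }
  norm = ->(h) { Math.sqrt(h.values.sum { |v| v * v }) }
  denom = norm.call(a) * norm.call(b)
  denom.zero? ? 0.0 : dot / denom
end

# Rank image documents by how well their description matches the query.
# Only the text is compared; the image itself is never sent anywhere.
def search_images(query, docs, limit: 3)
  query_terms = term_counts(query)
  docs
    .map { |doc| [doc, cosine(query_terms, term_counts(doc.description))] }
    .select { |_, score| score.positive? }
    .sort_by { |_, score| -score }
    .first(limit)
    .map(&:first)
end
```

The returned `ImageDoc` entries could then be attached to the LLM response for display, which matches the second approach discussed above: the model only ever sees text, and the images ride along as search results.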