Open Ranking666 opened 3 weeks ago
Is there a complete example?
For now there isn't a great cookbook. MMMU.ipynb shows the basics.
I am working on a better cookbook. If there is something in particular you want to see, let me know and I can see how to fit it in.
An example with a list of input images, where the list size can vary, would be great.
What's the downstream use case here? (Not disagreeing, just curious.)
My use case is multi-page PDFs where the number of pages can vary, so I can’t predetermine a fixed number of input images (I convert multiple pages to a single image, but this still requires a flexible list size).
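To make the shape of the input concrete, here is a minimal sketch of the grouping step, independent of any particular library. The names `group_pages` and `pages_per_image` are hypothetical, and the strings stand in for rendered page images; how each group gets merged into one image and passed to the model is left out.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def group_pages(pages: Sequence[T], pages_per_image: int) -> List[List[T]]:
    """Split pages into consecutive groups of at most pages_per_image items."""
    if pages_per_image < 1:
        raise ValueError("pages_per_image must be at least 1")
    return [
        list(pages[i:i + pages_per_image])
        for i in range(0, len(pages), pages_per_image)
    ]


# Placeholder strings stand in for rendered page images (an assumption).
short_doc = group_pages(["p1", "p2", "p3"], pages_per_image=2)
long_doc = group_pages([f"p{i}" for i in range(1, 8)], pages_per_image=3)

# The number of groups depends on the page count, so the list size varies.
print(len(short_doc), len(long_doc))
```

Each inner list would become one combined image, so the outer list is the variable-length input I would need the example to handle.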
Are there any examples that use local multimodal large models, including the data format and how to load the model?