Closed: yuejunpeng closed this issue 4 months ago
There are two approaches that can address this case:
Modify the `.doc(..)` function of `FLMRModelForRetrieval` and change `pixel_values` to support multiple images per doc. You also need to modify the function in `FLMRModelForIndexing` to pass multiple images per document to the model.

You can choose based on the trade-off. Note that it would be better if the model were finetuned on Image+Text->Image+Text; without such finetuning, this use may be sub-optimal.
We plan to release a finetuning script very soon.
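To make the multi-image idea above a bit more concrete, here is a rough sketch of the batching problem it creates: documents carry different numbers of images, so the per-document image inputs have to be padded to a common count and accompanied by a mask. This is plain Python and not FLMR code. `pad_image_features` is a hypothetical helper, the nested lists are only a stand-in for whatever per-image input the model actually receives, and the model-side changes to `.doc(..)` are not shown.

```python
from typing import List, Tuple

Vector = List[float]


def pad_image_features(
    docs: List[List[Vector]], feature_dim: int
) -> Tuple[List[List[Vector]], List[List[int]]]:
    """Pad each document's list of image vectors to the same length.

    Returns (padded, mask). `padded` has shape
    [num_docs][max_images][feature_dim]; mask[i][j] is 1 for a real
    image and 0 for a padding slot.
    """
    max_images = max((len(images) for images in docs), default=0)
    padded, mask = [], []
    for images in docs:
        for vec in images:
            if len(vec) != feature_dim:
                raise ValueError("unexpected feature dimension")
        n_pad = max_images - len(images)
        # Copy real vectors, then append all-zero vectors as padding.
        rows = [list(vec) for vec in images]
        rows += [[0.0] * feature_dim for _ in range(n_pad)]
        padded.append(rows)
        mask.append([1] * len(images) + [0] * n_pad)
    return padded, mask


# Toy example: three documents with 2, 1 and 3 images (feature_dim = 4
# here only to keep the example small).
feature_dim = 4
docs = [
    [[1.0] * feature_dim, [2.0] * feature_dim],
    [[3.0] * feature_dim],
    [[4.0] * feature_dim, [5.0] * feature_dim, [6.0] * feature_dim],
]
padded, mask = pad_image_features(docs, feature_dim)
```

The mask matters because padded slots should be ignored when the model pools or scores image tokens. How that is wired into the model is an open design choice and is not covered by this sketch.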
I followed the instructions on custom documents in the README:

```
## Create document collections
num_items = 100
feature_dim = 1664
```
However, it would suit my use case better if the model supported multiple images as input. For documents, each item consists of text content and multiple images. For queries, each item consists of text content and one image. Is this possible? If so, how should the code be modified? Thank you sincerely!
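For clarity, the data layout I have in mind looks roughly like the sketch below. The class names, field names and file names are placeholders I made up for illustration. They are not part of the FLMR codebase.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryItem:
    """A query: one piece of text plus exactly one image."""
    text: str
    image_path: str


@dataclass
class DocumentItem:
    """A document: one piece of text plus any number of images."""
    text: str
    image_paths: List[str] = field(default_factory=list)


# Placeholder values, only to show the shape of the data.
query = QueryItem(text="example query text", image_path="query_0.jpg")
document = DocumentItem(
    text="example document text",
    image_paths=["doc_0_a.jpg", "doc_0_b.jpg"],
)
```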