OpenBMB / VisRAG

Parsing-free RAG supported by VLMs
Apache License 2.0

Inquiry about Training Dataset Modifications. #16

Closed MinGiSa closed 1 week ago

MinGiSa commented 2 weeks ago

Thank you for sharing such quality work. I have a question about the training dataset. In a previous response (#5), you mentioned: "The data type of the query field is text, the data type of the image field is bytes, and the data type of the source field is text, indicating which dataset this sample comes from."

  1. Rather than using a query, could we include information or a class label about the image itself for training? For example, a description like, "This image shows a dog," and so on.

  2. If it’s not possible to change the query to this format, would adding the class to the source field be an option?

Thank you for your response.

tcy6 commented 2 weeks ago

Q: Rather than using a query, could we include information or a class label about the image itself for training? For example, a description like, "This image shows a dog," and so on.

A: We do not recommend this. In a RAG scenario, a query is normally used to retrieve related documents that help the model answer a question. If you train VisRAG-Ret on descriptions instead, there will be a gap between the training data and the queries the retriever sees in use.

Q: If it’s not possible to change the query to this format, would adding the class to the source field be an option?

A: Sorry, I don't understand the question. Could you elaborate on it?

MinGiSa commented 1 week ago

@tcy6 Thank you for the quick response. To be more specific about the second question: instead of documenting sources like NeurIPS papers in the source field, I am referring to writing the class of the given image. For example, writing "dog" as the class when a dog image is provided.

tcy6 commented 1 week ago

@MinGiSa Oh, I see. Sorry. Since the original datasets don’t include the image class information, we’re unable to directly write the class (like "dog") in the source field based on the original datasets.
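
For anyone preparing their own data, the sample layout quoted from #5 (query as text, image as bytes, source as text naming the originating dataset) can be sketched roughly as below. This is only an illustration of that description, not VisRAG's actual loading code: `EXPECTED_TYPES`, `validate_sample`, the example query, the placeholder image bytes, and the dataset name are all hypothetical.

```python
from typing import Any

# Field names and types as described in the reply quoted from #5.
# The constant and function names here are hypothetical, for illustration only.
EXPECTED_TYPES = {"query": str, "image": bytes, "source": str}


def validate_sample(sample: dict[str, Any]) -> list[str]:
    """Return a list of schema problems; an empty list means the sample fits."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in sample:
            problems.append(f"missing field: {field}")
        elif not isinstance(sample[field], expected):
            actual = type(sample[field]).__name__
            problems.append(f"{field} should be {expected.__name__}, got {actual}")
    return problems


# A hypothetical training sample. The query is phrased as a question a user
# might ask, not as a description of the image, and the source names the
# dataset the sample comes from rather than an image class.
sample = {
    "query": "What breed is the dog shown on this page?",
    "image": b"\x89PNG\r\n\x1a\n",  # placeholder bytes, not a real image
    "source": "my_custom_dataset",
}

print(validate_sample(sample))
```

A check like this only confirms the field types. Whether the queries resemble what users would actually ask at retrieval time is the more important point raised above.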