I'm inspired by your nice work, "VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval" (Visualized BGE).
I would like to reproduce the model, but the VISTA project (FlagOpen/FlagEmbedding/visual/) has no training or evaluation code, unlike the other projects.
It contains only the modeling code. Could you provide the training and evaluation code so that the results can be reproduced and tested?
Thank you.