Closed: GCazottes closed this issue 3 weeks ago
Hey!
I don't think this is a bad idea per se, but it's more of a research topic than something within the scope of this lib, so I'll close the issue for now. That said, I do encourage you to pursue this as a nice research demo, and I'd be happy to help out if I can :)
Hi everyone, thank you very much for the work!
For documents that use a lot of acronyms or specific terminology, it would be useful if we could add custom text to the image during the embedding process. This custom text could serve multiple purposes, such as a summary of the entire document or definitions of acronyms or other key terms present on the page.
The goal is to leverage the attention mechanism between this added text and the page. Specifically, the model should be able to focus on this extra text (e.g., acronym definitions) while processing the embedded page, improving retrieval performance.
I'm not sure whether the model has been trained to handle this, or whether it's already implemented (in which case, sorry for the unnecessary issue; I couldn't find this functionality).
Thank you!