lightonai / pylate

Late Interaction Models Training & Retrieval
https://lightonai.github.io/pylate/
MIT License
175 stars, 7 forks

Use documents ids list #28

Closed. NohTow closed this 3 months ago

NohTow commented 3 months ago

This PR adds support for datasets that store document ids in a single "document_ids" list column instead of one column per document id. This results in a large speed-up and lower VRAM usage, as well as more readable code.
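
For illustration only, here is a minimal sketch of the difference between the two layouts, using plain Python dicts rather than the actual PyLate code. Only the "document_ids" column name comes from this PR; the `document_id_0`, `document_id_1`, ... column names, the `query_id` field, and the helper `to_document_ids_column` are hypothetical names chosen for the example.

```python
def to_document_ids_column(row, prefix="document_id_"):
    """Collapse per-document columns into a single "document_ids" list.

    `prefix` is a hypothetical naming scheme for the old one-column-per-id
    layout; the suffix after it is assumed to be an integer position.
    """
    # Collect the per-document columns and order them by their numeric suffix,
    # so that "document_id_10" comes after "document_id_2".
    doc_columns = sorted(
        (key for key in row if key.startswith(prefix)),
        key=lambda key: int(key[len(prefix):]),
    )
    # Keep every other field unchanged.
    converted = {key: value for key, value in row.items() if key not in doc_columns}
    # Store all ids in one list-valued column.
    converted["document_ids"] = [row[key] for key in doc_columns]
    return converted


# Old layout: one column per document id (hypothetical column names).
wide_row = {
    "query_id": "q1",
    "document_id_0": "d7",
    "document_id_1": "d3",
    "document_id_2": "d9",
}

# New layout: a single "document_ids" column holding the list of ids.
list_row = to_document_ids_column(wide_row)
print(list_row)
```

With a single list column, code that consumes a row can iterate over `row["document_ids"]` directly instead of looking up a variable number of column names.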