Closed NohTow closed 3 months ago
This PR introduces changes to handle datasets with a single "document_ids" column instead of one column per document id. This results in a large speed-up and lower VRAM usage, as well as more readable code.
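To illustrate the layout change, here is a minimal sketch in plain Python of collapsing one-column-per-document rows into a single "document_ids" list. It is not the code from this PR: the `document_id_<n>` column naming, the `query_id` field, and the helper `merge_document_id_columns` are all hypothetical and chosen only for the example.

```python
def merge_document_id_columns(row, prefix="document_id_"):
    """Collapse per-document columns into a single "document_ids" list.

    The "document_id_<n>" naming scheme is an assumption made for this
    sketch, not something taken from the PR.
    """
    # Collect the per-document columns, ordered by their numeric suffix
    # (so "document_id_10" sorts after "document_id_2").
    id_columns = sorted(
        (key for key in row if key.startswith(prefix) and key[len(prefix):].isdigit()),
        key=lambda key: int(key[len(prefix):]),
    )
    # Keep every other field untouched and add the merged list column.
    merged = {key: value for key, value in row.items() if key not in id_columns}
    merged["document_ids"] = [row[key] for key in id_columns]
    return merged


# Hypothetical row in the old, wide layout.
wide_row = {
    "query_id": "q1",
    "document_id_0": "d7",
    "document_id_1": "d3",
    "document_id_10": "d9",
}
compact_row = merge_document_id_columns(wide_row)
```

With a single list column, downstream code can iterate over `row["document_ids"]` directly rather than looking up a variable number of column names per row.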