As raised during a discussion, there is no point in masking padding tokens if we create them by filling with zeros (they will always yield a cosine similarity of 0), so I removed this part from the colbert_score function.
During training, however, the padding positions are not zeros but embeddings produced by the model for the padding tokens, so I kept the masking there, but changed it to rely on broadcasting instead of materializing the mask explicitly, which saves some VRAM.
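To illustrate the broadcasting approach, here is a minimal sketch of a MaxSim-style late-interaction score with a broadcast padding mask. The function name, tensor shapes, and NumPy usage are my own assumptions for illustration, not the actual implementation: instead of expanding the mask to the full `(B, Lq, Ld)` similarity shape, the `(B, 1, Ld)` view is broadcast against it, so no extra mask tensor is allocated.

```python
import numpy as np

def late_interaction_score(q_emb, d_emb, d_mask):
    """Hypothetical MaxSim scoring with a broadcast padding mask.

    q_emb:  (B, Lq, D) query token embeddings
    d_emb:  (B, Ld, D) document token embeddings
    d_mask: (B, Ld)    True for real tokens, False for padding
    """
    # Token-level similarity matrix: (B, Lq, Ld)
    sim = np.einsum("bqd,bkd->bqk", q_emb, d_emb)
    # Broadcast the (B, 1, Ld) mask over the query axis instead of
    # materializing a full (B, Lq, Ld) mask tensor.
    sim = np.where(d_mask[:, None, :], sim, -np.inf)
    # MaxSim: best document token per query token, summed over queries.
    return sim.max(axis=-1).sum(axis=-1)
```

With zero-filled padding the `np.where` step would be redundant (those columns are already 0), which is exactly why the inference-side masking could be dropped; it only matters when padding positions carry model-produced embeddings.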