Closed. liuqi6777 closed this issue 5 months ago.
This error comes from pytorch, input tensor too large. A bit strange. Is your dataset extremely large, like over 100M docs?
It has 170k documents and doesn't seem very large.
Okay I suggest googling that error to understand why pytorch doesn’t like this.
Alternatively, try indexing with the official checkpoint. In general, 768 is far too large for these vectors; you should use 64, 128, or 256 (a power of two in this range).
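For a rough sense of why the dimension matters so much here: ColBERT stores one vector per *token*, and its residual compression keeps roughly `dim * nbits / 8` bytes of residual per token embedding (this is a simplified back-of-the-envelope, not the exact on-disk layout). A sketch of the arithmetic:

```python
# Illustrative estimate of per-token residual storage under
# ColBERT-style compression with nbits bits per dimension.
# (Simplified: ignores centroid ids and metadata overhead.)

def residual_bytes(dim: int, nbits: int) -> int:
    """Approximate bytes of compressed residual per token embedding."""
    return dim * nbits // 8

for dim in (128, 768):
    for nbits in (2, 4, 8):
        print(f"dim={dim:3d}, nbits={nbits}: {residual_bytes(dim, nbits):4d} bytes/token")
```

At `nbits=2`, a 128-dim embedding compresses to about 32 bytes per token, while a 768-dim one needs about 192; raising `nbits` to 8 pushes 768 dims to 768 bytes per token, so both the larger dimension and larger `nbits` inflate the tensors the indexer has to build.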
Thanks for your reply! I will try to fix it based on your suggestion, and if there are any new situations, I will update this issue :)
Hello, I am trying to index some documents using a ColBERT model I fine-tuned myself. For my own reasons, I set the output dimension of the embeddings to 768 instead of the original 128, but I got this error when I ran the indexing code:
Here is my indexing code:
I can correctly index the documents when using the colbertv2.0 checkpoint, so I suspect the embedding output dimension is the reason I got this error. I have tried setting `nbits` to larger values like 4 or 8, but it didn't work. How can I solve this problem? Thanks in advance!