stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
MIT License
3.06k stars 388 forks

Decompression returns zero vectors #141

Closed: danielfleischer closed this issue 1 year ago

danielfleischer commented 1 year ago

Hi, when reconstructing vectors from codes and residuals I always get zero vectors. The relevant code goes through the torch extensions. Here is a minimal example:

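import torch
from colbert.indexing.codecs.residual import ResidualCodec, ResidualEmbeddings

# Load the codec (centroids, bucket weights, lookup tables) from an existing index
codec = ResidualCodec.load("/path/to/index/")

# Three centroid codes plus random residual bytes (3 rows of 32 uint8 values)
a = ResidualEmbeddings(torch.tensor([1, 2, 3]),
                       torch.randint(256, (3, 32),
                                     dtype=torch.uint8))

# Call the torch-extension decompression routine directly
codec.decompress_residuals(a.residuals,
                           codec.bucket_weights,
                           codec.reversed_bit_map,
                           codec.decompression_lookup_table,
                           a.codes,
                           codec.centroids,
                           codec.dim,
                           codec.nbits)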

This returns zero vectors. The index has content (I examined the .pt files).

Did anyone encounter this? Is this a bug or an issue with GPU drivers?

Thanks!

okhat commented 1 year ago

Do indexing and search work for you? What index are you using for the codec above?

We don't generally expose the decompression API directly, so I'm trying to see what you are trying to achieve.

danielfleischer commented 1 year ago

Indexing works. Searching works too, but the failure is subtle. Candidates are collected with the heuristic that uses the centroids as proxies. Next, the full vectors are reconstructed for ranking and top-k selection, and this is where I get zero vectors (see the issue description). The vectors become NaN after normalization, so ranking has no effect: the top-k documents look relevant (the heuristic works), but their scores are NaN. That is what started the debugging and led me to pinpoint what seems to be the issue.
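For readers following the symptom described above, here is a minimal sketch of how an all-zero vector can turn into NaN once it is divided by its own L2 norm. The helper names `l2_normalize_naive` and `count_zero_rows` are hypothetical and not part of ColBERT, and the thread does not show whether ColBERT's own normalization step behaves exactly like this naive version.

```python
import torch


def l2_normalize_naive(x: torch.Tensor) -> torch.Tensor:
    """Divide each row by its L2 norm, with no epsilon guard (hypothetical helper)."""
    return x / x.norm(p=2, dim=-1, keepdim=True)


def count_zero_rows(x: torch.Tensor) -> int:
    """Return the number of rows whose entries are all exactly zero."""
    return int((x == 0).all(dim=-1).sum().item())


# Stand-in for decompressed embeddings: one healthy row and one all-zero row.
vectors = torch.tensor([[3.0, 4.0], [0.0, 0.0]])
normalized = l2_normalize_naive(vectors)

print(count_zero_rows(vectors))              # how many rows are all zeros
print(torch.isnan(normalized).any(dim=-1))   # which rows became NaN
```

Running a check like `count_zero_rows` on the decompressed vectors before normalization would separate "decompression produced zeros" from "normalization produced NaN".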

okhat commented 1 year ago

Are you doing this to debug the standard Searcher, or are you trying to re-implement your own search? That wasn't clear to me.

danielfleischer commented 1 year ago

I want to use the library and am debugging the current code. I got NaN scores when searching, and from there I saw that the decompression returns zero vectors.

okhat commented 1 year ago

Thanks. I've never seen that before. Are you using the checkpoint we provide? What information can you provide about your collection: number of passages, passage language, lengths, etc.?

okhat commented 1 year ago

Information about the hardware will also be helpful.

okhat commented 1 year ago

Also, did you try the example provided in the Jupyter notebook? Does it work and give you non-NaN scores?

danielfleischer commented 1 year ago

Hi, here is the information you requested:

Maybe something is wrong with the index. Are there any sanity checks we can run to make sure the index is fine? Thanks!
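One possible sanity check, sketched here under assumptions: load a saved tensor from the index directory and report its shape, dtype, fraction of zeros, and whether it contains NaN. The functions `summarize_tensor` and `summarize_pt_file` are hypothetical helpers, not ColBERT APIs, and the sketch assumes each .pt file holds a single tensor, which may not be true for every file in an index.

```python
import torch


def summarize_tensor(t: torch.Tensor) -> dict:
    """Collect basic health statistics for one tensor (hypothetical helper)."""
    n = t.numel()
    return {
        "shape": tuple(t.shape),
        "dtype": str(t.dtype),
        # Fraction of entries that are exactly zero; 0.0 for an empty tensor.
        "zero_fraction": float((t == 0).sum().item()) / n if n else 0.0,
        # NaN can only occur in floating-point tensors.
        "has_nan": bool(torch.isnan(t).any().item()) if t.is_floating_point() else False,
    }


def summarize_pt_file(path: str) -> dict:
    """Load a .pt file on the CPU and summarize it, assuming it stores one tensor."""
    obj = torch.load(path, map_location="cpu")
    if not isinstance(obj, torch.Tensor):
        raise TypeError(f"{path} does not contain a single tensor: {type(obj)}")
    return summarize_tensor(obj)


# Example on an in-memory tensor shaped like the residuals in the original post.
print(summarize_tensor(torch.randint(256, (3, 32), dtype=torch.uint8)))
```

A `zero_fraction` near 1.0 or `has_nan` set to True in the stored tensors would point at the index itself. Healthy statistics would instead point at the decompression step or the environment.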

okhat commented 1 year ago

This collection is heavily tested, so the issue isn't with the dataset.

Have you tried indexing another time?

We have access to a Titan X somewhere. Let me see if someone can test on it.

danielfleischer commented 1 year ago

Hi, we were able to create a searcher object (as in the demo Jupyter code) that returns non-NaN scores on an RTX 3090.

Could there be some CUDA dependencies that only run on newer cards?

okhat commented 1 year ago

We were able to get normal behavior on a Titan X. I have to assume your specific CUDA setup on the Titan X differs in some important way.
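To compare two machines in a situation like this, it can help to collect the same environment facts on each. The following is a minimal sketch using standard PyTorch calls. The function name `cuda_environment_report` is hypothetical, and the thread does not establish which of these values, if any, explains the difference.

```python
def cuda_environment_report() -> dict:
    """Gather PyTorch and CUDA facts worth comparing across machines."""
    report = {
        "torch_version": None,
        "cuda_build": None,       # CUDA version PyTorch was built against
        "cuda_available": False,
        "compiled_archs": [],     # GPU architectures this PyTorch build targets
        "devices": [],
    }
    try:
        import torch
    except ImportError:
        return report  # PyTorch is not installed in this environment

    report["torch_version"] = torch.__version__
    report["cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()

    if report["cuda_available"]:
        report["compiled_archs"] = torch.cuda.get_arch_list()
        for i in range(torch.cuda.device_count()):
            major, minor = torch.cuda.get_device_capability(i)
            report["devices"].append({
                "name": torch.cuda.get_device_name(i),
                "compute_capability": f"{major}.{minor}",
            })
    return report


for key, value in cuda_environment_report().items():
    print(f"{key}: {value}")
```

Diffing this output between the machine that returns NaN scores and one that works would show whether the PyTorch build, the CUDA version, or the device's compute capability differs.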

danielfleischer commented 1 year ago

Ok, thanks!