stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
MIT License

troubleshooting encoding performance #301

Open jbellis opened 5 months ago

jbellis commented 5 months ago

I'm trying to do low-level encoding so I can add the vectors to my own index:

        from colbert.infra import ColBERTConfig
        from colbert.modeling.checkpoint import Checkpoint
        from colbert.indexing.collection_encoder import CollectionEncoder

        cf = ColBERTConfig(checkpoint='checkpoints/colbertv2.0')
        cp = Checkpoint(cf.checkpoint, colbert_config=cf)
        encoder = CollectionEncoder(cf, cp)
        passages = ...
        encoder.encode_passages(passages)

This works, but it is slow, and nvidia-smi reports the GPU as almost entirely idle (1%-5% utilization), even if I spin up multiple threads (each with its own encoder, of course). Is this expected?

I do see

>>> torch.cuda.is_available()
True

but that's about the extent of my troubleshooting knowledge.
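Beyond `torch.cuda.is_available()`, a few more checks can confirm which device PyTorch sees and that work actually lands on it (a sketch; device index 0 is assumed, and the block degrades gracefully on a CPU-only machine):

```python
import torch

# True only means a CUDA device is visible to PyTorch; it does not
# prove your workload is actually running on it.
available = torch.cuda.is_available()
print(available)

if available:
    # Which GPU the driver reports.
    print(torch.cuda.get_device_name(0))

    # Force a small kernel onto the GPU and wait for it to finish,
    # so nvidia-smi utilization reflects real work.
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x
    torch.cuda.synchronize()
    print(y.device)  # cuda:0
```

If this shows high utilization while your encoding loop does not, the bottleneck is likely on the host side (tokenization or data transfer) rather than the model itself.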

devinbost commented 4 months ago

A few questions:

  1. Have you tried the PyTorch profiler? https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html I'd start there.

  2. How are you loading the data? It looks like your dataset is loaded from memory, but I want to confirm there's not an issue with the loading step. PyTorch has dedicated classes for this (`Dataset` and `DataLoader`).

  3. What value are you setting for `index_bsize`? You probably want to increase it until it breaks, then back it off. If data transfers are frequently going back and forth between the CPU and GPU, that will bottleneck GPU processing.
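The profiler step above can be sketched like this: wrap the encoding call in `torch.profiler.profile` and sort the summary by CPU time to see whether the host side dominates. The `encode_step` function here is a hypothetical stand-in (a CPU matmul) for the real `encoder.encode_passages(passages)` call:

```python
import torch
from torch.profiler import profile, ProfilerActivity

def encode_step():
    # Placeholder workload; replace with encoder.encode_passages(passages)
    # when profiling the actual ColBERT encoding path.
    x = torch.randn(256, 256)
    return x @ x

# Record CPU activity always, CUDA activity only when a GPU is visible.
activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)

with profile(activities=activities, record_shapes=True) as prof:
    encode_step()

# Sorting by total CPU time highlights host-side bottlenecks
# (tokenization, Python overhead, CPU<->GPU transfers) as opposed
# to time spent inside GPU kernels.
table = prof.key_averages().table(sort_by="cpu_time_total", row_limit=10)
print(table)
```

If ops like tensor copies or tokenization dominate the table while CUDA kernels are cheap, raising the batch size (e.g. `index_bsize` on `ColBERTConfig`) and batching transfers is the usual fix.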