stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
MIT License
2.68k stars 355 forks

[c++] Calling pop() on an empty stack causes "undefined behavior" #259

Closed jessiejuachon closed 8 months ago

jessiejuachon commented 9 months ago

In filter_pids.cpp, calling global_approx_scores.pop() on an empty stack sometimes causes the searcher to hang. The behavior is intermittent because popping an empty stack is undefined behavior in C++, so the outcome depends on how the compiler and standard library in use happen to handle it.
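A guarded pop is one way to avoid this. The snippet below is a minimal sketch, not the code from filter_pids.cpp: `pop_up_to` is a hypothetical helper name, and it assumes only that the container exposes `empty()` and `pop()`, which holds for both `std::stack` and `std::priority_queue`.

```cpp
#include <cstddef>
#include <queue>
#include <stack>
#include <utility>

// Hypothetical helper (not part of ColBERT): removes at most `n` elements
// from a container adaptor, stopping early once the container is empty.
// Checking empty() before every pop() keeps the call well-defined.
template <typename Container>
std::size_t pop_up_to(Container& c, std::size_t n) {
    std::size_t popped = 0;
    while (popped < n && !c.empty()) {
        c.pop();
        ++popped;
    }
    return popped;  // number of elements actually removed
}
```

A caller that wants to discard a fixed number of entries can call `pop_up_to(container, count)` and compare the return value with `count` to detect that fewer entries were available than expected.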

Steps to reproduce:

  1. Initialize the IndexUpdater
  2. Call IndexUpdater.add to index some passages.
  3. Call IndexUpdater.persist_to_disk, updating the index files.
  4. Re-initialize the IndexUpdater
  5. Call IndexUpdater.search

Result: The application hangs after 1 to 2 calls to IndexUpdater.search.

Expected result: IndexUpdater returns search results.

I tested in two environments with the following C++ compiler versions:

Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Thread model: posix
gcc version 8.4.0 (Ubuntu 8.4.0-1ubuntu1~18.04)

Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: x86_64-apple-darwin22.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin