AnswerDotAI / RAGatouille

Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
Apache License 2.0

Hangs on #> Saving the indexing plan to /... #119

Closed · stephenbyrne99 closed this issue 6 months ago

stephenbyrne99 commented 7 months ago

I'm having an issue when using a GPU: indexing hangs on saving the indexing plan.

It hangs here:

```
[Feb 06, 20:31:34] [0]       # of sampled PIDs = 519     sampled_pids[:3] = [426, 10, 305]
[Feb 06, 20:31:34] [0]       #> Encoding 519 passages..
[Feb 06, 20:31:40] [0]       avg_doclen_est = 131.91522216796875     len(local_sample) = 519
[Feb 06, 20:31:40] [0]       Creating 4,096 partitions.
[Feb 06, 20:31:40] [0]       *Estimated* 68,464 embeddings.
[Feb 06, 20:31:40] [0]       #> Saving the indexing plan to /dir/plan.json ..
```

```python
self.rag.index(
    collection=entry_texts,
    document_ids=entry_ids,
    document_metadatas=entry_metadatas,
    index_name="name",
    max_document_length=180,
    split_documents=True,
)
```

It works fine without a GPU.

Environment:
- image: nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04
- Python version: 3.9
- confirmed CUDA is available

No additional logs appear with "COLBERT_LOAD_TORCH_EXTENSION_VERBOSE": "True" set.
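For reference, the kind of check behind "confirmed CUDA is available" above, plus setting the verbose flag up front, might look like this (a rough sketch, not the exact script; the flag has to be set before colbert-ai loads its torch extension):

```python
import os

# Set before importing/using colbert-ai so the extension loader sees it.
os.environ["COLBERT_LOAD_TORCH_EXTENSION_VERBOSE"] = "True"

import torch

print(torch.__version__, torch.version.cuda)  # torch build and its bundled CUDA version
print(torch.cuda.is_available())              # prints True, yet indexing still hangs
```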

bclavie commented 7 months ago

Hey, thank you for the report. Could you give us the full output of your pip freeze? This seems to be happening to a small subset of users, and we're trying to figure out exactly what these environments have in common.

cc @Anmol6, related to the random hangup issues.

stephenbyrne99 commented 7 months ago

Sure, thanks for taking a look @bclavie.

```
aiohttp==3.9.1
aiosignal==1.3.1
aiostream==0.4.4
anyio==4.2.0
asgiref==3.5.2
asttokens==2.4.1
async-timeout==4.0.3
attrs==23.2.0
bitarray==2.9.2
blinker==1.7.0
bytecode==0.15.1
catalogue==2.0.10
cattrs==23.2.3
certifi==2023.11.17
charset-normalizer==2.1.1
click==8.1.7
cloudpickle==2.0.0
colbert-ai==0.2.18
commonmark==0.9.1
dataclasses-json==0.6.4
datasets==2.16.1
ddsketch==2.0.4
ddtrace==1.5.2
decorator==5.1.1
Deprecated==1.2.14
dill==0.3.7
dirtyjson==1.0.8
distro==1.9.0
envier==0.5.0
exceptiongroup==1.2.0
executing==2.0.1
faiss-gpu==1.7.2
fastapi==0.88.0
fastprogress==1.0.0
filelock==3.13.1
Flask==3.0.2
frozenlist==1.4.1
fsspec==2023.10.0
git-python==1.0.3
gitdb==4.0.11
GitPython==3.1.41
greenlet==3.0.3
grpclib==0.4.3
h11==0.14.0
h2==4.1.0
hpack==4.0.0
httpcore==1.0.2
httpx==0.26.0
huggingface-hub==0.20.3
hyperframe==6.0.1
idna==3.6
importlib-metadata==4.8.1
ipython==8.18.1
itsdangerous==2.1.2
jedi==0.19.1
Jinja2==3.1.3
joblib==1.3.2
jsonpatch==1.33
jsonpointer==2.4
jsonschema==4.20.0
jsonschema-specifications==2023.12.1
langchain==0.1.5
langchain-community==0.0.18
langchain-core==0.1.19
langsmith==0.0.87
llama-index==0.9.45
MarkupSafe==2.1.5
marshmallow==3.20.2
matplotlib-inline==0.1.6
modal==0.56.4909
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.15
mypy-extensions==1.0.0
nest-asyncio==1.6.0
networkx==3.2.1
ninja==1.11.1.1
nltk==3.8.1
numpy==1.26.3
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.3.101
nvidia-nvtx-cu12==12.1.105
onnx==1.15.0
openai==1.11.1
packaging==23.2
pandas==2.2.0
parso==0.8.3
pexpect==4.9.0
pillow==10.2.0
prompt-toolkit==3.0.43
protobuf==4.25.2
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==15.0.0
pyarrow-hotfix==0.6
pydantic==1.10.13
Pygments==2.17.2
python-dateutil==2.8.2
python-dotenv==1.0.1
python-multipart==0.0.6
pytz==2024.1
PyYAML==6.0.1
RAGatouille==0.0.6b5
referencing==0.32.1
regex==2023.12.25
requests==2.31.0
rich==12.3.0
rpds-py==0.17.1
ruff==0.1.15
safetensors==0.4.2
scikit-learn==1.4.0
scipy==1.12.0
sentence-transformers==2.3.1
sentencepiece==0.1.99
six==1.16.0
smmap==5.0.1
sniffio==1.3.0
SQLAlchemy==2.0.25
srsly==2.4.8
stack-data==0.6.3
starlette==0.22.0
sympy==1.12
tblib==1.7.0
tenacity==8.2.3
threadpoolctl==3.2.0
tiktoken==0.5.2
tokenizers==0.15.1
toml==0.10.2
torch==2.2.0
tqdm==4.66.1
traitlets==5.14.1
transformers==4.37.2
triton==2.2.0
typeguard==4.1.5
typer==0.6.1
types-certifi==2021.10.8.3
types-toml==0.10.4
typing-inspect==0.9.0
typing_extensions==4.9.0
tzdata==2023.4
ujson==5.9.0
urllib3==2.2.0
voyager==2.0.2
wcwidth==0.2.13
Werkzeug==3.0.1
wrapt==1.16.0
xmltodict==0.13.0
xxhash==3.4.1
yarl==1.9.4
zipp==3.17.0
```

I'm also now seeing it hang on CPU, in a separate project with different dependencies (running locally on my Mac rather than deployed in Docker). Deps for that project are below:

```
aiohttp==3.9.1
aiosignal==1.3.1
alembic==1.13.1
annotated-types==0.6.0
anyio==4.2.0
async-timeout==4.0.3
attrs==23.2.0
backoff==2.2.1
bitarray==2.9.2
blinker==1.7.0
catalogue==2.0.10
certifi==2024.2.2
charset-normalizer==3.3.2
click==8.1.7
colbert-ai==0.2.18
colorlog==6.8.2
dataclasses-json==0.6.4
datasets==2.14.7
Deprecated==1.2.14
dill==0.3.7
dirtyjson==1.0.8
distro==1.9.0
dspy-ai @ git+https://github.com/stephenbyrne99/dspy.git@c189f3bd317dffece4598ef5749bb58208e22f86
exceptiongroup==1.2.0
faiss-cpu==1.7.4
filelock==3.13.1
Flask==3.0.2
frozenlist==1.4.1
fsspec==2023.10.0
git-python==1.0.3
gitdb==4.0.11
GitPython==3.1.41
greenlet==3.0.3
h11==0.14.0
html2text==2020.1.16
httpcore==1.0.2
httpx==0.26.0
huggingface-hub==0.20.3
idna==3.6
itsdangerous==2.1.2
Jinja2==3.1.3
joblib==1.3.2
jsonpatch==1.33
jsonpointer==2.4
langchain==0.1.5
langchain-community==0.0.18
langchain-core==0.1.19
langsmith==0.0.87
llama-hub==0.0.78
llama-index==0.9.45
Mako==1.3.2
MarkupSafe==2.1.5
marshmallow==3.20.2
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.15
mypy-extensions==1.0.0
nest-asyncio==1.6.0
networkx==3.2.1
ninja==1.11.1.1
nltk==3.8.1
numpy==1.26.4
onnx==1.15.0
openai==1.11.1
optuna==3.4.0
packaging==23.2
pandas==2.1.4
pillow==10.2.0
protobuf==4.25.2
psutil==5.9.8
pyaml==23.12.0
pyarrow==15.0.0
pyarrow-hotfix==0.6
pydantic==2.6.1
pydantic_core==2.16.2
python-dateutil==2.8.2
python-dotenv==1.0.1
pytz==2024.1
PyYAML==6.0.1
RAGatouille==0.0.6b5
regex==2023.10.3
requests==2.31.0
retrying==1.3.4
ruff==0.1.15
safetensors==0.4.2
scikit-learn==1.4.0
scipy==1.12.0
sentence-transformers==2.3.1
sentencepiece==0.1.99
six==1.16.0
smmap==5.0.1
sniffio==1.3.0
SQLAlchemy==2.0.25
srsly==2.4.8
sympy==1.12
tenacity==8.2.3
threadpoolctl==3.2.0
tiktoken==0.5.2
tokenizers==0.15.1
torch==2.2.0
tqdm==4.66.1
transformers==4.37.2
typing-inspect==0.9.0
typing_extensions==4.9.0
tzdata==2023.4
ujson==5.8.0
urllib3==2.2.0
voyager==2.0.2
Werkzeug==3.0.1
wrapt==1.16.0
xxhash==3.4.1
yarl==1.9.4
```

I'll dive into it myself as well; let me know if you have any questions or need any more info.

inteoryx commented 7 months ago

I am also running into this problem. I have tried this on several machines on vast.ai and am just following the first example.

Code I am running to produce the result:

```python
from ragatouille import RAGPretrainedModel

RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0", index_root="./index")

full_document = get_wikipedia_page("Hayao_Miyazaki")

RAG.index(
    collection=[full_document],
    document_ids=["miyazaki"],
    document_metadatas=[{"entity": "person", "source": "wikipedia"}],
    index_name="Miyazaki",
    max_document_length=180,
    split_documents=True,
)
```
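(`get_wikipedia_page` is the small helper from the RAGatouille quickstart example; roughly, assuming it hits the standard MediaWiki extracts API, it looks like this:)

```python
import requests

def get_wikipedia_page(title):
    """Fetch the plain-text extract of a Wikipedia page via the MediaWiki API."""
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "format": "json",
            "titles": title,
            "prop": "extracts",
            "explaintext": True,
        },
        headers={"User-Agent": "RAGatouille_tutorial/0.0.1"},
    )
    # The API keys pages by internal id; take the first (only) one.
    page = next(iter(response.json()["query"]["pages"].values()))
    return page.get("extract")
```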

And the output is...


```
[Feb 09, 07:33:43] [0]       #> Encoding 81 passages..
[Feb 09, 07:33:45] [0]       avg_doclen_est = 129.82716369628906     len(local_sample) = 81
[Feb 09, 07:33:45] [0]       Creating 1,024 partitions.
[Feb 09, 07:33:45] [0]       *Estimated* 10,516 embeddings.
[Feb 09, 07:33:45] [0]       #> Saving the indexing plan to ./index/colbert/indexes/Miyazaki/plan.json ..
```

At this point it will just hang as long as I leave it.

Here is my pip freeze:


```
aiosignal==1.3.1
annotated-types==0.6.0
anyio==4.2.0
archspec @ file:///croot/archspec_1697725767277/work
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens @ file:///opt/conda/conda-bld/asttokens_1646925590279/work
astunparse==1.6.3
async-lru==2.0.4
async-timeout==4.0.3
attrs @ file:///croot/attrs_1695717823297/work
Babel==2.14.0
bash_kernel==0.9.3
beautifulsoup4 @ file:///croot/beautifulsoup4-split_1681493039619/work
bitarray==2.9.2
bleach==6.1.0
blinker==1.7.0
boltons @ file:///croot/boltons_1677628692245/work
Brotli @ file:///tmp/abs_ecyw11_7ze/croots/recipe/brotli-split_1659616059936/work
catalogue==2.0.10
certifi @ file:///croot/certifi_1700501669400/work/certifi
cffi @ file:///croot/cffi_1700254295673/work
chardet @ file:///home/builder/ci_310/chardet_1640804867535/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
click @ file:///croot/click_1698129812380/work
colbert-ai==0.2.18
comm==0.2.1
conda @ file:///croot/conda_1696257509808/work
conda-build @ file:///croot/conda-build_1705600620173/work
conda-content-trust @ file:///croot/conda-content-trust_1693490622020/work
conda-libmamba-solver @ file:///croot/conda-libmamba-solver_1691418897561/work/src
conda-package-handling @ file:///croot/conda-package-handling_1690999929514/work
conda_index @ file:///croot/conda-index_1706633791028/work
conda_package_streaming @ file:///croot/conda-package-streaming_1690987966409/work
cryptography @ file:///croot/cryptography_1702070282333/work
dataclasses-json==0.6.4
datasets==2.16.1
debugpy==1.8.0
decorator @ file:///opt/conda/conda-bld/decorator_1643638310831/work
defusedxml==0.7.1
Deprecated==1.2.14
dill==0.3.7
dirtyjson==1.0.8
distro @ file:///croot/distro_1701455004953/work
dnspython==2.5.0
exceptiongroup @ file:///croot/exceptiongroup_1706031385326/work
executing @ file:///opt/conda/conda-bld/executing_1646925071911/work
expecttest==0.2.1
faiss-gpu==1.7.2
fastjsonschema==2.19.1
filelock @ file:///croot/filelock_1700591183607/work
Flask==3.0.2
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2023.10.0
git-python==1.0.3
gitdb==4.0.11
GitPython==3.1.41
gmpy2 @ file:///tmp/build/80754af9/gmpy2_1645455533097/work
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.2
httpx==0.26.0
huggingface-hub==0.20.3
hypothesis==6.97.3
idna @ file:///croot/idna_1666125576474/work
iniconfig==2.0.0
ipykernel==6.29.0
ipython @ file:///croot/ipython_1704833016303/work
ipywidgets==8.1.1
isoduration==20.11.0
itsdangerous==2.1.2
jedi @ file:///tmp/build/80754af9/jedi_1644315229345/work
Jinja2 @ file:///croot/jinja2_1666908132255/work
joblib==1.3.2
json5==0.9.14
jsonpatch==1.33
jsonpointer==2.1
jsonschema @ file:///croot/jsonschema_1699041609003/work
jsonschema-specifications @ file:///croot/jsonschema-specifications_1699032386549/work
jupyter==1.0.0
jupyter-archive==3.4.0
jupyter-console==6.6.3
jupyter-events==0.9.0
jupyter-http-over-ws==0.0.8
jupyter-lsp==2.2.2
jupyter_client==8.6.0
jupyter_core==5.7.1
jupyter_server==2.12.5
jupyter_server_terminals==0.5.2
jupyterlab==4.0.12
jupyterlab-widgets==3.0.9
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.2
langchain==0.1.6
langchain-community==0.0.19
langchain-core==0.1.22
langsmith==0.0.87
libarchive-c @ file:///tmp/build/80754af9/python-libarchive-c_1617780486945/work
libmambapy @ file:///croot/mamba-split_1698782620632/work/libmambapy
llama-index==0.9.46
MarkupSafe @ file:///croot/markupsafe_1704205993651/work
marshmallow==3.20.2
matplotlib-inline @ file:///opt/conda/conda-bld/matplotlib-inline_1662014470464/work
menuinst @ file:///croot/menuinst_1702390294373/work
mistune==3.0.2
mkl-fft @ file:///croot/mkl_fft_1695058164594/work
mkl-random @ file:///croot/mkl_random_1695059800811/work
mkl-service==2.4.0
more-itertools @ file:///croot/more-itertools_1700662129964/work
mpmath @ file:///croot/mpmath_1690848262763/work
multidict==6.0.5
multiprocess==0.70.15
mypy-extensions==1.0.0
nbclient==0.9.0
nbconvert==7.14.2
nbformat==5.9.2
nbzip==0.1.0
nest-asyncio==1.6.0
networkx @ file:///croot/networkx_1690561992265/work
ninja==1.11.1.1
nltk==3.8.1
notebook==7.0.7
notebook_shim==0.2.3
numpy @ file:///croot/numpy_and_numpy_base_1704311704800/work/dist/numpy-1.26.3-cp310-cp310-linux_x86_64.whl#sha256=a281f24b826e51f1c25bdd24960ab44b4bc294c65d81560441ba7fffd8ddd2a7
onnx==1.15.0
openai==1.12.0
optree==0.10.0
overrides==7.7.0
packaging==23.2
pandas==2.2.0
pandocfilters==1.5.1
parso @ file:///opt/conda/conda-bld/parso_1641458642106/work
pexpect @ file:///tmp/build/80754af9/pexpect_1605563209008/work
Pillow @ file:///croot/pillow_1696580024257/work
pkginfo @ file:///croot/pkginfo_1679431160147/work
platformdirs @ file:///croot/platformdirs_1692205439124/work
pluggy==1.4.0
prometheus-client==0.19.0
prompt-toolkit @ file:///croot/prompt-toolkit_1704404351921/work
protobuf==4.25.2
psutil @ file:///opt/conda/conda-bld/psutil_1656431268089/work
ptyprocess @ file:///tmp/build/80754af9/ptyprocess_1609355006118/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
pure-eval @ file:///opt/conda/conda-bld/pure_eval_1646925070566/work
pyarrow==15.0.0
pyarrow-hotfix==0.6
pycosat @ file:///croot/pycosat_1696536503704/work
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pydantic==2.6.1
pydantic_core==2.16.2
Pygments @ file:///croot/pygments_1684279966437/work
pyOpenSSL @ file:///croot/pyopenssl_1690223430423/work
PySocks @ file:///home/builder/ci_310/pysocks_1640793678128/work
pytest==8.0.0
python-dateutil==2.8.2
python-dotenv==1.0.1
python-etcd==0.4.5
python-json-logger==2.0.7
pytz @ file:///croot/pytz_1695131579487/work
PyYAML @ file:///croot/pyyaml_1698096049011/work
pyzmq==25.1.2
qtconsole==5.5.1
QtPy==2.4.1
RAGatouille==0.0.6rc1
referencing @ file:///croot/referencing_1699012038513/work
regex==2023.12.25
requests @ file:///croot/requests_1690400202158/work
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py @ file:///croot/rpds-py_1698945930462/work
ruamel.yaml @ file:///croot/ruamel.yaml_1666304550667/work
ruamel.yaml.clib @ file:///croot/ruamel.yaml.clib_1666302247304/work
ruff==0.1.15
safetensors==0.4.2
scikit-learn==1.4.0
scipy==1.12.0
Send2Trash==1.8.2
sentence-transformers==2.3.1
sentencepiece==0.1.99
six @ file:///tmp/build/80754af9/six_1644875935023/work
smmap==5.0.1
sniffio==1.3.0
sortedcontainers==2.4.0
soupsieve @ file:///croot/soupsieve_1696347547217/work
SQLAlchemy==2.0.25
srsly==2.4.8
stack-data @ file:///opt/conda/conda-bld/stack_data_1646927590127/work
sympy @ file:///croot/sympy_1701397643339/work
tenacity==8.2.3
terminado==0.18.0
threadpoolctl==3.2.0
tiktoken==0.6.0
tinycss2==1.2.1
tokenizers==0.15.1
tomli @ file:///opt/conda/conda-bld/tomli_1657175507142/work
toolz @ file:///croot/toolz_1667464077321/work
torch==2.2.0
torchaudio==2.2.0
torchelastic==0.2.2
torchvision==0.17.0
tornado==6.4
tqdm @ file:///croot/tqdm_1679561862951/work
traitlets @ file:///croot/traitlets_1671143879854/work
transformers==4.37.2
triton==2.2.0
truststore @ file:///croot/truststore_1695244293384/work
types-dataclasses==0.6.6
types-python-dateutil==2.8.19.20240106
typing-inspect==0.9.0
typing_extensions @ file:///croot/typing_extensions_1705599297034/work
tzdata==2023.4
ujson==5.9.0
uri-template==1.3.0
urllib3 @ file:///croot/urllib3_1698257533958/work
voyager==2.0.2
wcwidth @ file:///Users/ktietz/demo/mc3/conda-bld/wcwidth_1629357192024/work
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0
Werkzeug==3.0.1
widgetsnbextension==4.0.9
wrapt==1.16.0
xxhash==3.4.1
yarl==1.9.4
zstandard @ file:///croot/zstandard_1677013143055/work
```

franperic commented 7 months ago

I have the same issue.

I'm running the indexing example from the README locally on my Mac.

```
[Feb 09, 10:38:44] [0]           avg_doclen_est = 187.74490356445312     len(local_sample) = 98
[Feb 09, 10:38:44] [0]           Creating 2,048 partitions.
[Feb 09, 10:38:44] [0]           *Estimated* 18,399 embeddings.
[Feb 09, 10:38:44] [0]           #> Saving the indexing plan to .ragatouille/colbert/indexes/my_index/plan.json ..
WARNING clustering 17480 points to 2048 centroids: please provide at least 79872 training points
Clustering 17480 points in 128D to 2048 clusters, redo 1 times, 20 iterations
  Preprocessing in 0.00 s
  Iteration 5 (0.20 s, search 0.19 s): objective=2953.7 imbalance=1.465 nsplit=0
```

franperic commented 7 months ago

I just switched from Python 3.9 to 3.11 and it works for me now.

fblissjr commented 7 months ago

If anyone else runs into this issue with faiss-gpu (I did on WSL2), make sure you're using the conda package from the PyTorch nightly channel (or build from source).

NohTow commented 7 months ago

Hello, I am running into the same issue. @franperic, switching to Python 3.11 did not help :( Could you check whether you are indeed using faiss-gpu? pip only provides faiss-gpu wheels up to Python 3.10, but is fine with faiss-cpu on Python 3.11.
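(A quick way to check which faiss build pip actually installed, using only the standard library; a sketch:)

```python
from importlib.metadata import PackageNotFoundError, version

# At most one of these should print; on Python 3.11 pip can only have installed faiss-cpu.
for pkg in ("faiss-gpu", "faiss-cpu"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        pass
```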

@fblissjr, could you elaborate on the install commands you used? I tried using nightly but it still does not work.

bclavie commented 7 months ago

(Copy/pasting this message in a few related issues)

Hey guys!

Thanks a lot for bearing with me as I juggle everything and try to diagnose this. It's complicated to fix with relatively little time to dedicate to it, as the dependencies causing issues don't seem to be the same for everyone, with no clear platform pattern as of yet. Overall, the issues center around the usual suspects: faiss and CUDA.

Because of this I can't fix the issue with PLAID-optimised indices just yet, but I'm also noticing that most of the bug reports here concern relatively small collections (hundreds to low thousands of documents). To lower the barrier to entry as much as possible, https://github.com/bclavie/RAGatouille/pull/137 introduces a second index format, which doesn't actually build an index but performs an exact search over all documents (as a stepping stone towards https://github.com/bclavie/RAGatouille/issues/110, which would use an HNSW index as an in-between compromise between PLAID optimisation and exact search). This approach doesn't scale, but it offers the best possible search accuracy and still completes within a few hundred milliseconds for small collections. Ideally, it'll also open the way to shipping lower-dependency versions (https://github.com/bclavie/RAGatouille/issues/136).
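(For intuition, "exact search" here means brute-force late-interaction scoring against every stored document. A minimal sketch of the MaxSim scoring involved, not RAGatouille's actual implementation:)

```python
import torch

def maxsim_scores(query_emb, doc_embs):
    """ColBERT-style late interaction: for each document, every query token
    keeps its best-matching document token, and the maxima are summed."""
    scores = []
    for d in doc_embs:                      # d: (doc_len, dim) token embeddings
        sim = query_emb @ d.T               # (q_len, doc_len) similarity matrix
        scores.append(sim.max(dim=1).values.sum())
    return torch.stack(scores)              # one relevance score per document

# Toy usage: 32 query tokens, three docs of varying length, 128-dim embeddings.
q = torch.randn(32, 128)
docs = [torch.randn(n, 128) for n in (180, 95, 140)]
print(maxsim_scores(q, docs).argsort(descending=True))  # doc indices, best first
```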

The PR above (https://github.com/bclavie/RAGatouille/pull/137) is still a work in progress, as it needs CRUD support, tests, documentation, better precision routing (fp32/bfloat16), etc. (and potentially searching only a subset of document ids). However, it's working in a rough state for me locally. If you'd like to give it a try (with the caveat that it might very well break!), please feel free to install the library directly from the feat/full_vectors_indexing branch and add the following argument to your index() call:

```python
index(
    ...,
    index_type="FULL_VECTORS",
)
```
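For reference, installing straight from that branch would look something like this (a sketch using pip's standard git syntax with the repo and branch named above):

```
pip install -U "git+https://github.com/bclavie/RAGatouille.git@feat/full_vectors_indexing"
```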

Any feedback is appreciated, as always, and thanks again!

PulkitAgr113 commented 7 months ago

> If anyone else runs into this issue with faiss-gpu (I did on WSL2), make sure you're using the conda package from the PyTorch nightly channel (or build from source).

This worked for me (tested only with Python 3.10), both for training and indexing/searching. This is the command I used:

```
conda install -c pytorch/label/nightly -c nvidia -y faiss-gpu=1.7.4
```

deichrenner commented 6 months ago

> If you'd like to give it a try (with the caveat that it might very well break!), please feel free to install the library directly from the feat/full_vectors_indexing branch and add the following argument to your index() call: index_type="FULL_VECTORS".

Thanks @bclavie for providing this fix. It works on my M2 with Python 3.9.

bclavie commented 6 months ago

Hey all. This should FINALLY be fixed, or at least heavily alleviated, by the new experimental default indexing in 0.0.8, which skips faiss entirely (it does k-means in pure PyTorch) as long as you're indexing fewer than ~100k documents! @NohTow @fblissjr @deichrenner @inteoryx @stephenbyrne99
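(For intuition, the faiss-free path just runs k-means over the sampled embeddings directly in torch. A minimal illustration of the idea, not RAGatouille's actual code:)

```python
import torch

def kmeans_pt(x, k, iters=20):
    """Plain Lloyd's k-means on an (n, d) tensor, no faiss involved."""
    # Initialise centroids from k random points.
    centroids = x[torch.randperm(x.shape[0])[:k]].clone()
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        assignments = torch.cdist(x, centroids).argmin(dim=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            members = x[assignments == j]
            if len(members) > 0:
                centroids[j] = members.mean(dim=0)
    return centroids

# Toy usage at roughly the scale from the logs above: ~10k embeddings, 1,024 partitions.
centroids = kmeans_pt(torch.randn(10_000, 128), k=1_024)
print(centroids.shape)  # torch.Size([1024, 128])
```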

Thanks again for the help in diagnosing, too -- it took a while to realise that, in most cases, the seemingly unrelated bugs were different flavours of faiss hangups 🥲