nmslib / hnswlib

Header-only C++/python library for fast approximate nearest neighbors
https://github.com/nmslib/hnswlib
Apache License 2.0
4.12k stars 609 forks

pip installs wrong binaries on macOS arm64 #454

Closed (clstaudt closed this issue 1 year ago)

clstaudt commented 1 year ago

The pip install output indicates that there is a package for the platform macosx_11_0_arm64:

╰─ pip install hnswlib                                                                                                                             (trIAge) 
Collecting hnswlib
  Using cached hnswlib-0.7.0-cp310-cp310-macosx_11_0_arm64.whl
Requirement already satisfied: numpy in /Users/cls/miniforge3/envs/trIAge/lib/python3.10/site-packages (from hnswlib) (1.23.5)
Installing collected packages: hnswlib
Successfully installed hnswlib-0.7.0

However, the installed binaries are apparently not of the correct architecture:


---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[27], line 5
      3 temporary_file.flush()
      4 loader = TextLoader(temporary_file.name)
----> 5 docsearch = index_creator.from_loaders([loader])

File ~/miniforge3/envs/trIAge/lib/python3.10/site-packages/langchain/indexes/vectorstore.py:71, in VectorstoreIndexCreator.from_loaders(self, loaders)
     69     docs.extend(loader.load())
     70 sub_docs = self.text_splitter.split_documents(docs)
---> 71 vectorstore = self.vectorstore_cls.from_documents(
     72     sub_docs, self.embedding, **self.vectorstore_kwargs
     73 )
     74 return VectorStoreIndexWrapper(vectorstore=vectorstore)

File ~/miniforge3/envs/trIAge/lib/python3.10/site-packages/langchain/vectorstores/chroma.py:336, in Chroma.from_documents(cls, documents, embedding, ids, collection_name, persist_directory, client_settings, **kwargs)
    334 texts = [doc.page_content for doc in documents]
    335 metadatas = [doc.metadata for doc in documents]
--> 336 return cls.from_texts(
    337     texts=texts,
    338     embedding=embedding,
    339     metadatas=metadatas,
    340     ids=ids,
    341     collection_name=collection_name,
    342     persist_directory=persist_directory,
...
----> 6 import hnswlib
      7 from chromadb.db.index import Index
      8 from chromadb.errors import NoIndexException, InvalidDimensionException, NotEnoughElementsException

ImportError: dlopen(/Users/cls/miniforge3/envs/trIAge/lib/python3.10/site-packages/hnswlib.cpython-310-darwin.so, 0x0002): tried: '/Users/cls/miniforge3/envs/trIAge/lib/python3.10/site-packages/hnswlib.cpython-310-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/cls/miniforge3/envs/trIAge/lib/python3.10/site-packages/hnswlib.cpython-310-darwin.so' (no such file), '/Users/cls/miniforge3/envs/trIAge/lib/python3.10/site-packages/hnswlib.cpython-310-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))
yurymalkov commented 1 year ago

Hm. pip install hnswlib compiles the library on the user's machine. Does the error persist if you don't use conda?

clstaudt commented 1 year ago

Triggered the build step in a fresh environment (--no-cache-dir):

╰─ pip install --no-cache-dir hnswlib                                                                                   (hnswlib) 
Collecting hnswlib
  Downloading hnswlib-0.7.0.tar.gz (33 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy in ./miniforge3/envs/hnswlib/lib/python3.10/site-packages (from hnswlib) (1.24.2)
Building wheels for collected packages: hnswlib
  Building wheel for hnswlib (pyproject.toml) ... done
  Created wheel for hnswlib: filename=hnswlib-0.7.0-cp310-cp310-macosx_11_0_arm64.whl size=5739 sha256=594a7c8a6ce55bf37702f52ff0fcb53c48f4ce82c5d87809e4c1442b2ee08226
  Stored in directory: /private/var/folders/pl/9s2ysv_92pn6_2w7j2t40mh00000gn/T/pip-ephem-wheel-cache-56bbqmyd/wheels/8a/ae/ec/235a682e0041fbaeee389843670581ec6c66872db856dfa9a4
Successfully built hnswlib
Installing collected packages: hnswlib
Successfully installed hnswlib-0.7.0

Apparently this results in a mismatched binary:

╰─ python                                                                                                               (hnswlib) 
Python 3.10.10 | packaged by conda-forge | (main, Mar 24 2023, 20:12:31) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import hnswlib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: dlopen(/Users/cls/miniforge3/envs/hnswlib/lib/python3.10/site-packages/hnswlib.cpython-310-darwin.so, 0x0002): tried: '/Users/cls/miniforge3/envs/hnswlib/lib/python3.10/site-packages/hnswlib.cpython-310-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/cls/miniforge3/envs/hnswlib/lib/python3.10/site-packages/hnswlib.cpython-310-darwin.so' (no such file), '/Users/cls/miniforge3/envs/hnswlib/lib/python3.10/site-packages/hnswlib.cpython-310-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))
>>> 
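
The dlopen message above reports the architecture the loader found in the file. The same information can be read from the Mach-O header without importing the module. The following is a minimal sketch, not part of hnswlib: the helper names `macho_archs`, `file_archs` and `installed_extension_path` are made up for illustration, and only the most common CPU types are mapped to names.

```python
import importlib.util
import struct

# CPU type constants from the Mach-O format (only the common ones are named here).
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64", 7: "i386", 12: "arm"}
THIN_MAGICS = {0xFEEDFACE, 0xFEEDFACF}           # 32-bit and 64-bit thin binaries
FAT_MAGIC, FAT_MAGIC_64 = 0xCAFEBABE, 0xCAFEBABF  # universal ("fat") binaries


def macho_archs(data: bytes) -> list:
    """Return the architecture names found in the leading bytes of a Mach-O file."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes of header")
    (magic_be,) = struct.unpack(">I", data[:4])
    if magic_be in (FAT_MAGIC, FAT_MAGIC_64):
        # Fat headers are big-endian: magic, entry count, then one entry per slice.
        entry_size = 20 if magic_be == FAT_MAGIC else 32
        (count,) = struct.unpack(">I", data[4:8])
        archs = []
        for i in range(count):
            offset = 8 + i * entry_size
            if offset + 4 > len(data):
                break  # header continues beyond the bytes we were given
            (cputype,) = struct.unpack(">I", data[offset:offset + 4])
            archs.append(CPU_NAMES.get(cputype, hex(cputype)))
        return archs
    # Thin binaries store the header in the target's byte order, so try both.
    for endian in ("<", ">"):
        magic, cputype = struct.unpack(endian + "II", data[:8])
        if magic in THIN_MAGICS:
            return [CPU_NAMES.get(cputype, hex(cputype))]
    raise ValueError("not a Mach-O file")


def file_archs(path: str) -> list:
    """Read the header of the file at `path` and report its architectures."""
    with open(path, "rb") as handle:
        return macho_archs(handle.read(4096))


def installed_extension_path(module_name: str = "hnswlib"):
    """Locate a top-level module file without importing it (None if not found)."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec else None
```

With the broken install from this thread, `file_archs(installed_extension_path())` would be expected to return `['x86_64']`, matching the `have 'x86_64'` part of the error.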
clstaudt commented 1 year ago

> Does the error persist if you don't use conda?

What exactly do you mean by not using conda?

yurymalkov commented 1 year ago

It looks like you are using conda. Conda can sometimes end up with a messed-up environment, especially if it is using its own compiler. If pip install hnswlib works with your default system Python, the problem is probably in the environment. You can also try creating a clean conda environment with conda create --name myenv python=3.9

clstaudt commented 1 year ago

Been there, done that. Conda is not to blame. After debugging my entire stack, I realized that some setting somewhere in the terminal environment gave the compiler the impression that it should build for x86. After reinstalling everything, including Homebrew and the shell, I managed to get a correct binary with pip install.
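
The thread does not say which setting was responsible. As a starting point for a similar investigation, the sketch below prints a few values that commonly influence the target architecture of a source build on macOS. The choice of variables (`ARCHFLAGS`, `CC`, `CXX`, `CFLAGS`, `CXXFLAGS`, `LDFLAGS`) and the function names are assumptions made for illustration, not a list confirmed by anyone in this issue.

```python
import os
import platform
import subprocess
import sys
import sysconfig
from typing import Optional

# Environment variables that a compiler or build backend may pick up (assumed list).
BUILD_ENV_VARS = ("ARCHFLAGS", "CC", "CXX", "CFLAGS", "CXXFLAGS", "LDFLAGS")


def running_under_rosetta() -> Optional[bool]:
    """True/False on macOS when the kernel reports translation state, else None."""
    if sys.platform != "darwin":
        return None
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return None
    if result.returncode != 0:
        return None  # the key is absent, for example on Intel Macs
    return result.stdout.strip() == "1"


def build_environment_report() -> dict:
    """Collect interpreter and shell settings that can affect the build target."""
    report = {
        "interpreter": sys.executable,
        "machine": platform.machine(),            # e.g. arm64 or x86_64
        "platform_tag": sysconfig.get_platform(),  # e.g. macosx-11.0-arm64
        "rosetta": running_under_rosetta(),
    }
    for name in BUILD_ENV_VARS:
        report[name] = os.environ.get(name)
    return report


if __name__ == "__main__":
    for key, value in build_environment_report().items():
        print(f"{key}: {value}")
```

Running this in the same terminal session that runs pip matters, since the point is to see what that shell passes on to the build. An `x86_64` value for `machine`, a `True` for `rosetta`, or an `-arch x86_64` inside any of the flag variables would each be worth following up.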

clstaudt commented 1 year ago

I have requested that hnswlib be built on conda-forge. That could also help avoid this issue in the future.

bradydowling commented 1 year ago

Any additional information you can share about this would be helpful. I'm using pip to install hnswlib and haven't been able to find where it's getting the x86 version. I reinstalled brew, python, pip, and virtualenv multiple times, though I haven't reinstalled my shell entirely.

This happens to me whether I'm using a virtualenv or not.
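
One thing that may be worth trying in this situation is a forced source rebuild with the target architecture stated explicitly. The sketch below only assembles the command and environment; it assumes that the build honors the `ARCHFLAGS` environment variable, which setuptools-based builds on macOS generally do, but nothing in this thread confirms that it resolves the problem. The function name `arm64_rebuild_command` is hypothetical.

```python
import os
import sys


def arm64_rebuild_command(package: str = "hnswlib"):
    """Return (command, environment) for rebuilding `package` from source for arm64."""
    command = [
        sys.executable, "-m", "pip", "install",
        "--no-cache-dir",       # ignore any previously built (possibly x86_64) wheel
        "--force-reinstall",    # replace the copy that is already installed
        "--no-deps",            # leave numpy and other dependencies untouched
        "--no-binary", package,  # build this package from the source distribution
        package,
    ]
    # Copy the current environment and request an arm64 build explicitly.
    environment = dict(os.environ, ARCHFLAGS="-arch arm64")
    return command, environment
```

To run it, pass both values to `subprocess.run(command, env=environment, check=True)` using the same interpreter that later imports hnswlib. If the rebuilt binary is still x86_64, the cause is likely elsewhere, for example a terminal or Python running under Rosetta.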