microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

Python: Support for FAISS in Python #4130

Open juliomenendez opened 9 months ago

juliomenendez commented 9 months ago

Add support for FAISS https://faiss.ai/

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.

https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/

xiazhengtao commented 9 months ago

To add support for FAISS in your Python project, you need to follow these general steps:

  1. Install FAISS:

    • You can install FAISS using the following command:
      pip install faiss-cpu  # for CPU version

      For GPU support, install faiss-gpu instead. It requires a CUDA-capable GPU and a matching CUDA installation.

  2. Import FAISS in Your Python Script:

    • Once installed, you can import FAISS in your Python script:
      import faiss

  3. Use FAISS Functionality:

    • FAISS provides various functions and classes for similarity search and clustering. Refer to the FAISS documentation (https://faiss.ai/) and the specific documentation for the version you installed to understand how to use its features.
    • Example of using FAISS for similarity search:
      # Assumes 'x' is a float32 NumPy array of shape (n, d) holding the stored
      # vectors and 'q' is a float32 array of shape (nq, d) holding the queries
      index = faiss.IndexFlatL2(x.shape[1])  # exact index using L2 distance
      index.add(x)
      D, I = index.search(q, k=5)  # distances and positions of the 5 nearest neighbors

  4. Integrate FAISS into Your Project:

    • Incorporate FAISS functionality into your existing project. Depending on your use case, this might involve loading a pre-trained embedding model, converting its output to float32 vectors, and calling FAISS from your application logic.

  5. Test and Optimize:

    • Test the integration thoroughly to ensure that FAISS works as expected in your project.
    • Tune the index type and its parameters (for example, an exact flat index versus an approximate one) to fit your data size and latency requirements.

  6. Handle Dependencies and Versions:

    • Be mindful of FAISS dependencies and versions. Ensure compatibility with other libraries or frameworks used in your project.
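
For step 6, one quick way to see which FAISS build is present is to ask Python's packaging metadata. The sketch below uses only the standard library. The helper name faiss_status is made up for illustration, and it assumes FAISS was installed from the faiss-cpu or faiss-gpu distributions on PyPI; a conda or source build may show up as importable but without a version here.

```python
from importlib import metadata, util


def faiss_status() -> dict:
    """Report whether 'faiss' is importable and which PyPI wheel provides it."""
    status = {
        "importable": util.find_spec("faiss") is not None,
        "distribution": None,
        "version": None,
    }
    # Assumption: FAISS came from one of these two PyPI distributions.
    for dist in ("faiss-cpu", "faiss-gpu"):
        try:
            status["version"] = metadata.version(dist)
            status["distribution"] = dist
            break
        except metadata.PackageNotFoundError:
            continue
    return status


print(faiss_status())
```

Logging this at startup makes it easier to spot a mismatch between environments, for example a CPU wheel on a machine where the GPU build was expected.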

Here's a very basic example using FAISS for similarity search:

import faiss
import numpy as np

# Generate some random data for demonstration purposes
d = 64  # dimension
nb = 100000  # number of vectors
np.random.seed(123)
xb = np.random.random((nb, d)).astype('float32')

# Build the FAISS index
index = faiss.IndexFlatL2(d)
index.add(xb)

# Query for the nearest neighbors
k = 5
xq = np.random.random((1, d)).astype('float32')
D, I = index.search(xq, k)

print("Query vector:")
print(xq)
print("\nNearest neighbors:")
print(xb[I[0]])
print("\nDistances:")
print(D[0])
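
To make the D and I outputs easier to interpret: IndexFlatL2 performs an exhaustive search and reports squared Euclidean distances, with I holding row positions in the order the vectors were added. The pure-Python sketch below mimics that behaviour for a single query on a tiny dataset. The function name brute_force_l2 is hypothetical, and this is only meant as a mental model, not as how FAISS is implemented.

```python
import heapq


def brute_force_l2(database, query, k):
    """Return (squared_distances, positions) of the k rows closest to `query`."""
    scored = (
        (sum((a - b) ** 2 for a, b in zip(row, query)), pos)
        for pos, row in enumerate(database)
    )
    # nsmallest orders by squared distance first, then by row position.
    nearest = heapq.nsmallest(k, scored)
    return [dist for dist, _ in nearest], [pos for _, pos in nearest]


database = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
distances, positions = brute_force_l2(database, [0.0, 0.0], k=2)
print(positions)  # [0, 1]
print(distances)  # [0.0, 2.0]
```

Note that the second distance is 2.0, not the square root of 2: like the sketch, the flat L2 index skips the square root.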

Remember to adapt these steps to your specific project structure and requirements.

moonbox3 commented 2 months ago

@eavanvalkenburg would this be something we can work on after your new memory changes go in?

eavanvalkenburg commented 2 months ago

absolutely @moonbox3