huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

RagRetriever fails to find faiss-gpu installed with pip not conda #11720

Closed Berowne closed 3 years ago

Berowne commented 3 years ago

Information

Model I am using (RAG Retriever ...):

The problem arises when using:
  [ ] the official example scripts: (worked!)
  [ ] my own modified scripts: (give details below)

The task I am working on is:
  [x] my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. Run transformers/examples/research_projects/rag/use_own_knowledge_dataset.py. This step worked fine yesterday, before a reboot.
  2. Try to inspect the output dataset directly using RagRetriever in Python 3.6:
    from transformers import RagRetriever, RagSequenceForGeneration, RagTokenizer
    retriever = RagRetriever.from_pretrained('facebook/dpr-ctx_encoder-single-nq-base', cache_dir=cache_dir, index_name="custom", indexed_dataset='./rag/out')
    ImportError: 
    RagRetriever requires the faiss library but it was not found in your environment. Checkout the instructions on the
    installation page of its repo: https://github.com/facebookresearch/faiss/blob/master/INSTALL.md and follow the ones
    that match your environment.

Also, after running import faiss, the attribute faiss.version does not exist.
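To narrow down a report like this, it can help to check separately whether the `faiss` module is importable and which pip distribution (if any) provides it. The following is a minimal diagnostic sketch, not part of transformers; the function name `describe_faiss_install` and the list of candidate distribution names are assumptions made for illustration.

```python
import importlib.util
from importlib import metadata  # Python 3.8+; older versions need the importlib_metadata backport

# pip distribution names that can provide the importable module `faiss`
# (assumed list, for illustration only)
FAISS_DISTRIBUTIONS = ("faiss", "faiss-cpu", "faiss-gpu")


def describe_faiss_install(candidates=FAISS_DISTRIBUTIONS):
    """Report whether `faiss` is importable and which candidate distributions are installed."""
    # find_spec only locates the module; it does not execute faiss itself
    importable = importlib.util.find_spec("faiss") is not None

    installed = {}
    for name in candidates:
        try:
            installed[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed in the current environment

    return {"importable": importable, "distributions": installed}


if __name__ == "__main__":
    print(describe_faiss_install())
```

If the module is importable but none of the distribution names is listed, the package was probably installed by some other route than pip, which is worth mentioning in the report.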

Note: in our environment we have to install faiss-gpu with pip rather than conda, since the conda repos are blocked at the proxy.
qds/NLP/aws_nlp/rag/out

A sample script to query the /path/to/my_knowledge_dataset/ would be handy.
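A query script along these lines might look like the sketch below. It assumes the output directory was written by use_own_knowledge_dataset.py with that script's file names (`my_knowledge_dataset` and `my_knowledge_dataset_hnsw_index.faiss`) and uses `facebook/rag-sequence-nq` as the RAG checkpoint; the helper names `knowledge_dataset_paths`, `load_retriever`, and `answer` are hypothetical. It also needs a working faiss install, so it will not get past the error reported above on its own.

```python
import os


def knowledge_dataset_paths(output_dir):
    """Return (passages_path, index_path), assuming the example script's file names."""
    passages_path = os.path.join(output_dir, "my_knowledge_dataset")
    index_path = os.path.join(output_dir, "my_knowledge_dataset_hnsw_index.faiss")
    return passages_path, index_path


def load_retriever(output_dir, rag_model_name="facebook/rag-sequence-nq"):
    """Load the saved passages and faiss index, then wrap them in a RagRetriever."""
    # Imported lazily so the path helper above works without these packages.
    from datasets import load_from_disk
    from transformers import RagRetriever

    passages_path, index_path = knowledge_dataset_paths(output_dir)
    dataset = load_from_disk(passages_path)
    dataset.load_faiss_index("embeddings", index_path)  # requires faiss
    return RagRetriever.from_pretrained(
        rag_model_name, index_name="custom", indexed_dataset=dataset
    )


def answer(output_dir, question, rag_model_name="facebook/rag-sequence-nq"):
    """Generate an answer to `question` using the custom knowledge dataset."""
    from transformers import RagSequenceForGeneration, RagTokenizer

    retriever = load_retriever(output_dir, rag_model_name)
    model = RagSequenceForGeneration.from_pretrained(rag_model_name, retriever=retriever)
    tokenizer = RagTokenizer.from_pretrained(rag_model_name)

    input_ids = tokenizer.question_encoder(question, return_tensors="pt")["input_ids"]
    generated = model.generate(input_ids)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

For example, `answer("./rag/out", "some question about the indexed text")` would load the dataset from `./rag/out` and return the generated string.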

Berowne commented 3 years ago

Got it to work with a rebuild, and by pip-installing both faiss and faiss-gpu:

    git clone https://...github.../rag
    export TOKENIZERS_PARALLELISM=false
    pip install torch torchvision ray[default] datasets faiss faiss-gpu matplotlib seaborn pandas transformers awscli s3fs scikit-plot
    python use_own_knowledge_dataset.py --csv_path ./text.csv --output_dir ./out/text

amyeroberts commented 4 months ago

@Divyansh-Gemini This looks more like a pip issue, as faiss is available on PyPI. I'd suggest upgrading pip and trying again:

pip install --upgrade pip
pip install faiss

Divyansh-Gemini commented 4 months ago

Thanks @amyeroberts for your response. The issue has been resolved now. I was installing faiss, but the package I needed to install was faiss-cpu.