Question on FAQ Retrieval Tutorial -- with FAISS I get different performance

AMChierici commented 3 years ago

Question: Hi, I went through the FAQ tutorial and applied it to a dataset I am working on (it's public and can be downloaded here for reproducibility of the issue). I want to use FAISS because I will mainly do dense similarity retrieval. However, the implementation with FAISS gives me much lower performance than the implementation with Elasticsearch. Why is this the case?

My understanding is that the embeddings are computed using the retriever, which doesn't vary according to which document store I pick. Embeddings of query and questions should be the same, and the similarity computation the same. Is that right?

Additional context: You can check my example in this notebook. Please note that in this notebook I am using InMemoryDocumentStore instead of FAISS because I needed to save the index as a pickle file, and the FAISS format didn't allow me to. If I change the document store to FAISS, the performance is the same.

I replaced this code block from the original tutorial:

document_store = ElasticsearchDocumentStore(host="localhost", username="", password="",
                                            index="document",
                                            embedding_field="question_emb",
                                            embedding_dim=768,
                                            excluded_meta_data=["question_emb"])

With this one:

document_store = InMemoryDocumentStore(similarity="cosine")

I also had to replace finder.get_answers_via_similar_questions for retrieving answers, because it was returning an empty list. I used document_store.query_by_embedding instead (for both the FAISS and the InMemory document stores).

When I compute hits @1 and @10 (a hit means the correct answer is within the top one or top ten retrieved answers, respectively), I get 0.4411764705882353 and 0.6058823529411764 with the InMemory document store (and the same numbers with FAISS).

By contrast, I get much better results with the Elasticsearch document store: 0.58 and 0.71.
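
For readers who want to reproduce the hits @k numbers, here is a minimal sketch of how such a metric could be computed. It is plain Python and independent of Haystack; the function name `hits_at_k` and the toy data are hypothetical and are not taken from the notebook.

```python
def hits_at_k(ranked_answers, gold_answers, k):
    """Fraction of queries whose gold answer appears in the top-k retrieved answers.

    ranked_answers: one list of retrieved answers per query, best match first.
    gold_answers:   the single correct answer for each query, in the same order.
    """
    if len(ranked_answers) != len(gold_answers):
        raise ValueError("need exactly one gold answer per query")
    if not gold_answers:
        return 0.0
    hits = sum(
        1
        for ranked, gold in zip(ranked_answers, gold_answers)
        if gold in ranked[:k]
    )
    return hits / len(gold_answers)


# Toy data (hypothetical): three queries with three retrieved answers each.
ranked = [
    ["a1", "a2", "a3"],  # gold answer "a1" is at rank 1
    ["b2", "b1", "b3"],  # gold answer "b1" is at rank 2
    ["c3", "c2", "c4"],  # gold answer "c1" was not retrieved
]
gold = ["a1", "b1", "c1"]

hits_1 = hits_at_k(ranked, gold, k=1)
hits_2 = hits_at_k(ranked, gold, k=2)
```

Because the metric depends only on the ranked lists, any difference between document stores has to come from what is stored in the index or from how similarity is scored, not from the metric itself.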

tholor commented 3 years ago

@AMChierici Thanks for reporting. We will have a look. @julian-risch Can you please try to replicate and understand the root cause? Might be related to #672.

julian-risch commented 3 years ago

@AMChierici Thank you for providing code and data to replicate this issue. Much appreciated. You are right that we would expect the same performance from the different document stores (FAISS and Elasticsearch) because they use the same document similarity measure. Let me look into it.

julian-risch commented 3 years ago

Hello @AMChierici, and first of all, sorry for taking so long to get back to you. I think the problem is caused by different settings for the similarity metric. With Elasticsearch you need to use similarity="cosine", and with the other document stores you need to use similarity="dot_product", to have a fair comparison with the same settings. Sorry for the confusion. Please try FAISSDocumentStore(similarity="dot_product") and see whether you get the expected results.
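
To illustrate why the similarity setting matters, here is a small sketch in plain Python. It does not use the Haystack or FAISS APIs, and the vectors and helper names are made up. It shows that dot product and cosine similarity can rank the same documents differently when the embeddings are not unit length, and that the two agree once the vectors are L2-normalized.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean (L2) length of a vector."""
    return math.sqrt(dot(u, u))


def cosine(u, v):
    """Cosine similarity: the dot product divided by both vector lengths."""
    return dot(u, v) / (norm(u) * norm(v))


def l2_normalize(u):
    """Scale a vector to unit length."""
    n = norm(u)
    return [a / n for a in u]


def rank(query, docs, score):
    """Document names sorted by descending similarity to the query."""
    return sorted(docs, key=lambda name: score(query, docs[name]), reverse=True)


# Hypothetical 2-d "embeddings": one short vector pointing the same way as the
# query, and one longer vector pointing 45 degrees away from it.
query = [1.0, 0.0]
docs = {
    "short_aligned": [1.0, 0.0],
    "long_off_axis": [3.0, 3.0],
}

by_dot = rank(query, docs, dot)     # vector length inflates the long vector's score
by_cos = rank(query, docs, cosine)  # only the direction counts

# After L2-normalization, the dot product yields the same ranking as cosine.
unit_docs = {name: l2_normalize(vec) for name, vec in docs.items()}
by_dot_normalized = rank(l2_normalize(query), unit_docs, dot)
```

In this toy case the dot product puts `long_off_axis` first, while cosine similarity and the normalized dot product both put `short_aligned` first. Whether the embeddings in the notebook are normalized is not stated in this thread, so this is only a general illustration of the effect.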

AMChierici commented 3 years ago

Thank you, @julian-risch. If I understand correctly, similarity="dot_product" is the default in FAISSDocumentStore and I can't use similarity="cosine" in FAISSDocumentStore.

I realized the actual error was something else. When I used the doc store, it automatically connected to an existing MySQL DB that already contained some repeated questions and answers. That is why the metric was giving me different results.

Could you please confirm whether my understanding in the first paragraph is correct?

julian-risch commented 3 years ago

Your understanding of the FAISSDocumentStore is correct: similarity="dot_product" is the default, and other similarity metrics are currently not supported. Happy to hear that you were able to resolve the issue, @AMChierici. Thanks for letting us know. I will close this issue then.

deepset-ai / haystack

Question on FAQ Retrieval Tutorial -- with FAISS I get different performance #747