deepset-ai / haystack-tutorials

Here you can find all the Tutorials for Haystack 📓
https://haystack.deepset.ai/tutorials
Apache License 2.0
231 stars 79 forks source link

Build a Scalable QA System: Elasticsearch pain points #233

Closed anakin87 closed 3 months ago

anakin87 commented 11 months ago

Build a Scalable Question Answering System is an important tutorial, because it guides users to transition from InMemoryDocumentStore to ElasticsearchDocumentStore for "production" use cases.

It runs fine on Colab, but when the users try to apply it in other environments, they encounter problems starting/connecting to Elasticsearch.

I list some problems I have encountered myself (on Ubuntu 22.04).

Without Docker

Using Docker

My personal ugly solution is running: sudo docker run -d -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" docker.elastic.co/elasticsearch/elasticsearch:7.17.6 But we can't expect Haystack beginners to do this.

What we can do

This tutorial may be fine if limited to the Colab environment. I would like to have a simple guide for users to run their Elasticsearch instance on Ubuntu, MacOS and Windows... (there is something similar in the docs, but I would make it more detailed and prominent.) If we produce a guide like this, we can simply link it in the tutorial.

(FYI @bilgeyucel @Timoeller @masci)