shainaraza commented 3 years ago

Question Put your question here How to use FastAPI, haystack with Colab Additional context Add any other context or screenshots about the question (optional).

FAQ Check

[ ] Have you had a look at our new FAQ page? Yes

I have this piece of code, and I am unable to have run haystack on Colab. There is no syntax error but fastAPI does not pick the data from pipeline. Any advise?

`!pip install fastapi nest-asyncio pyngrok uvicorn !pip install git+https://github.com/deepset-ai/haystack.git

In Colab / No Docker environments: Start Elasticsearch from source

! wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.2-linux-x86_64.tar.gz -q ! tar -xzf elasticsearch-7.9.2-linux-x86_64.tar.gz ! chown -R daemon:daemon elasticsearch-7.9.2

import os from subprocess import Popen, PIPE, STDOUT es_server = Popen(['elasticsearch-7.9.2/bin/elasticsearch'], stdout=PIPE, stderr=STDOUT, preexec_fn=lambda: os.setuid(1) # as daemon )

wait until ES has started

! sleep 15

from fastapi import FastAPI from haystack.document_store.elasticsearch import ElasticsearchDocumentStore from haystack.document_stores import ElasticsearchDocumentStore

from haystack.retriever.sparse import ElasticsearchRetriever from haystack.reader.farm import FARMReader from haystack.pipeline import ExtractiveQAPipeline

initialize doc store, retriever and reader components

DOC_STORE = ElasticsearchDocumentStore( host='localhost', username='', password='', index='aurelius' ) RETRIEVER = ElasticsearchRetriever(DOC_STORE) READER = FARMReader(model_name_or_path='deepset/bert-base-cased-squad2', context_window_size=1500, use_gpu=True)

initialize pipeline

PIPELINE = ExtractiveQAPipeline(reader=READER, retriever=RETRIEVER)

initialize API

APP = FastAPI()

@APP.get('/query') async def get_query(q: str, retriever_limit: int = 10, reader_limit: int = 3): """Makes query to doc store via Haystack pipeline.

:param q: Query string representing the question being asked.
:type q: str
"""
# get answers
return PIPELINE.run(query=q,
                    top_k_retriever=retriever_limit,
                    top_k_reader=reader_limit)

from pyngrok import ngrok

Terminate open tunnels if exist

ngrok.kill()

Setting the authtoken (optional)

Get your authtoken from https://dashboard.ngrok.com/auth

ngrok.set_auth_token(NGROK_AUTH_TOKEN)

ngrok_tunnel = ngrok.connect(9200) print('Public URL:', ngrok_tunnel.public_url) nest_asyncio.apply() uvicorn.run(APP ) `

Link to Colab notebook https://colab.research.google.com/drive/191cyC5eXajgekBwJKs4hKmAiC_WmHuQ_?usp=sharing

brandenchan commented 3 years ago

Hi @shainaraza , in the colab notebook that you provide, I don't see any line that handles writing documents into the document store. Our recommendation is that you index your documents using Haystack via the document_store.write_documents(docs) method. If you have an existing Elasticsearch Database that you would like to use with Haystack, you will have to ensure that the fields in ES are named in a specific way

ZanSara commented 2 years ago

Hello @shainaraza, did you find a solution to your problem in the end? If so, please let us know :slightly_smiling_face:

shainaraza commented 2 years ago

Yes @ZanSara I found, will update this thread.