Closed: tholor closed this issue 1 year ago
Awesome! I will be eagerly looking forward to this being added to Haystack.
How about adding all such pipelines under https://github.com/deepset-ai/haystack/tree/master/rest_api/pipeline, each in its own file (e.g. EQA.yaml, DR.yaml, GQA.yaml, RR.yaml), selectable via a config parameter? Then the user would only need to update PIPELINE_YAML_PATH to get the required pipeline up and running with good defaults. Ideally these would not be mutually exclusive: I should be able to use the same pipeline for Re-Ranking, Extractive QA, and Summarization if they share some of the same model infrastructure. Which REST APIs exist would depend on how many output nodes (final sinks) the pipeline has, and the user would specify which output they want, which in turn decides the path the query travels through the pipeline.
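A minimal sketch of how that selection could work, assuming a hypothetical `PIPELINE_TASK` shortcut alongside the existing `PIPELINE_YAML_PATH` variable (file names and the `rest_api/pipeline` directory are from the proposal above; the mapping itself is an assumption):

```python
import os
from pathlib import Path

# Hypothetical mapping from task shortcuts to pipeline definition files
PIPELINES = {"eqa": "EQA.yaml", "dr": "DR.yaml", "gqa": "GQA.yaml", "rr": "RR.yaml"}

def resolve_pipeline_path(base_dir="rest_api/pipeline"):
    """Pick a pipeline YAML: an explicit PIPELINE_YAML_PATH wins,
    otherwise fall back to a task shortcut in PIPELINE_TASK."""
    explicit = os.getenv("PIPELINE_YAML_PATH")
    if explicit:
        return Path(explicit)
    task = os.getenv("PIPELINE_TASK", "eqa").lower()
    return Path(base_dir) / PIPELINES.get(task, "EQA.yaml")
```

The resolved path could then be handed to the REST API's pipeline loader at startup, so switching tasks is purely a configuration change.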
Hey! I have been using the ExtractiveQAPipeline in a Flask endpoint that I wrote. The latency is very high: it takes more than 3 minutes to return a response. My code is below.
Is there a way to make it faster?
```python
import json

from flask import Flask, request, Response

from haystack.reader.farm import FARMReader
from haystack.pipeline import ExtractiveQAPipeline
from haystack.retriever.dense import DensePassageRetriever
from haystack.preprocessor.cleaning import clean_wiki_text
from haystack.document_store import ElasticsearchDocumentStore
from haystack.preprocessor.utils import convert_files_to_dicts

app = Flask(__name__)

document_store = ElasticsearchDocumentStore(host="localhost", username="", password="",
                                            index="document", embedding_dim=768,
                                            embedding_field="embedding")
document_store.delete_documents()

doc_dir = "Path to TXT Files"
dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, split_paragraphs=True)
document_store.write_documents(dicts)

retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
)
document_store.update_embeddings(retriever)

reader = FARMReader(model_name_or_path="ahotrod/albert_xxlargev1_squad2_512", use_gpu=True)
pipeline = ExtractiveQAPipeline(reader, retriever)

port = 6021

@app.route("/main_router/", methods=["GET"])
def get_predictions():
    top_k_reader = 3
    top_k_retriever = 3
    query = request.args.get("q")
    prediction = pipeline.run(query=query, top_k_retriever=top_k_retriever, top_k_reader=top_k_reader)
    return Response(json.dumps(prediction))

if __name__ == "__main__":
    app.run(port=port)
```
I think the bottleneck is `pipeline.run()`.
Can you please suggest a fix for this issue?
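One way to confirm where the time goes is to measure each stage separately. A minimal timing wrapper (the `retriever` and `pipeline` names in the comments refer to the snippet above; wrapping them this way is a sketch, not a tested fix):

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn, print its wall-clock latency in seconds, and return its result."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    print(f"{getattr(fn, '__name__', repr(fn))}: {time.perf_counter() - start:.2f}s")
    return result

# Hypothetical usage inside the endpoint, timing retrieval and the full run separately:
#   docs = timed(retriever.retrieve, query=query, top_k=top_k_retriever)
#   prediction = timed(pipeline.run, query=query,
#                      top_k_retriever=top_k_retriever, top_k_reader=top_k_reader)
```

If the retriever is fast but the full run is slow, the reader model (an ALBERT xxlarge here) is the likely culprit rather than Elasticsearch.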
Will be addressed as part of the work for v2.
The current REST API is mainly focused on Extractive QA pipelines. Let's extend the scope to:
For each pipeline we need to check: