deepset-ai / haystack

:mag: AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0
17.57k stars 1.91k forks

Support more pipeline types in REST API #1234

Closed · tholor closed this issue 1 year ago

tholor commented 3 years ago

The current REST API is mainly focused on Extractive QA pipelines. Let's extend the scope to:

For each pipeline we need to check:
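One possible shape for the extension (a sketch only, not the actual REST API design — the registry, function names, and pipeline names below are all hypothetical): keep one loaded pipeline object per supported type and dispatch incoming requests by name. Stand-in callables are used here in place of real Haystack pipeline objects.

```python
# Hypothetical multi-pipeline dispatch layer for the REST API.
PIPELINES = {}

def register(name, pipeline):
    """Register a loaded pipeline object under a routable name."""
    PIPELINES[name] = pipeline

def run_query(pipeline_name, query, **params):
    """Dispatch a query to the named pipeline, or fail with a clear error."""
    try:
        pipeline = PIPELINES[pipeline_name]
    except KeyError:
        raise ValueError(f"Unknown pipeline '{pipeline_name}'. "
                         f"Available: {sorted(PIPELINES)}")
    return pipeline(query, **params)

# Stand-ins for real pipelines (extractive QA, document retrieval, ...):
register("extractive-qa", lambda q, **p: {"answers": [f"answer to {q!r}"]})
register("doc-retrieval", lambda q, **p: {"documents": []})
```

The REST route would then take the pipeline name from the URL or request body and call `run_query`, so adding a new pipeline type means registering it rather than writing a new endpoint.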

lalitpagaria commented 3 years ago

Awesome! This is one I have been eagerly looking forward to seeing added to Haystack.

How about adding all such pipelines under https://github.com/deepset-ai/haystack/tree/master/rest_api/pipeline, each in its own file (EQA.yaml, DR.yaml, GQA.yaml, RR.yaml, etc.), selectable via a config parameter? The user would then only need to update PIPELINE_YAML_PATH to get the required pipeline up and running with good defaults. Ideally these would not be mutually exclusive: I should be able to use the same pipeline for Re-Ranking, EQA, and Summarization if they share some of the model infrastructure. The REST endpoints would depend on how many exits (final sink or output nodes) the pipeline has, and the user would specify which output they want, which determines the path their query travels through the pipeline.

VETURISRIRAM commented 3 years ago

Hey! I have been using the ExtractiveQAPipeline behind a Flask endpoint that I wrote, but the latency is very high: it takes more than 3 minutes to return a response. Below is my code.

Is there a way to make it faster?

import json
from flask import Flask
from flask import request
from flask import Response
from haystack.reader.farm import FARMReader
from haystack.pipeline import ExtractiveQAPipeline
from haystack.retriever.dense import DensePassageRetriever
from haystack.preprocessor.cleaning import clean_wiki_text
from haystack.document_store import ElasticsearchDocumentStore
from haystack.preprocessor.utils import convert_files_to_dicts

app = Flask(__name__)

document_store = ElasticsearchDocumentStore(host="localhost", username="", password="",
                                            index="document", embedding_dim=768,
                                            embedding_field="embedding")

# One-time setup at import: wipe the index, re-convert and re-write all
# documents, and recompute their embeddings. Slow, but runs only at startup.
document_store.delete_documents()
doc_dir = "Path to TXT Files"
dicts = convert_files_to_dicts(dir_path=doc_dir, clean_func=clean_wiki_text, split_paragraphs=True)
document_store.write_documents(dicts)
retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
)
document_store.update_embeddings(retriever)

# albert_xxlarge is one of the largest SQuAD readers; inference is slow
# even on a GPU.
reader = FARMReader(model_name_or_path="ahotrod/albert_xxlargev1_squad2_512", use_gpu=True)
pipeline = ExtractiveQAPipeline(reader, retriever)
port = 6021

@app.route("/main_router/", methods=["GET"])
def get_predictions():

    top_k_reader = 3
    top_k_retriever = 3
    query = request.args.get("q")

    prediction = pipeline.run(query=query, top_k_retriever=top_k_retriever, top_k_reader=top_k_reader)

    return Response(json.dumps(prediction))

if __name__ == "__main__":
    app.run(port=port)

I think the bottleneck is pipeline.run().

Can you please suggest a way to address this?
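One quick way to confirm where the time goes (a generic timing helper, not part of Haystack — the wrapped method names in the comment are assumptions and may differ between Haystack versions) is to wrap the retriever and reader calls and compare their wall-clock times:

```python
import time
from functools import wraps

timings = {}

def timed(label):
    """Record the wall-clock duration of each call under `label`."""
    def deco(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            timings[label] = time.perf_counter() - start
            return result
        return wrapper
    return deco

# Hypothetical usage against the code above:
#   retriever.retrieve = timed("retrieve")(retriever.retrieve)
#   reader.predict = timed("read")(reader.predict)

# Self-contained demo with a stand-in step:
@timed("demo")
def slow_step():
    time.sleep(0.01)
    return "done"

slow_step()
```

If the reader dominates, trying a smaller model than albert_xxlargev1_squad2_512, or lowering top_k_retriever so the reader sees fewer passages, is usually the first thing to test.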

ZanSara commented 1 year ago

Will be addressed as part of the work for v2.