research: QA Model #10

Closed MGurcan closed 6 months ago

MGurcan commented 6 months ago

What are Question Answering Models?

Question Answering models are used to find the answer to a question within a given context. They can automate responses to frequently asked questions by using documents as context. They take a question and a context as parameters and return the best-matching answer. Here's an example usage:

question = "Where do I live?"
context = "My name is Merve and I live in İstanbul."
qa_model(question = question, context = context)
## {'answer': 'İstanbul', 'end': 39, 'score': 0.953, 'start': 31}

The Main Problem with QA Models

When a long context is given, the response time can become very long, and this is the issue we will most likely face. While researching how to optimize this, we came across Haystack.

What is Haystack?

Haystack provides an optimized system for searching for answers in contexts, and it offers different pipelines for doing so. For instance, the ExtractiveQAPipeline handles this task with Retriever and Reader components. The Retriever searches through the context documents and tries to find those most relevant to the given query (question). The Reader takes the documents the Retriever returns and extracts the best-matching answer to the question. Following this research, we tried the roberta-base-squad2 model for QA. We will continue working on this issue, but in short, you can find some basic examples we've tried below:
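
Initialize the Document Store

Before the Retriever can run, the context documents have to be written into a document store. The snippet below is only a minimal sketch assuming a recent Haystack 1.x release (where InMemoryDocumentStore supports BM25 via use_bm25=True); the sample documents are placeholders, not our actual data.

from haystack.document_stores import InMemoryDocumentStore
# In-memory store with BM25 enabled so the BM25Retriever below can search it
document_store = InMemoryDocumentStore(use_bm25=True)
# Write the context documents into the store (placeholder content here)
document_store.write_documents([
    {"content": "Rachel's favorite food is pizza."},
    {"content": "Ross works as a paleontologist."},
])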

Initialize the Retriever

from haystack.nodes import BM25Retriever
# Sparse, keyword-based retriever that searches the document store defined above
retriever = BM25Retriever(document_store=document_store)

Initialize the Reader

from haystack.nodes import FARMReader
# Extractive reader that runs the roberta-base-squad2 QA model over the retrieved documents
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2", use_gpu=True)

Creating the Retriever-Reader Pipeline

from haystack.pipelines import ExtractiveQAPipeline
# Chain the Retriever and Reader into a single query pipeline
pipe = ExtractiveQAPipeline(reader, retriever)

Asking a Question

# The Retriever returns the 10 most relevant documents; the Reader extracts the 5 best answers from them
prediction = pipe.run(
    query="What food does Rachel like most?",
    params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}},
)

By tuning the top_k parameters it is possible to make the search deeper or shallower, trading answer quality against response time: higher values consider more candidate documents and answers but take longer, while lower values are faster but may miss the best answer.
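
Printing the Answers

To quickly inspect what the pipeline returns, Haystack 1.x also ships a small helper for printing the prediction; a minimal sketch of using it on the prediction above is below (the details level shown is just one of the available options).

from haystack.utils import print_answers
# Print only the answer strings and the snippets of context they were found in
print_answers(prediction, details="minimum")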