deepset-ai / haystack

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0

Integrate LFQA with Haystack #914

Closed: lalitpagaria closed this issue 3 years ago

lalitpagaria commented 3 years ago

Creating a placeholder issue to integrate open-domain long-form question answering (LFQA) with Haystack. I feel it is very relevant to Haystack.

Hopefully we will soon see a good implementation in this regard. If someone is excited about experimenting with it, refer to the following paper, which suggests two ways to achieve this:

Article: https://ai.googleblog.com/2021/03/progress-and-challenges-in-long-form.html
Paper: https://arxiv.org/abs/2103.06332
Dataset and info: https://ai.facebook.com/blog/longform-qa/

lewtun commented 3 years ago

There's a very nice implementation by Yacine Jernite at 🤗 that could be used as a foundation to work from: https://yjernite.github.io/lfqa.html

He uses raw Elasticsearch for the retriever, so Haystack would certainly simplify a lot of that analysis!
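
For context, here is a minimal sketch of what that simplification could look like with Haystack's abstractions (module paths as of the Haystack 0.x line; the index name and documents are illustrative):

```python
# Sketch: replacing hand-written Elasticsearch query DSL with Haystack's
# document store and sparse retriever. Assumes Elasticsearch on localhost:9200.
from haystack.document_store.elasticsearch import ElasticsearchDocumentStore
from haystack.retriever.sparse import ElasticsearchRetriever

document_store = ElasticsearchDocumentStore(host="localhost", index="lfqa_docs")
document_store.write_documents([
    {"text": "Water evaporating from the skin carries heat away ...",
     "meta": {"source": "demo"}},
])

retriever = ElasticsearchRetriever(document_store=document_store)
# One call instead of a raw Elasticsearch search request:
docs = retriever.retrieve(query="Why does wet skin feel cold?", top_k=3)
print([d.text for d in docs])
```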

lalitpagaria commented 3 years ago

@lewtun would you like to work on it? I think @Timoeller would be happy :)

lewtun commented 3 years ago

hey @lalitpagaria, i would love to tackle this but unfortunately have no bandwidth for it right now 😢 if that changes and the issue is still open, i'll have a stab at it!

vblagoje commented 3 years ago

@lalitpagaria @lewtun @tholor I'd like to take this one. I implemented a quick-and-dirty prototype using Yacine's models and it seems to be working OK. Although I don't have any metrics yet, I can see that the seq2seq model is indeed generating answers conditioned on the documents given by the retriever.

What would be the ideal set of deliverables for LFQA? Perhaps we can implement LFQA in a few stages. In the first stage, we can add an initial implementation based on the existing Yacine Jernite models, including demos, but without model training. In the next stage, we can add model training, if needed. I am not sure how useful the training part would be, as ELI5 seems to be the only dataset targeting LFQA, but I could be overlooking something here as I am relatively new to this particular task.
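
For illustration, a minimal sketch of such a prototype, using Yacine Jernite's ELI5 BART checkpoint directly with transformers (the hardcoded passages stand in for retriever output, and the "question: ... context: ..." input format follows his demo, so treat the exact formatting as an assumption):

```python
# Sketch: generating a long-form answer conditioned on retrieved documents.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("yjernite/bart_eli5")
model = AutoModelForSeq2SeqLM.from_pretrained("yjernite/bart_eli5")

question = "Why does water feel colder than air at the same temperature?"
# In the real prototype these passages come from the retriever.
passages = [
    "Evaporative cooling: as water evaporates from the skin, it carries heat away ...",
    "Water conducts heat away from the body much faster than air ...",
]

# Condition the generator on question + support document (format from the demo).
conditioned_input = "question: {} context: {}".format(question, " <P> ".join(passages))
inputs = tokenizer(conditioned_input, return_tensors="pt", truncation=True, max_length=1024)
output_ids = model.generate(**inputs, min_length=64, max_length=256,
                            num_beams=4, no_repeat_ngram_size=3)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```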

Perhaps we can add all of these in one PR? LMK your preferences.

vblagoje commented 3 years ago

Hey guys, here is a preview of the LFQA implementation: https://github.com/vblagoje/haystack/tree/lfqa_h. You can also check out the notebook. LMK how best to proceed from here; I'd love to hear your feedback.

tholor commented 3 years ago

Awesome, thanks for working on it @vblagoje! A few thoughts on how to slice this work into meaningful stages / pull requests:

  1. As you already proposed, an example implementation for inference only is a good first step
  2. Adding the option to train (from plain language models) or fine-tune (take RetriBERT and continue on your smaller domain dataset). Even if ELI5 is the dominant LFQA dataset out there, people in industry might have smaller, private domain datasets that they want to use for fine-tuning. We know that similar domain datasets exist for DPR fine-tuning (see the sketch after this list).
  3. The models from Yacine Jernite seem like a great start. The next level would probably be C-REALM (retriever) and RT (generator) from https://arxiv.org/pdf/2103.06332.pdf. I only had a quick look a while ago, but they seemed to outperform RAG+BART quite a bit. The sparse attention of RT allows for longer sequences, which could be very interesting for the given generative task.
  4. Adding generative models to our benchmarking.
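
Regarding point 2, here is a sketch of how retriever fine-tuning already looks for DPR in Haystack, which a RetriBERT/LFQA trainer could mirror (Haystack 0.x module paths; the data paths and hyperparameters are placeholders):

```python
# Sketch: DPR-style retriever fine-tuning on a custom domain dataset.
from haystack.document_store.memory import InMemoryDocumentStore
from haystack.retriever.dense import DensePassageRetriever

retriever = DensePassageRetriever(
    document_store=InMemoryDocumentStore(),
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
)
retriever.train(
    data_dir="data/my_domain",    # placeholder: folder with DPR-format files
    train_filename="train.json",
    dev_filename="dev.json",
    n_epochs=3,
    batch_size=4,
    save_dir="saved_models/my_dpr",
)
```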

I haven't found the time yet to check your branch (and Yacine's notebook) in detail, but for all the above steps let's make sure to use meaningful abstractions for the retriever and generator classes that fit well with the rest of Haystack. What do I mean by that? We already have an EmbeddingRetriever (single encoder) and a DensePassageRetriever (dual encoder). If there's a big overlap with a RetribertRetriever, let's integrate it there instead. If not, let's create a new "generic" retriever class that captures the essence of RetriBERT but would also work with other base models (e.g. RoBERTa). The same goes for the Generator (RAG is quite specific here at the moment, but maybe there's potential for generalization).
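
To make the generator point concrete, here is a rough sketch of the shape such a generic seq2seq generator could take, with the model-specific input formatting injected as a callable (class and converter names are purely illustrative, not an existing Haystack API):

```python
# Illustrative shape for a generic seq2seq generator: the model-specific
# prompt construction is injected rather than hardcoded per model.
from typing import Callable, List
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

class GenericSeq2SeqGenerator:
    def __init__(self, model_name_or_path: str,
                 input_converter: Callable[[str, List[str]], str]):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
        self.model = AutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)
        self.input_converter = input_converter  # builds the conditioned input text

    def predict(self, query: str, documents: List[str], max_length: int = 256) -> str:
        text = self.input_converter(query, documents)
        inputs = self.tokenizer(text, return_tensors="pt",
                                truncation=True, max_length=1024)
        output_ids = self.model.generate(**inputs, max_length=max_length, num_beams=4)
        return self.tokenizer.decode(output_ids[0], skip_special_tokens=True)

# An ELI5/BART-specific converter matching Yacine's input format:
def eli5_converter(query: str, documents: List[str]) -> str:
    return "question: {} context: {}".format(query, " <P> ".join(documents))

generator = GenericSeq2SeqGenerator("yjernite/bart_eli5", eli5_converter)
```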

Happy to review an early PR and give more detailed feedback!

lalitpagaria commented 3 years ago

Awesome work @vblagoje.

I looked into your code and also checked https://yjernite.github.io/lfqa.html. I have the same comments as @tholor -

tholor commented 3 years ago

Implemented in #1086
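
For reference, usage of the feature as it appeared in later Haystack releases looks roughly like this (based on Haystack's LFQA tutorial; exact module paths and defaults may differ by version):

```python
# Sketch: end-to-end LFQA pipeline with a RetriBERT retriever and ELI5 BART generator.
from haystack.document_stores import FAISSDocumentStore
from haystack.nodes import EmbeddingRetriever, Seq2SeqGenerator
from haystack.pipelines import GenerativeQAPipeline

document_store = FAISSDocumentStore(embedding_dim=128, faiss_index_factory_str="Flat")
# ... document_store.write_documents(docs) with your own documents ...

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="yjernite/retribert-base-uncased",
    model_format="retribert",
)
document_store.update_embeddings(retriever)

generator = Seq2SeqGenerator(model_name_or_path="yjernite/bart_eli5")
pipe = GenerativeQAPipeline(generator, retriever)

result = pipe.run(
    query="Why does water feel colder than air at the same temperature?",
    params={"Retriever": {"top_k": 3}},
)
print(result["answers"])
```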