deepset-ai / haystack

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0

Support additional options to boost search (re-ranking, most-similar search ...). #236

Closed Utomo88 closed 4 years ago

Utomo88 commented 4 years ago

Question: The introduction mentions, "We will soon support additional options to boost search (re-ranking, most-similar search ...)." When will this be available?

Thank you

tholor commented 4 years ago

Hi @Utomo88,

We initially had those options quite early on our roadmap, but then deprioritized them in favour of more features around QA (especially reader speed & retrieval quality), as we first want to master QA before extending the scope. This is also in line with feedback from the community, as most users focus on QA and are very interested in better speed for production deployments.

I agree that the sentence in the readme was misleading and I updated it now. That being said, what options for neural search are you particularly interested in? While we don't have full support for most-similar search yet (e.g. REST API), you can already use an EmbeddingRetriever today and search for most-similar documents.
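To illustrate what "most-similar search" means at its core, here is a minimal sketch in plain Python. It is not Haystack's EmbeddingRetriever API: the document IDs and the three-dimensional vectors are made up for illustration, whereas a real retriever would obtain embeddings from a language model. The sketch ranks documents by cosine similarity to a query vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)

def most_similar(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_id, score) pairs, highest similarity first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical, hand-made embeddings (assumption: real ones come from a model).
doc_vecs = {
    "doc_sales": [0.9, 0.1, 0.0],
    "doc_marketing": [0.7, 0.3, 0.1],
    "doc_cooking": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.0]

results = most_similar(query_vec, doc_vecs)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

With these toy vectors, the two business-related documents rank above the unrelated one.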

Utomo88 commented 4 years ago

I am interested in a question-answering AI that can find answers in my own ebook collection, mostly PDF and EPUB files. Something like Google Talk to Books, but smarter and using my own ebooks: https://books.google.com/talktobooks/

I am still looking at Haystack. Is there a GUI for adding the ebook collection (or for training) and a GUI for asking questions and getting answers, or is it only available via the API?

tholor commented 4 years ago

Ok, but if I got your project right, you are then also more interested in good, scalable QA than e.g. most-similar-search / re-ranking?!

The focus of this project is really to build an open, powerful developer framework + API (so no GUI). However, we are also working on "Haystack Hub", which is an additional layer on top that accelerates many workflows (particularly in the enterprise setting) and might contain some of the GUI elements you are interested in. We plan to release a SaaS version later this year.

Did you have a look at streamlit? It should be fairly straightforward to build a GUI that sends questions to the Haystack API and displays the answers.
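A rough sketch of such a GUI could look like the following. The endpoint URL and the JSON request/response shape (`question`, `answers`, `answer`, `score`) are assumptions made for illustration and are not taken from the Haystack API docs, so they would need to be adapted to the actual deployment. Only the Streamlit calls `st.title`, `st.text_input`, `st.button` and `st.write` are used.

```python
import json
import urllib.request

# Hypothetical endpoint: adjust to wherever your QA API is actually served.
API_URL = "http://localhost:8000/query"

def build_request(question, url=API_URL):
    """Build a JSON POST request carrying the question (assumed payload shape)."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(question, url=API_URL):
    """Send the question to the API and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(question, url), timeout=30) as resp:
        return json.load(resp)

def format_answers(response):
    """Turn an assumed {'answers': [{'answer', 'score'}]} response into display lines."""
    return [f"{a['answer']} (score: {a['score']:.2f})"
            for a in response.get("answers", [])]

def render_app():
    """Minimal Streamlit page: one text box, one button, a list of answers."""
    import streamlit as st  # imported here so the helpers work without Streamlit

    st.title("Ask your documents")
    question = st.text_input("Question")
    if st.button("Ask") and question:
        for line in format_answers(ask(question)):
            st.write(line)

# Uncomment when launching with `streamlit run app.py`:
# render_app()
```

Keeping the HTTP helpers separate from `render_app` means they can be tested or reused without Streamlit installed.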

Utomo88 commented 4 years ago

Yes, I want good QA, but I thought that similar search or re-ranking meant something like this. Say I have question A, for example: "What is the best way to increase sales of the products?", and I have some books that each give some answers.

Book A has these answers: a) Direct selling b) Internet promotion using social media c) Affiliate marketing, and others

Book B has these answers: a) Affiliate marketing b) Copywriting c) Advertising

Book C has these answers: a) Advertising b) Social selling c) Others

Based on those answers, Haystack could give the best answer.
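The idea described above, picking the answer that several books agree on, can be sketched in a few lines of plain Python. This is only an illustration of the aggregation step, not a Haystack feature: the answers are hard-coded from the example (with the "others" placeholders left out), and `rank_by_agreement` is a hypothetical helper name.

```python
from collections import Counter

# Answers per book, copied from the example above (placeholder entries omitted).
answers_by_book = {
    "Book A": ["Direct selling", "Internet promotion using social media",
               "Affiliate marketing"],
    "Book B": ["Affiliate marketing", "Copywriting", "Advertising"],
    "Book C": ["Advertising", "Social selling"],
}

def rank_by_agreement(answers_by_book):
    """Rank answers by the number of books that contain them."""
    counts = Counter()
    for answers in answers_by_book.values():
        # Count each answer once per book, ignoring case and surrounding spaces.
        counts.update({a.strip().lower() for a in answers})
    # Most books first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

ranking = rank_by_agreement(answers_by_book)
for answer, n_books in ranking:
    print(f"{answer}: found in {n_books} book(s)")
```

Here "advertising" and "affiliate marketing" come out on top because each appears in two of the three books.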

Where can I find more info regarding Haystack Hub?

tholor commented 4 years ago

> I have question A, example : What is the best way to increase sales of the products and I have some books which can give some answer.

Haystack can definitely be used to find those different answers from Books A, B, C via extractive QA. No need for most similar search / re-ranking from my perspective :)

> Where I can find more info regarding the Haystack Hub ?

We plan a release for late autumn. Happy to give an update here once we have released the first version.

I am closing this issue for now, but feel free to re-open, if I haven't answered your questions sufficiently :)