deepset-ai / haystack

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0

Update allowed models to be used with Prompt Node #4019

Closed sjrl closed 1 year ago

sjrl commented 1 year ago

Is your feature request related to a problem? Please describe. I would like to expand the list of models from the HuggingFace Hub that are allowed to be used with the PromptNode. We already recommend that users use models fine-tuned on instruction datasets (e.g. flan-t5 models), but as a user I would also like to experiment with models like google/pegasus (for summarization) and any other new state-of-the-art models that might be released. Therefore, I relaxed the constraint to allow any of the valid architectures that can be used with the text2text-generation pipeline offered by HF. This is the pipeline we use under the hood for HF models in the PromptNode.

Describe the solution you'd like A PR with the change is already open: https://github.com/deepset-ai/haystack/pull/4018

Describe alternatives you've considered Leaving it as is restricts users to using only the flan-t5 models from the google repo on HuggingFace.
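The proposed relaxation can be sketched as follows: instead of matching exact checkpoint names, check the architectures declared in a model's config against the set of architectures the text2text-generation pipeline supports. This is an illustrative sketch only; the architecture set and helper names are assumptions, not Haystack's actual implementation.

```python
# Illustrative sketch: names below are assumptions, not Haystack's real code.

# Before: an exact-name allow-list permits only specific checkpoints.
ALLOWED_MODELS = {"google/flan-t5-base", "google/flan-t5-large"}

# After: accept any model whose declared architecture the HF
# text2text-generation pipeline can run (hypothetical subset shown).
TEXT2TEXT_ARCHITECTURES = {
    "T5ForConditionalGeneration",
    "PegasusForConditionalGeneration",
    "BartForConditionalGeneration",
}

def is_supported(model_name: str, architectures: list[str]) -> bool:
    """Return True if any architecture declared in the model's config
    is one the text2text-generation pipeline can handle."""
    return any(arch in TEXT2TEXT_ARCHITECTURES for arch in architectures)

# google/pegasus-xsum declares PegasusForConditionalGeneration in its
# config, so it passes the relaxed check even though it is not a flan-t5
# checkpoint from the exact-name allow-list.
print(is_supported("google/pegasus-xsum", ["PegasusForConditionalGeneration"]))
```

In the real PR the architecture list would come from the transformers library's own pipeline registry rather than a hand-written set.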

vblagoje commented 1 year ago

@sjrl this is intentional - we allow only instruction-following LLMs; otherwise, the current prompts won't work, which could confuse users a lot. We allow only T5-Flan and InstructGPT because those two models have decent support for a wide range of tasks, and the same prompts work quite well for both. For other generative text-to-text tasks, we can upgrade Seq2SeqGenerator instead. It's beyond ripe for refactoring.

Timoeller commented 1 year ago

Hey, I think it is really good to have solid quality control in place and to make sure users do not become frustrated. Thanks for thinking from the user perspective here @vblagoje

At the same time, I would like to encourage this through sensible default values, good documentation, and warning messages, rather than with strict rules. I expect this feature to be requested a lot, and I already know of one customer project where we will want to use multilingual OSS models. Is it correct that the current PromptNode doesn't work with multilingual OSS (non-GPT3) models? If so, we need to relax the constraint in any case (sorry 🥲 ).

vblagoje commented 1 year ago

I don't mind freedom and choice, especially for our users. If a model doesn't support instruction-following (IF) we should explain to users somehow that PromptNode is not going to work as it does for T5-Flan and InstructGPT and that they need to provide specialized PromptTemplate and use that particular template only. I am honestly not sure how we can do this well. All our documentation assumes that PromptNode is based on these IF models. Let's have a discussion and plan how to do this sensibly.
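To illustrate the point about specialized templates, here is a plain-Python stand-in for a prompt template (this is not Haystack's actual PromptTemplate API; the class and template texts are hypothetical). An instruction-following model can carry the task description inside the prompt, while a single-task model like pegasus expects only the raw input it was trained on, so its template has no instructions at all.

```python
# Plain-Python stand-in for a prompt template; Haystack's real
# PromptTemplate class is not reproduced here.
class SimplePromptTemplate:
    def __init__(self, name: str, prompt_text: str):
        self.name = name
        self.prompt_text = prompt_text

    def fill(self, **kwargs) -> str:
        # Substitute the template variables into the prompt text.
        return self.prompt_text.format(**kwargs)

# An instruction-following model (e.g. T5-Flan) takes the task
# description as part of the prompt itself...
if_template = SimplePromptTemplate(
    name="question-answering",
    prompt_text="Answer the question using the context.\n"
                "Context: {context}\nQuestion: {question}\nAnswer:",
)

# ...while a summarization-only model like pegasus would need a bare
# template that passes the document through with no instructions.
summarization_template = SimplePromptTemplate(
    name="pegasus-summarization",
    prompt_text="{document}",
)

print(summarization_template.fill(document="Long article text ..."))
```

This is the kind of gap the documentation would have to explain: with a non-IF model, only the matching specialized template works, and the generic instruction-style prompts silently produce poor output.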