groq not supported - would help having fast inference

NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

Other

4.17k stars 393 forks source link

groq not supported - would help having fast inference #493

Closed pechaut78 closed 5 months ago

pechaut78 commented 6 months ago

Groq can not be set as main modele in the YML, even though it is supported by langchain

drazvan commented 6 months ago

Hi @pechaut78 !

Can you share the config section where you tried setting groq as the main model? It should work with any provider supported by LangChain. Maybe there's a name mismatch somewhere.

pechaut78 commented 6 months ago

models:

type: main engine: groq

the langchain api is langchain_groq, it fails saying that the model is not listed as supported among Azure etc.

pechaut78 commented 6 months ago

groq is not a model but a cloud service (see groq cloud) be aware that this is not Elon's model.

drazvan commented 6 months ago

I see that groq is not listed here: https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/llms/__init__.py. This means you need to register it by manually.

If you add the following to your config.py (next to config.yml):

from langchain_groq import ChatGroq
from nemoguardrails.llm.providers import register_llm_provider

register_llm_provider("groq", ChatGroq)

and in your config.yml:

models:
  - type: main
    engine: groq

It should work. I've just tested. And don't forget pip install langchain-groq as per https://python.langchain.com/v0.1/docs/integrations/chat/groq/.

pechaut78 commented 6 months ago

Jesus, I was just coming to the same conclusion using the code:

from langchain_community import llms

if hasattr(llms, "get_type_to_cls_dict"):
        type_to_cls_dict = {
            k: v()
            for k, v in llms.get_type_to_cls_dict().items()

        }

For some reason (!) it is not registered..

thanks a lot, I'm submitting an issue on the langchain_community

pechaut78 commented 6 months ago

Ok, it runs the model but.. the rails are not taking the instructions into account. Works fine with GPT, switch to mistral through groq : does not work anymore, it answers forbidden questions :-)

drazvan commented 6 months ago

Yes, we need to figure out some prompts that work well for mixtral. We're looking into that on another project as well.

fwkhan commented 4 months ago

Ok, it runs the model but.. the rails are not taking the instructions into account. Works fine with GPT, switch to mistral through groq : does not work anymore, it answers forbidden questions :-)

I am experiencing the same behaviour, Groq with llama3 doesn't seem to work even though the model runs fine

Vikas9758 commented 3 months ago

Yes, we need to figure out some prompts that work well for mixtral. We're looking into that on another project as well.

Hey did you find anything on the same or could you give me some advice.

pechaut78 commented 3 months ago

I found that it nemo works fine with GPT, correctly with Llama 70B (with some incorrect answers sometimes), poorly with Llama 7b or Mixtral.

Vikas9758 commented 3 months ago

I found that it nemo works fine with GPT, correctly with Llama 70B (with some incorrect answers sometimes), poorly with Llama 7b or Mixtral.

Hmm,Ok Is it possible to use 2 models.Like 1 used in config.yml for checking whether the input prompt and output prompt are fine to display.But generate answer using a different LLM. I propose to do so as we can pass our llm in LLMRails params. I might be wrong here. But if that would be possible then wouldn't it be great

pechaut78 commented 3 months ago

You can have a LLM for nemo and a different LLM for the rest of the chain, so you can have GPT for nemo and another model after that.

Vikas9758 commented 3 months ago

great,Can you share an example if possible.I am trying to use it in one of my project

drazvan commented 3 months ago

You can also have different LLMs for the various "tasks" during the guardrail process:

models:
  ...
  - type: self_check_input
    engine: ...
    model: ...
  - type: self_check_output
    ...
  - type: generate_bot_message
    ...

More generally, for every action that defines the llm parameter, if you define an LLM model with the type set to the name of the action, it will use that model. Otherwise, it uses the main model.