quarkiverse / quarkus-langchain4j

Quarkus Langchain4j extension
https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html
Apache License 2.0

Questions arising when reading quarkus langchain4j documentation #766

Open emmanuelbernard opened 1 month ago

emmanuelbernard commented 1 month ago

https://docs.quarkiverse.io/quarkus-langchain4j/dev/ollama.html#quarkus-langchain4j-ollama_quarkus-langchain4j-ollama-chat-model-enabled What does "enabling a model" mean? What happens when it's "off"?

https://docs.quarkiverse.io/quarkus-langchain4j/dev/ollama.html#quarkus-langchain4j-ollama_quarkus-langchain4j-ollama-enable-integration When set to false, what impact does that have? Does disabling requests mean nothing works? Is the LLM not called? What's the impact on my app? OK, found the docs here: https://docs.quarkiverse.io/quarkus-langchain4j/dev/enable-disable-integrations.html; maybe reference it.

https://docs.quarkiverse.io/quarkus-langchain4j/dev/ollama.html#quarkus-langchain4j-ollama_quarkus-langchain4j-ollama-chat-model-log-requests There is a chat client log and a chat model log; what's the difference, and when to use which?

geoand commented 1 month ago

Thanks for raising these!

I'll put a PR together tomorrow that hopefully addresses the concerns and I'll ping you on it

emmanuelbernard commented 1 month ago

Also, I'm struggling with quarkus.langchain4j.embedding-model.provider. I've got Ollama and the in-process provided one, but I can't find the name I should put there (for the in-process one). So I tried to find all configuration properties in langchain4j but did not find them in the doc.

geoand commented 1 month ago

I am not sure if Ollama has an embedding provider, I'll check tomorrow

geoand commented 1 month ago

I see that Ollama does provide an embedding model. So if you have quarkus-langchain4j-ollama and no other dependency like quarkus-langchain4j-openai, then you don't need to configure quarkus.langchain4j.embedding-model.provider.

If you do have multiple ones, then quarkus.langchain4j.embedding-model.provider=ollama should be enough.
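For illustration, a minimal application.properties sketch of that second case (the property and the ollama value come from the comment above; pairing it with the OpenAI extension is just an assumed example of having multiple providers on the classpath):

# quarkus-langchain4j-ollama and quarkus-langchain4j-openai are both on the classpath;
# pick Ollama explicitly for the embedding model
quarkus.langchain4j.embedding-model.provider=ollama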

geoand commented 1 month ago

https://docs.quarkiverse.io/quarkus-langchain4j/dev/ollama.html#quarkus-langchain4j-ollama_quarkus-langchain4j-ollama-chat-model-enabled What does "enabling a model" mean? What happens when it's "off"?

This is really an advanced setting that helps in picking a model provider (Ollama, OpenAI, Mistral, Anthropic, etc.) when multiple ones are on the classpath.
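For illustration, a hedged sketch of how that selection might look in application.properties (the property name comes from the doc anchor above; treating OpenAI as the second provider on the classpath is just an assumption):

# e.g. both the Ollama and OpenAI extensions are on the classpath;
# turning the Ollama chat model off means the other provider's chat model is picked
quarkus.langchain4j.ollama.chat-model.enabled=false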

https://docs.quarkiverse.io/quarkus-langchain4j/dev/ollama.html#quarkus-langchain4j-ollama_quarkus-langchain4j-ollama-enable-integration When set to false, what impact does that have? Does disabling requests mean nothing works? Is the LLM not called? What's the impact on my app?

What this does is essentially make every call to the model throw a ModelDisabledException. The idea behind this is that you can use SmallRye Fault Tolerance to handle falling back to a canned response. TBH I am far from convinced about the usefulness of this feature, but @edeandrea seems to be fond of it :)

edeandrea commented 1 month ago

What this does is essentially make every call to the model throw a ModelDisabledException. The idea behind this is that you can use SmallRye Fault Tolerance to handle falling back to a canned response. TBH I am far from convinced about the usefulness of this feature, but @edeandrea seems to be fond of it :)

This gives you the ability to deploy/run your app without an AI backend at all and still serve canned responses (if you are using things like SmallRye Fault Tolerance).

Say you want to quickly deploy your app somewhere without paying for a service like OpenAI or having to stand up your own Ollama instance. This gives you that ability. Similarly when running in dev mode: if you don't want to run a local LLM or pay to connect to a hosted one, this lets your app serve a canned response but otherwise function normally.
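For illustration only, a minimal Java sketch of the pattern described above; the SupportAssistant interface and its fallback wiring are hypothetical, not taken from this thread, and assume SmallRye Fault Tolerance is on the classpath:

import org.eclipse.microprofile.faulttolerance.Fallback;

import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

// Hypothetical AI service: with quarkus.langchain4j.ollama.enable-integration=false,
// every model call throws a ModelDisabledException, and SmallRye Fault Tolerance
// routes the call to the canned fallback instead.
@RegisterAiService
public interface SupportAssistant {

    @Fallback(fallbackMethod = "canned")
    String chat(@UserMessage String question);

    default String canned(String question) {
        return "Sorry, the assistant is not available at the moment.";
    }
}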

emmanuelbernard commented 1 month ago

How would a canned response work for prod environments? An answer like "sorry, the service is not available at the moment", that kind of thing?

emmanuelbernard commented 1 month ago

I see that Ollama does provide an embedding model. So if you have quarkus-langchain4j-ollama and no other dependency like quarkus-langchain4j-openai, then you don't need to configure quarkus.langchain4j.embedding-model.provider.

If you do have multiple ones, then quarkus.langchain4j.embedding-model.provider=ollama should be enough.

For Ollama that makes sense, but what about when you want to use in-process embedding as described here: https://docs.quarkiverse.io/quarkus-langchain4j/dev/in-process-embedding.html? I had Ollama on the classpath but wanted to use the in-process embedding model. I did not find a solution, so I created a specific @Produces method that produces the in-process embedding model.
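For illustration, a minimal sketch of such a producer, assuming the bge-small-en-q in-process model; the exact class and package names depend on which langchain4j-embeddings-* artifact and version you add:

import dev.langchain4j.model.embedding.BgeSmallEnQuantizedEmbeddingModel; // assumed class name
import dev.langchain4j.model.embedding.EmbeddingModel;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.enterprise.inject.Produces;

public class InProcessEmbeddingProducer {

    // Expose the in-process embedding model as a CDI bean so it is used
    // instead of the one supplied by the Ollama extension.
    @Produces
    @ApplicationScoped
    EmbeddingModel embeddingModel() {
        return new BgeSmallEnQuantizedEmbeddingModel();
    }
}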

I wonder what the in-process experience should be. Should we have the model name as a shortcut for quarkus.langchain4j.embedding-model.provider?


quarkus.langchain4j.embedding-model.provider=bge-small-en-q # which is called bge-small-en-v1.5-quant in the Langchain4j doc

# OR

quarkus.langchain4j.embedding-model.provider=in-process:bge-small-en-q

# OR

quarkus.langchain4j.embedding-model.provider=in-process
quarkus.langchain4j.embedding-model.model-name=bge-small-en-q

The user would still need to add the right dependency, but we could raise an easy-to-understand exception with the dependency coordinates.

geoand commented 1 month ago

That's a very good point I never thought of.

quarkus.langchain4j.embedding-model.provider=in-process:bge-small-en-q

I like this solution best.

@jmartisk WDYT?

emmanuelbernard commented 1 month ago

BTW, if new ones show up, we could also support the FQCN of the AbstractInProcessEmbeddingModel subclass or of the EmbeddingModel interface (I'm not familiar enough with langchain4j's stability to know which is best).

emmanuelbernard commented 1 month ago

https://docs.quarkiverse.io/quarkus-langchain4j/dev/ollama.html#quarkus-langchain4j-ollama_quarkus-langchain4j-ollama-chat-model-enabled What does "enabling a model" mean? What happens when it's "off"?

This is really an advanced setting that helps in picking a model provider (Ollama, OpenAI, Mistral, Anthropic, etc.) when multiple ones are on the classpath.

https://docs.quarkiverse.io/quarkus-langchain4j/dev/ollama.html#quarkus-langchain4j-ollama_quarkus-langchain4j-ollama-enable-integration When set to false, what impact does that have? Does disabling requests mean nothing works? Is the LLM not called? What's the impact on my app?

What this does is essentially make every call to the model throw a ModelDisabledException. The idea behind this is that you can use SmallRye Fault Tolerance to handle falling back to a canned response. TBH I am far from convinced about the usefulness of this feature, but @edeandrea seems to be fond of it :)

I think for both we should explain what the property is useful for, to guide comprehension.

geoand commented 1 month ago

BTW, if new ones show up, we could also support the FQCN of the AbstractInProcessEmbeddingModel subclass or of the EmbeddingModel interface (I'm not familiar enough with langchain4j's stability to know which is best).

Should be possible

edeandrea commented 1 month ago

How would a canned response work for prod environments? An answer like "sorry, the service is not available at the moment", that kind of thing?

Perhaps you forgot to pay your OpenAI bill, or you ran out of Azure OpenAI credits (or something). This would at least let you gracefully handle the situation rather than continually hammer a downstream service that you know is going to fail, preventing a potential cascading failure.

While that is certainly a use case, I see (and use myself) this feature in non-prod environments. Maybe during the inner loop or in CI/CD you don't have (or don't want to set up) a provider. You may not really care what the response is, only that your AI service returns something so that your app is usable. This is a great feature in that case.

jmartisk commented 1 month ago

That's a very good point I never thought of.

quarkus.langchain4j.embedding-model.provider=in-process:bge-small-en-q

I like this solution best.

@jmartisk WDYT?

Personally I'm not a fan of magic shortcuts; an FQCN is unambiguous and clear... But I think we could still try to improve the error message to suggest adding the right dependency (if the configured class name is one of the "known" ones).
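For comparison, a hedged sketch of the two configuration shapes under discussion (neither exists at the time of this thread; the class name is the assumed one for bge-small-en-q):

# shortcut form proposed earlier in the thread
quarkus.langchain4j.embedding-model.provider=in-process:bge-small-en-q

# FQCN form preferred here (class name assumed)
quarkus.langchain4j.embedding-model.provider=dev.langchain4j.model.embedding.BgeSmallEnQuantizedEmbeddingModel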