quarkiverse / quarkus-langchain4j

Quarkus Langchain4j extension
https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html
Apache License 2.0
134 stars 80 forks source link

feat: ✨ ollama streaming mode #522

Closed philippart-s closed 5 months ago

philippart-s commented 5 months ago

PR to try to add the streaming feature to Ollama provider through the extension 😉

First commits are the setup of the the feature 🏗️.

geoand commented 5 months ago

Thanks, this is a nice start! The most important part now is the implementation of OllamaStreamingChatLanguageModel.

philippart-s commented 5 months ago

Thanks, this is a nice start! The most important part now is the implementation of OllamaStreamingChatLanguageModel.

Yes, I've started the implementation, I need to correct one or two mistakes but it's coming along!

andreadimaio commented 5 months ago

@philippart-s I think that this link can help you to understand how to create the REST API call for the streaming. About the code implementation, you might want to take a look at what I did for the BAM module here and here (I hope this can help you).

philippart-s commented 5 months ago

@geoand : I've a first version that I want to test but when I run the integration tests (for example https://github.com/quarkiverse/quarkus-langchain4j/tree/main/integration-tests/ollama, I've this error: 2024-04-30 15:13:06,715 ERROR [io.qua.dep.dev.IsolatedDevModeMain] (main) Failed to start quarkus: java.lang.IllegalStateException: No config found for interface io.quarkiverse.langchain4j.runtime.aiservice.ChatMemoryConfig

I run the quarkus dev command. Is there a tips to run the examples in the integration-test folder?

geoand commented 5 months ago

🎉

I would first build the entire project and then do mvn quarkus:dev in the integration-tests/ollama directory.

philippart-s commented 5 months ago

🎉

I would first build the entire project and then do mvn quarkus:dev in the integration-tests/ollama directory.

it's what I did:

➜ cd quarkus-langchain4j/integration-tests/ollama 

➜ mvn quarkus:dev
[INFO] Scanning for projects...
[INFO] 
[INFO] --< io.quarkiverse.langchain4j:quarkus-langchain4j-integration-test-ollama >--
[INFO] Building Quarkus LangChain4j - Integration Tests - Ollama 999-SNAPSHOT
[INFO]   from pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- quarkus:3.8.2:dev (default-cli) @ quarkus-langchain4j-integration-test-ollama ---
[INFO] Invoking enforcer:3.4.1:enforce (enforce-java-version) @ quarkus-langchain4j-integration-test-ollama
[INFO] Rule 0: org.apache.maven.enforcer.rules.BannedRepositories passed
[INFO] Rule 1: org.apache.maven.enforcer.rules.version.RequireJavaVersion passed
[INFO] Invoking enforcer:3.4.1:enforce (enforce-maven-version) @ quarkus-langchain4j-integration-test-ollama
[INFO] Rule 0: org.apache.maven.enforcer.rules.version.RequireMavenVersion passed
[INFO] Invoking sundr:0.103.1:generate-bom (default) @ quarkus-langchain4j-integration-test-ollama
[INFO] Invoking buildnumber:3.2.0:create (get-scm-revision) @ quarkus-langchain4j-integration-test-ollama
[INFO] Executing: /bin/sh -c cd '/Users/stef/Dev/quarkus-langchain4j/integration-tests/ollama' && 'git' 'rev-parse' '--verify' 'HEAD'
[INFO] Working directory: /Users/stef/Dev/quarkus-langchain4j/integration-tests/ollama
[INFO] Storing buildNumber: efc0de2213cdc181d26d9377a46890c403b0642b at timestamp: 1714568697753
[INFO] Executing: /bin/sh -c cd '/Users/stef/Dev/quarkus-langchain4j/integration-tests/ollama' && 'git' 'symbolic-ref' 'HEAD'
[INFO] Working directory: /Users/stef/Dev/quarkus-langchain4j/integration-tests/ollama
[INFO] Storing scmBranch: ollama-streaming
[INFO] Invoking formatter:2.23.0:format (format-sources) @ quarkus-langchain4j-integration-test-ollama
[INFO] Processed 1 files in 259ms (Formatted: 0, Skipped: 1, Unchanged: 0, Failed: 0, Readonly: 0)
[INFO] Invoking impsort:1.9.0:sort (sort-imports) @ quarkus-langchain4j-integration-test-ollama
[INFO] Processed 1 files in 00:00.005 (Already Sorted: 1, Needed Sorting: 0)
[INFO] Invoking resources:3.3.1:resources (default-resources) @ quarkus-langchain4j-integration-test-ollama
[INFO] Copying 1 resource from src/main/resources to target/classes
[INFO] Invoking compiler:3.12.1:compile (default-compile) @ quarkus-langchain4j-integration-test-ollama
[INFO] Nothing to compile - all classes are up to date.
[INFO] Invoking resources:3.3.1:testResources (default-testResources) @ quarkus-langchain4j-integration-test-ollama
[INFO] skip non existing resourceDirectory /Users/stef/Dev/quarkus-langchain4j/integration-tests/ollama/src/test/resources
[INFO] Invoking compiler:3.12.1:testCompile (default-testCompile) @ quarkus-langchain4j-integration-test-ollama
[INFO] No sources to compile
[WARNING] [io.quarkus.bootstrap.devmode.DependenciesFilter] Live reload was disabled for the following project artifacts:
- io.quarkiverse.langchain4j:quarkus-langchain4j-ollama:999-SNAPSHOT
- io.quarkiverse.langchain4j:quarkus-langchain4j-core:999-SNAPSHOT
- io.quarkiverse.langchain4j:quarkus-langchain4j-core-runtime-spi:999-SNAPSHOT
- io.quarkiverse.langchain4j:quarkus-langchain4j-core-deployment:999-SNAPSHOT
- io.quarkiverse.langchain4j:quarkus-langchain4j-ollama-deployment:999-SNAPSHOT
The artifacts above appear to be either dependencies of non-reloadable application dependencies or Quarkus extensions
Listening for transport dt_socket at address: 5005
2024-05-01 15:05:00,120 INFO  [io.qua.dep.dev.IsolatedDevModeMain] (main) Attempting to start live reload endpoint to recover from previous Quarkus startup failure
2024-05-01 15:05:00,335 ERROR [io.qua.dep.dev.IsolatedDevModeMain] (main) Failed to start quarkus: java.lang.IllegalStateException: No config found for interface io.quarkiverse.langchain4j.runtime.aiservice.ChatMemoryConfig
        at io.quarkus.deployment.ExtensionLoader.loadStepsFrom(ExtensionLoader.java:186)
        at io.quarkus.deployment.QuarkusAugmentor.run(QuarkusAugmentor.java:107)
        at io.quarkus.runner.bootstrap.AugmentActionImpl.runAugment(AugmentActionImpl.java:330)
        at io.quarkus.runner.bootstrap.AugmentActionImpl.createInitialRuntimeApplication(AugmentActionImpl.java:251)
        at io.quarkus.runner.bootstrap.AugmentActionImpl.createInitialRuntimeApplication(AugmentActionImpl.java:60)
        at io.quarkus.deployment.dev.IsolatedDevModeMain.firstStart(IsolatedDevModeMain.java:112)
        at io.quarkus.deployment.dev.IsolatedDevModeMain.accept(IsolatedDevModeMain.java:433)
        at io.quarkus.deployment.dev.IsolatedDevModeMain.accept(IsolatedDevModeMain.java:55)
        at io.quarkus.bootstrap.app.CuratedApplication.runInCl(CuratedApplication.java:138)
        at io.quarkus.bootstrap.app.CuratedApplication.runInAugmentClassLoader(CuratedApplication.java:93)
        at io.quarkus.deployment.dev.DevModeMain.start(DevModeMain.java:131)
        at io.quarkus.deployment.dev.DevModeMain.main(DevModeMain.java:62)

I have this error for all tests in the integration-tests, I should do something wrong but I don't understand what.

geoand commented 5 months ago

Weird... I've never seen that happen and it works fine for me (and CI)

philippart-s commented 5 months ago

I'll check again all my configuration to see if I have an issue on my local configuration.

geoand commented 5 months ago

Would you like to push what you have so I can check it out locally?

philippart-s commented 5 months ago

after a fresh build it's working 🎉 I'm going to push my current code but I think it's not perfect (I haven't dev the number of tokens yet).

philippart-s commented 5 months ago

@geoand here is the first version with streaming mode. Sorry but at the end there is an error:

 ERROR [org.jbo.res.rea.ser.han.PublisherResponseHandler] (vert.x-eventloop-thread-0) Exception in SSE server handling, impossible to send it to client: org.jboss.resteasy.reactive.ClientWebApplicationException: HTTP 200 OK
        at io.quarkus.rest.client.reactive.jackson.runtime.serialisers.ClientJacksonMessageBodyReader.readFrom(ClientJacksonMessageBodyReader.java:57)
        at org.jboss.resteasy.reactive.client.impl.ClientReaderInterceptorContextImpl.proceed(ClientReaderInterceptorContextImpl.java:86)
        at org.jboss.resteasy.reactive.client.impl.ClientSerialisers.invokeClientReader(ClientSerialisers.java:160)
        at org.jboss.resteasy.reactive.client.impl.RestClientRequestContext.readEntity(RestClientRequestContext.java:208)
        at org.jboss.resteasy.reactive.client.impl.MultiInvoker$3.handle(MultiInvoker.java:321)
        at org.jboss.resteasy.reactive.client.impl.MultiInvoker$3.handle(MultiInvoker.java:288)
        at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:276)
        at io.vertx.core.http.impl.HttpEventHandler.handleChunk(HttpEventHandler.java:51)
        at io.vertx.core.http.impl.HttpClientResponseImpl.handleChunk(HttpClientResponseImpl.java:239)
        at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.lambda$new$0(Http1xClientConnection.java:426)
        at io.vertx.core.streams.impl.InboundBuffer.handleEvent(InboundBuffer.java:255)
        at io.vertx.core.streams.impl.InboundBuffer.write(InboundBuffer.java:134)
        at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.handleChunk(Http1xClientConnection.java:701)
        at io.vertx.core.impl.ContextImpl.execute(ContextImpl.java:320)
        at io.vertx.core.impl.DuplicatedContext.execute(DuplicatedContext.java:171)
        at io.vertx.core.http.impl.Http1xClientConnection.handleResponseChunk(Http1xClientConnection.java:889)
        at io.vertx.core.http.impl.Http1xClientConnection.handleHttpMessage(Http1xClientConnection.java:808)
        at io.vertx.core.http.impl.Http1xClientConnection.handleMessage(Http1xClientConnection.java:775)
        at io.vertx.core.net.impl.ConnectionBase.read(ConnectionBase.java:159)
        at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:153)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
        at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input: expected close marker for Object (start marker at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 1, column: 1])
 at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 1, column: 93]
        at com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:699)
        at com.fasterxml.jackson.core.base.ParserBase._handleEOF(ParserBase.java:514)
        at com.fasterxml.jackson.core.base.ParserBase._eofAsNextChar(ParserBase.java:531)
        at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipWSOrEnd(UTF8StreamJsonParser.java:3107)
        at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:760)
        at com.fasterxml.jackson.databind.deser.BuilderBasedDeserializer.vanillaDeserialize(BuilderBasedDeserializer.java:286)
        at com.fasterxml.jackson.databind.deser.BuilderBasedDeserializer.deserialize(BuilderBasedDeserializer.java:217)
        at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:342)
        at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:2125)
        at com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1501)
        at io.quarkus.rest.client.reactive.jackson.runtime.serialisers.ClientJacksonMessageBodyReader.readFrom(ClientJacksonMessageBodyReader.java:53)
        ... 42 more

I think I've reached the limit of my knowledge on how to call Ollama and how to integrate the streaming function into this extension. If you have some time to help me finish the development, I'd be grateful.

geoand commented 5 months ago

Thanks a lot @philippart-s!

I will take your code and finish the implementation soon.

philippart-s commented 5 months ago

Thanks, don't hesitate to ping me, I'll see what you fix in my code and be more autonomous the next time. Sorry to not be able to do all the code alone.

geoand commented 5 months ago

No need to apologize!

Thanks a lot for getting the ball rolling on this one!

I will ping you when I've completed the PR :)

geoand commented 5 months ago

The problem with the exception you are seeing turns out to be weirder than I thought...

All we can do for the time being is workaround it as I have done.

Do you mind checking if things work for you with the latest version (I've force pushed to your branch so you'll have to be careful when pulling)?

philippart-s commented 5 months ago

yes, I'm going to try it as soon as I have a bit of time in my day 😉

philippart-s commented 5 months ago

@geoand I tested with my app and it seems ok 👌. Thanks to completed the code to made it correct!

Just for my information, for the next PR, what type of configuration for the code formatter is used? (to avoid the maven verify error on the CI 😅)

geoand commented 5 months ago

Thanks for checking!

All I do is mvn install -f ollama and the build does the formatting automatically

adriens commented 5 months ago

:clap: