epam / ai-dial-core

The main component of AI DIAL, which provides a unified API to different chat completion and embedding models, assistants, and applications
https://epam-rail.com
Apache License 2.0
56 stars 16 forks

Errors when trying to run Core locally from compose #237

Closed astsiapanay closed 6 months ago

astsiapanay commented 6 months ago

Hi ai-dial-core,

I'm trying to run Dial locally using docker compose from https://docs.epam-rail.com/quick-start/

After some tweaks, I was able to use GPT-4 by configuring dev-dial-core.staging as the gpt-4 upstream (it works in Chat UI). I was also able to connect a local instance of Dial RAG to Chat UI.

But when Dial RAG tries to request the GPT-4 model, I get an error:

core-1            | 2024-02-23 17:12:30.175 ERROR [vert.x-eventloop-thread-1] Can't handle request
core-1            | com.fasterxml.jackson.databind.exc.MismatchedInputException: No content to map due to end-of-input
core-1            |  at [Source: (String)""; line: 1, column: 0]
core-1            |     at com.fasterxml.jackson.databind.exc.MismatchedInputException.from(MismatchedInputException.java:59)
core-1            |     at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:4916)
core-1            |     at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4818)
core-1            |     at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3772)
core-1            |     at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3740)
core-1            |     at com.epam.aidial.core.limiter.RateLimiter.checkLimit(RateLimiter.java:94)
core-1            |     at com.epam.aidial.core.limiter.RateLimiter.lambda$limit$1(RateLimiter.java:80)
core-1            |     at io.vertx.core.impl.ContextBase.lambda$executeBlocking$0(ContextBase.java:167)
core-1            |     at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:277)
core-1            |     at io.vertx.core.impl.ContextBase.lambda$internalExecuteBlocking$2(ContextBase.java:199)
core-1            |     at io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)
core-1            |     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
core-1            |     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
core-1            |     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
core-1            |     at java.lang.Thread.run(Thread.java:840)
astsiapanay commented 6 months ago

I could reproduce the issue:

  1. docker-compose up
  2. create conversation
  3. ask question to gpt-4
  4. wait for reply from the model
  5. docker-compose stop immediately
  6. docker-compose start
  7. try to regenerate response.
  8. The same error occurred.

The reason is that Core updates limits by writing the data to Redis first, and a background thread later syncs it to the blob store. Initially, an empty string is written to the blob store just to make the object available for listing.

The sync interval for writing Redis data to the blob store is 1 minute by default.

So when Core is stopped before the sync fires, the data is lost: the blob store still holds only the empty placeholder, and parsing it after restart fails with the error above.
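The timing described above is essentially a write-behind cache: writes land in Redis immediately, and a scheduled task flushes them to the blob store on a fixed interval (1 minute by default). A simplified, self-contained sketch under stated assumptions: the two maps and all names below are stand-ins, not Core's actual classes, and the interval is shortened so the sketch terminates quickly:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WriteBehindSketch {
    // Stand-ins for the two stores (Core uses real Redis and a blob store).
    static final Map<String, String> redis = new ConcurrentHashMap<>();
    static final Map<String, String> blobStore = new ConcurrentHashMap<>();

    static void recordUsage(String key, String json) {
        redis.put(key, json);           // fast path: Redis is updated first
        blobStore.putIfAbsent(key, ""); // empty placeholder so the object is listable
    }

    static void syncToBlobStore() {
        blobStore.putAll(redis);        // background flush (every minute in Core)
    }

    public static void main(String[] args) {
        ScheduledExecutorService sync = Executors.newSingleThreadScheduledExecutor();
        sync.scheduleAtFixedRate(WriteBehindSketch::syncToBlobStore,
                200, 200, TimeUnit.MILLISECONDS);

        recordUsage("limits/user-1", "{\"used\":42}");
        // Simulate `docker-compose stop` landing before the first flush fires:
        sync.shutdownNow();

        // The blob store still holds only the empty placeholder, which later
        // fails to parse on the request path after restart.
        System.out.println("blob contents: '" + blobStore.get("limits/user-1") + "'");
    }
}
```

The repro steps hit exactly this window: a stop between `recordUsage` and the next `syncToBlobStore` leaves the empty placeholder as the persisted state.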

That has never happened on staging or any other environment yet, but it could.