Open viosay opened 3 weeks ago
@viosay, before jumping to conclusions, can you please share some context? What Spring AI version are you using? Which Ollama embedding model have you configured?
Sorry, my description was indeed lacking. I am using Spring AI version `1.0.0-M3`, and the embedding models I am using on Ollama, such as `shaw/dmeta-embedding-zh`, `893379029/piccolo-large-zh-v2`, and `viosay/conan-embedding-v1`, all have this issue, but it should not be related to the models.
Thank you for the update, @viosay. I see what you are trying to achieve, but I'm not convinced that using WebClient for a non-streaming endpoint is the right solution. Let me think about it.
When I make vectorization (embedding) requests using Ollama, they take a long time because of performance issues on the server where Ollama is hosted. Therefore, I used `CompletableFuture` to make the calls on asynchronous threads. However, in some cases I need to manually interrupt the thread to stop the embedding request to Ollama. Normally, the request to Ollama should be interrupted when the thread is interrupted. But things didn't go as expected, because the `embed` method of `OllamaApi` uses `RestClient`. After calling `future.cancel()`, the underlying I/O operation (such as an HTTP request) doesn't immediately respond to the interrupt signal, so the request continues even after the thread is interrupted.

I noticed that `OpenAiApi` uses `WebClient`, and I'm considering whether the `embed` method of `OllamaApi` could be enhanced to support requests using `WebClient`. I could use `Mono.fromCallable` to start an asynchronous task and manage the subscription with a `Disposable`. When I need to interrupt the task, I could call `disposable.dispose()` to cancel it. `WebClient` should be able to respond better to the cancellation, thus interrupting the request to the Ollama service.

This is just my perspective. I'm not sure if it's correct, but I will try to validate it.
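
One detail that may be relevant here, independent of `RestClient`: according to the JDK Javadoc, the `mayInterruptIfRunning` argument of `CompletableFuture.cancel` has no effect, so cancelling a `CompletableFuture` does not interrupt the thread running the task at all. A `Future` returned by `ExecutorService.submit` does deliver an interrupt. The sketch below uses only the JDK and compares the two; `slowBlockingCall` is a hypothetical stand-in for the embedding request (a plain `Thread.sleep`), not the actual `OllamaApi` call. Whether a real blocking HTTP call reacts to an interrupt depends on the underlying HTTP client, which this example does not cover.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

public class CancelDemo {

    // Hypothetical stand-in for a slow blocking call such as an embedding request.
    // Returns true if the calling thread was interrupted while blocked.
    public static boolean slowBlockingCall() {
        try {
            Thread.sleep(1000);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            return true;
        }
    }

    // Waits on a latch without forcing callers to handle a checked exception.
    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    // Starts the slow call on a worker thread, cancels it once it is running,
    // and reports whether the worker thread actually saw an interrupt.
    public static boolean workerInterrupted(boolean useCompletableFuture) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        Runnable task = () -> {
            started.countDown();
            interrupted.set(slowBlockingCall());
            finished.countDown();
        };

        Future<?> future = useCompletableFuture
                ? CompletableFuture.runAsync(task, pool)
                : pool.submit(task);

        await(started);       // make sure the task is running before cancelling
        future.cancel(true);  // request cancellation "with interrupt"
        await(finished);      // wait until the worker has really stopped
        pool.shutdown();
        return interrupted.get();
    }

    public static void main(String[] args) {
        System.out.println("CompletableFuture.cancel(true) interrupted worker: "
                + workerInterrupted(true));
        System.out.println("Future.cancel(true) interrupted worker: "
                + workerInterrupted(false));
    }
}
```

With `CompletableFuture`, the worker runs its full second and reports `false`; with the executor's `Future`, it is interrupted and reports `true`. If that matches what is happening in the application, submitting the embedding call through an `ExecutorService` may be worth trying before changing the HTTP client, although that alone does not guarantee the in-flight request is aborted.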