spring-projects / spring-ai


Ollama: Duration metrics deserialized to the wrong time unit #1796

Open jorander opened 1 week ago

jorander commented 1 week ago

Bug description

Ollama reports duration values (total_duration, load_duration, prompt_eval_duration and eval_duration) in chat and embed responses as nanoseconds in JSON integer format. In Spring AI, the Jackson class DurationDeserializer from the jsr310 module is used to deserialize these values into Java Duration objects. In this process the integer value is interpreted as seconds instead of nanoseconds, making the Duration 10^9 times larger.

This is because the DurationDeserializer expects durations with nanosecond precision to be formatted as decimal values, with a decimal separator (dot) between the seconds part and the nanoseconds part. Durations formatted as plain integers are, depending on context settings, interpreted as seconds (the default) or milliseconds. Neither interpretation works for the durations reported by Ollama.
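The mismatch can be reproduced with Jackson alone. Below is a minimal sketch; the Timings record is only a stand-in for the duration fields of OllamaApi.ChatResponse (record and field names are made up), and it assumes jackson-databind and jackson-datatype-jsr310 on the classpath:

```java
import java.time.Duration;

import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;

public class OllamaDurationDemo {

    // Stand-in for the duration fields of OllamaApi.ChatResponse (names assumed).
    record Timings(@JsonProperty("total_duration") Duration totalDuration) {
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper().registerModule(new JavaTimeModule());

        // Ollama would report 1.5 s as the integer 1500000000 (nanoseconds).
        Timings timings = mapper.readValue("{\"total_duration\":1500000000}", Timings.class);

        // The jsr310 DurationDeserializer reads the bare integer as seconds,
        // so this prints PT416666H40M rather than PT1.5S.
        System.out.println(timings.totalDuration());
    }
}
```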

Environment

Spring AI: 1.0.0-M4
Java: 21
Ollama: 0.4.2

Steps to reproduce

Run a chat request against an Ollama server. Compare the durations reported by Ollama with the values found in the OllamaApi.ChatResponse and propagated to the map available at ChatClient.ChatResponse#getMetadata.

Expected behavior

Deserialization of Duration values takes into account that Ollama reports duration values as JSON integers in nanoseconds.
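One possible shape for such a fix, sketched below, is a dedicated Jackson deserializer that reads the bare integer as nanoseconds and is applied to the affected fields. This is only an illustration under that assumption, not necessarily what the connected PR does; the record and class names are made up, while the JSON property names come from the Ollama API:

```java
import java.io.IOException;
import java.time.Duration;

import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.DeserializationContext;
import com.fasterxml.jackson.databind.JsonDeserializer;
import com.fasterxml.jackson.databind.annotation.JsonDeserialize;

// Reads a bare JSON integer as nanoseconds, matching how Ollama reports durations.
class NanosecondDurationDeserializer extends JsonDeserializer<Duration> {

    @Override
    public Duration deserialize(JsonParser p, DeserializationContext ctxt) throws IOException {
        return Duration.ofNanos(p.getLongValue());
    }
}

// Hypothetical response fragment showing where the deserializer could be applied;
// only the JSON property names are taken from the Ollama API.
record OllamaTimings(
        @JsonProperty("total_duration")
        @JsonDeserialize(using = NanosecondDurationDeserializer.class)
        Duration totalDuration,

        @JsonProperty("eval_duration")
        @JsonDeserialize(using = NanosecondDurationDeserializer.class)
        Duration evalDuration) {
}
```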

Minimal Complete Reproducible example

None at the moment.

jorander commented 1 week ago

Modified the unit tests to reproduce the error in the connected PR.