Closed: erhant closed this issue 4 months ago.
Hmm, this may actually be a bug somewhere. I am looking at the logs of Ollama (I'm running on a Mac, with `ollama serve`) and apparently sometimes the prompt token count is shown to be 0. When that happens, I get `None` for the final data.
Here is an example log with a filled `final_data`, where we can see the prompt had 98 tokens:
{"function":"print_timings","level":"INFO","line":276,"msg":"prompt eval time = 369.77 ms / 98 tokens ( 3.77 ms per token, 265.03 tokens per second)","n_prompt_tokens_processed":98,"n_tokens_second":265.02602934216753,"slot_id":0,"t_prompt_processing":369.775,"t_token":3.7732142857142854,"task_id":3,"tid":"0x1f699e500","timestamp":1715174643}
Here is another log for the same prompt, same settings:
{"function":"print_timings","level":"INFO","line":276,"msg":"prompt eval time = 206.29 ms / 0 tokens ( inf ms per token, 0.00 tokens per second)","n_prompt_tokens_processed":0,"n_tokens_second":0.0,"slot_id":0,"t_prompt_processing":206.291,"t_token":null,"task_id":65,"tid":"0x1f699e500","timestamp":1715174941}
Now we get 0 tokens, and therefore `inf` appears for "ms per token". When this happens, `final_data` is `None`.
This may be an issue in both Ollama-rs and Ollama itself. When I'm playing around with the API directly, not all fields of `final_data` are present. For example:
```sh
❯ curl http://localhost:11434/api/generate -d '{
  "model": "orca-mini",
  "stream": false,
  "prompt": "Why is the sky blue?"
}'
```
This results in:
{"model":"orca-mini","created_at":"2024-05-08T13:49:51.345652Z","response":" The sky appears blue because of a phenomenon called Rayleigh scattering. When sunlight enters the Earth's atmosphere, it interacts with gas molecules such as nitrogen and oxygen. These molecules scatter the light in all directions, but they scatter more easily towards the longer wavelengths of light (violet and blue) than towards the shorter wavelengths (red and orange). As a result, more violet and blue light is scattered towards our eyes, making the sky appear blue.","done":true,"context":[31822,13,8458,31922,3244,31871,13,3838,397,363,7421,8825,342,5243,10389,5164,828,31843,9530,362,988,362,365,473,31843,13,13,8458,31922,9779,31871,13,12056,322,266,7661,4842,31902,13,13,8458,31922,13166,31871,13,347,7661,4725,4842,906,287,260,12329,1676,6697,27554,27289,31843,1408,21062,16858,266,4556,31876,31829,7965,31844,357,8728,31829,351,3740,16450,859,362,22329,291,11944,31843,1872,16450,640,3304,266,1954,288,484,11468,31844,504,526,640,3304,541,4083,3395,266,3002,23893,31829,287,1954,352,31845,20036,291,4842,31861,661,3395,266,13830,23893,31829,352,443,291,13145,656,717,260,1190,31844,541,399,20036,291,4842,1954,322,18752,3395,611,3659,31844,1714,266,7661,2024,4842,31843],"total_duration":2946483458,"load_duration":2541000,"prompt_eval_duration":193435000,"eval_count":97,"eval_duration":2749070000}
Here, for instance, `prompt_eval_count` does not exist, but other fields do. Serde probably makes the whole thing `None`, since the object as a whole is not there to be deserialized and flattened.
A-ha, found it: https://github.com/ollama/ollama/issues/3427
Indeed, if the prompt is cached, Ollama will omit the field, causing the entire object to be `None` on this side.
I would like to propose that we write each field explicitly as an `Option`, instead of using `#[serde(flatten)]`.
That issue rings a bell; I think someone already created an issue for that, but we thought it was a bug on Ollama's side. We need to make this field optional then; I'll do that when I get home.
@pepperoni21 I made a quick fix in PR above^
I think I was the one who originally reported this issue here #27 (and also thought it was on Ollama's side as I didn't know about it caching prompts).
@erhant, thank you for the fix... I have read the commit, but I don't understand exactly how it solved the problem.
If you have a few minutes to spare, could you explain how removing serde's `flatten` and making each field an `Option` solved it?
Thanks!
Hi @dezoito, sure!
The thing is, when used with `flatten`, a single missing field in the to-be-flattened nested object causes deserialization of the entire nested object to fail. Below is a demonstration:
```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct Flattened {
    foo: u32,
    bar: Option<u32>,
    baz: Option<u32>,
}

#[derive(Debug, Serialize, Deserialize)]
struct Nested {
    foo: u32,
    #[serde(flatten)]
    barbaz: Option<NestedInner>,
}

#[derive(Debug, Serialize, Deserialize)]
struct NestedInner {
    bar: u32,
    baz: u32,
}

fn main() {
    const OBJ_NESTED: &str = r#"{"foo": 1, "bar": 2, "baz": 3}"#;
    let obj_nested: Nested = serde_json::from_str(OBJ_NESTED).unwrap();
    println!("{:?}", obj_nested);

    const OBJ_FLATTENED: &str = r#"{"foo": 1, "bar": 2, "baz": 3}"#;
    let obj_flattened: Flattened = serde_json::from_str(OBJ_FLATTENED).unwrap();
    println!("{:?}", obj_flattened);

    const OBJ_NESTED_MISSING: &str = r#"{"foo": 1, "bar": 2}"#;
    let obj_nested_missing: Nested = serde_json::from_str(OBJ_NESTED_MISSING).unwrap();
    println!("{:?}", obj_nested_missing);

    const OBJ_FLATTENED_MISSING: &str = r#"{"foo": 1, "bar": 2}"#;
    let obj_flattened_missing: Flattened = serde_json::from_str(OBJ_FLATTENED_MISSING).unwrap();
    println!("{:?}", obj_flattened_missing);
}
```
When we run it, we see the result:

```
Nested { foo: 1, barbaz: Some(NestedInner { bar: 2, baz: 3 }) }
Flattened { foo: 1, bar: Some(2), baz: Some(3) }
Nested { foo: 1, barbaz: None }
Flattened { foo: 1, bar: Some(2), baz: None }
```
So although `bar` existed in the 3rd example, since `baz` didn't exist, the entire `barbaz` failed to deserialize and we lost the `bar` data. This was the case with Ollama: it didn't return one of the fields in the final data, causing the entire thing to be lost.
Perhaps there is a better serde feature that handles this, instead of manually flattening the nested `Option` object into separate `Option` fields. That I don't know!
Thank you! Much appreciated.
Hi, I would like to be able to get the `GenerationFinalResponseData` in a 100% certain way, but it seems it is sometimes there and sometimes not. I'm using the `generate` function, so `stream` should be false anyway. It also appears that even if `done` is true, the final data may still be `None`.