svilupp / PromptingTools.jl

Streamline your life using PromptingTools.jl, the Julia package that simplifies interacting with large language models.
https://svilupp.github.io/PromptingTools.jl/dev/
MIT License

Ollama: repeated request with same prompt fails #51

Closed · asbisen closed 10 months ago

asbisen commented 10 months ago

Operating System: macOS 14.2.1 (M1 Max)
Julia: v1.10.0
PromptingTools: v0.7.0

The code below runs successfully the first time, but subsequent invocations with the same prompt fail with `KeyError: key :prompt_eval_count not found`. If the prompt is modified with new text, the code executes successfully again.

using PromptingTools: SystemMessage, UserMessage
using PromptingTools

const PT = PromptingTools
schema = PT.OllamaManagedSchema()

conversation = [
    SystemMessage("You're master Yoda from Star Wars trying to help the user become a Jedi."),
    UserMessage("I have feelings for my {{object}}. What should I do?")]

# `object` fills the {{object}} template placeholder at call time
msg = aigenerate(schema,
    conversation;
    object = "old iPhone",
    model = "mistral", api_kwargs = (; temperature = 0.1))

First run: successful. Second run fails with:

ERROR: KeyError: key :prompt_eval_count not found
Stacktrace:
 [1] getindex
   @ ./dict.jl:498 [inlined]
 [2] get(obj::JSON3.Object{Vector{UInt8}, Vector{UInt64}}, key::Symbol)
   @ JSON3 ~/.julia/packages/JSON3/jSAdy/src/JSON3.jl:87
 [3] getindex(obj::JSON3.Object{Vector{UInt8}, Vector{UInt64}}, key::Symbol)
   @ JSON3 ~/.julia/packages/JSON3/jSAdy/src/JSON3.jl:128
 [4] aigenerate(prompt_schema::PromptingTools.OllamaManagedSchema, prompt::Vector{…}; verbose::Bool, api_key::String, model::String, return_all::Bool, dry_run::Bool, conversation::Vector{…}, http_kwargs::@NamedTuple{}, api_kwargs::@NamedTuple{…}, kwargs::@Kwargs{…})
   @ PromptingTools ~/.julia/packages/PromptingTools/O8tph/src/llm_ollama_managed.jl:217
 [5] top-level scope
   @ ~/Desktop/code/mlnotes/code/prompting_tools/prompting_system.jl:11
Some type information was truncated. Use `show(err)` to see complete types.
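For context on the failure mode: a `JSON3.Object` behaves like a read-only dictionary, so indexing it with a key the server did not include in the response body throws exactly this `KeyError`. A minimal sketch of the mechanism (the JSON below is a hypothetical Ollama response missing the field, not captured output):

using JSON3

# Hypothetical response body without :prompt_eval_count
body = JSON3.read("""{"model":"mistral","response":"Do, or do not.","done":true}""")

body[:response]           # works: the key is present
body[:prompt_eval_count]  # throws KeyError, matching the stacktrace above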
svilupp commented 10 months ago

Thank you for catching this! I'm not sure when they introduced it, but it seems that Ollama drops some keys from the response when the prompt is cached.
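The defensive pattern for that is to read the usage counters with `get` and a default instead of direct indexing. A minimal sketch of the idea, assuming the field names from the error above (`extract_token_counts` is a hypothetical helper for illustration, not the actual patch in `llm_ollama_managed.jl`):

using JSON3

# Hypothetical helper: fall back to 0 when Ollama omits the usage
# fields, e.g. for a cached prompt.
function extract_token_counts(resp::JSON3.Object)
    prompt_tokens     = get(resp, :prompt_eval_count, 0)
    completion_tokens = get(resp, :eval_count, 0)
    return (; prompt_tokens, completion_tokens)
end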

It should be fixed now - please re-open if it persists!

EDIT: Once you confirm, I'll tag a new release.