google-ai-edge / mediapipe-samples

Apache License 2.0
1.52k stars 398 forks

Do not use Sessions for LLM Inference app #456

Closed schmidt-sebastian closed 1 day ago

schmidt-sebastian commented 1 week ago

Sessions lead to crashes in append-only chat apps. We should use the non-Session API instead, as it creates a new implicit session for every message, thereby reducing the chance that we run out of tokens in the KV cache.
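
To illustrate the reasoning, here is a minimal toy model, not the MediaPipe API. `ToySession`, `messages_until_overflow`, the one-token-per-word estimate, and the 20-token budget are all hypothetical, and the sketch ignores the tokens consumed by model responses. It only shows why an append-only session eventually exceeds a fixed KV cache budget, while a fresh session per message does not.

```python
class ToySession:
    """Hypothetical stand-in for an inference session with a fixed KV cache budget."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used_tokens = 0

    def add_query(self, text: str) -> None:
        # Crude token estimate (assumption): one token per whitespace-separated word.
        cost = len(text.split())
        if self.used_tokens + cost > self.max_tokens:
            raise OverflowError("KV cache exhausted")
        self.used_tokens += cost


def messages_until_overflow(messages, max_tokens, reuse_session):
    """Return how many messages were accepted before the cache overflowed."""
    session = ToySession(max_tokens)
    accepted = 0
    for message in messages:
        if not reuse_session:
            # Mimics the non-Session path: a new implicit session per message.
            session = ToySession(max_tokens)
        try:
            session.add_query(message)
        except OverflowError:
            break
        accepted += 1
    return accepted


chat = ["tell me a short story"] * 10  # 5 words per message

# One long-lived session: tokens accumulate across messages.
print(messages_until_overflow(chat, max_tokens=20, reuse_session=True))   # 4

# Fresh session per message: each message starts with an empty cache.
print(messages_until_overflow(chat, max_tokens=20, reuse_session=False))  # 10
```

In this model the reused session accepts only four messages before overflowing, while the per-message variant accepts all ten. The trade-off is that a fresh session carries no context from earlier messages.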