abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Fix: add missing exit_stack.close() to end of /v1/completions endpoint #1795

Closed · gjpower closed this 1 month ago

gjpower commented 1 month ago

This was causing the llama_proxy lock to be retained until the exit_stack was cleaned up by the garbage collector.
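A minimal sketch of the failure mode and the fix: a request handler that enters a lock-guarded resource via an `ExitStack` but never calls `close()` leaves the lock held after the handler returns, because `ExitStack` runs its exit callbacks only on an explicit `close()` (or `with` exit), not on garbage collection. The names below (`llama_lock`, `handle_request`) are illustrative stand-ins, not the actual llama-cpp-python internals.

```python
import threading
from contextlib import ExitStack

# Stand-in for the llama proxy lock (illustrative, not the real internals).
llama_lock = threading.Lock()

def handle_request(close_stack: bool) -> bool:
    """Simulate the endpoint; return True if the lock is free afterwards."""
    exit_stack = ExitStack()
    exit_stack.enter_context(llama_lock)  # lock acquired for this request
    # ... generate the completion ...
    if close_stack:
        exit_stack.close()  # the fix: release resources deterministically
    # Without close(), the lock stays held after the handler returns.
    return not llama_lock.locked()
```

Calling `handle_request(close_stack=True)` returns `True` (lock released), while `close_stack=False` returns `False`, mirroring the retained-lock behavior the patch fixes.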

gjpower commented 1 month ago

#1798 supersedes this MR, as I found the thread starvation issue was not just due to improper closing of the exit_stack, but also due to the llama_proxy dependency pull locking all worker threads.
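The starvation described above can be sketched as follows: if every worker in a fixed-size pool blocks acquiring one shared lock inside a dependency, the whole pool stalls and no worker remains to run the request that would release it. This is an illustrative toy, not the server's actual dependency code; the timeout stands in for a worker that would otherwise block indefinitely.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A single shared "proxy" lock, already held by an in-flight request.
proxy_lock = threading.Lock()
proxy_lock.acquire()

def dependency_pull() -> bool:
    # Each request's dependency tries to take the lock, tying up its worker.
    # The timeout models "worker stuck"; real code would block forever.
    got = proxy_lock.acquire(timeout=0.1)
    if got:
        proxy_lock.release()
    return got

# A tiny pool standing in for the server's worker threads.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(dependency_pull) for _ in range(2)]
    results = [f.result() for f in futures]

# Every worker came back empty-handed: the entire pool was blocked on the
# lock, so nothing could make progress until the holder released it.
print(results)  # [False, False]
```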