abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

fix: Avoid thread starvation on many concurrent requests by making use of asyncio to lock llama_proxy context #1798

Open gjpower opened 1 month ago

gjpower commented 1 month ago

Supersedes previous PR #1795.

The previous implementation acquires llama_proxy by blocking a worker thread on a lock, which can cause thread starvation under many parallel requests. It also prevents the call to `await run_in_threadpool(llama.create_chat_completion, **kwargs)` from proceeding, because all worker threads end up stuck waiting on the lock, so no progress can be made.
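
For illustration, a simplified, hypothetical sketch of that blocking pattern (the names `get_llama_proxy`, `_FakeLlama`, and `handle_request` are stand-ins for illustration, not the project's actual code): each request parks a threadpool worker on a `threading.Lock` while waiting for the shared model.

```python
# Hypothetical sketch of the blocking pattern described above.
import threading
from contextlib import ExitStack, contextmanager

from starlette.concurrency import run_in_threadpool


class _FakeLlama:
    """Stand-in for the shared model object (assumption for illustration)."""

    def create_chat_completion(self, **kwargs):
        return {"choices": []}


_shared_llama = _FakeLlama()
_llama_lock = threading.Lock()


@contextmanager
def get_llama_proxy():
    # Blocks the calling thread until the single shared model is free.
    with _llama_lock:
        yield _shared_llama


async def handle_request(**kwargs):
    exit_stack = ExitStack()
    # enter_context() blocks a threadpool worker while waiting for the lock.
    # With many parallel requests, every worker can end up parked here, so the
    # run_in_threadpool() call below never gets a worker to run on.
    llama = await run_in_threadpool(exit_stack.enter_context, get_llama_proxy())
    result = await run_in_threadpool(llama.create_chat_completion, **kwargs)
    # The stack is only closed on the happy path, mirroring the "improper
    # closing" issue addressed by this PR.
    exit_stack.close()
    return result
```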

This PR adapts the acquisition of llama_proxy to an async pattern that takes advantage of asyncio's synchronization primitives. ExitStack is replaced with AsyncExitStack, and the improper closing of the ExitStack is addressed.
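
A minimal sketch of that async approach, assuming an `asyncio.Lock` guards the shared proxy and an `AsyncExitStack` ensures cleanup even when the handler raises (again using illustrative stand-in names rather than the project's actual API):

```python
# Hypothetical sketch of the async acquisition pattern described above.
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

from starlette.concurrency import run_in_threadpool


class _FakeLlama:
    """Stand-in for the shared model object (assumption for illustration)."""

    def create_chat_completion(self, **kwargs):
        return {"choices": []}


_shared_llama = _FakeLlama()
_llama_lock = asyncio.Lock()


@asynccontextmanager
async def get_llama_proxy():
    # Awaiting the lock suspends this coroutine on the event loop instead of
    # parking a threadpool worker, so other requests keep making progress.
    async with _llama_lock:
        yield _shared_llama


async def handle_request(**kwargs):
    async with AsyncExitStack() as exit_stack:
        llama = await exit_stack.enter_async_context(get_llama_proxy())
        # Only the actual inference call occupies a threadpool worker; the
        # AsyncExitStack is closed on exit regardless of errors.
        return await run_in_threadpool(llama.create_chat_completion, **kwargs)
```

With this pattern, requests waiting for the model stay on the event loop rather than occupying threadpool workers, so the pool remains available for the inference calls themselves.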