I tried integrating Ollama into GPT-Subtrans this weekend.
It launches ollama in a subprocess and makes a series of requests to translate subtitles in batches. The first request works, but it hangs on the second request.
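For reference, the request pattern is roughly the following sketch. It talks to Ollama's HTTP `/api/chat` endpoint directly with `stream: False`; the model name, wait time, and `main()` driver are placeholders, and a request timeout is set so the hang surfaces as an error instead of blocking forever:

```python
import json
import subprocess
import time
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port


def build_chat_payload(model, user_text):
    # Non-streaming request body for Ollama's /api/chat endpoint
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,
    }


def chat(payload, timeout=120):
    # POST the payload and decode the JSON response; the timeout turns
    # a server-side hang into an exception rather than an indefinite wait
    req = urllib.request.Request(
        OLLAMA_URL + "/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())


def main():
    # Launch the server in a subprocess, roughly as GPT-Subtrans does,
    # then issue two sequential requests; the second is where it hangs
    server = subprocess.Popen(["ollama", "serve"])
    time.sleep(2)  # crude wait for the server to come up
    try:
        for i, line in enumerate(["First batch", "Second batch"]):
            reply = chat(build_chat_payload("llama3", line))
            print(i, reply["message"]["content"][:80])
    finally:
        server.terminate()
```

Calling `main()` against a locally installed `ollama` binary reproduces the same first-request-succeeds, second-request-hangs behaviour.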
```
DEBUG - send_request_headers.started request=<Request [b'POST']>
DEBUG - send_request_headers.complete
DEBUG - send_request_body.started request=<Request [b'POST']>
DEBUG - send_request_body.complete
DEBUG - receive_response_headers.started request=<Request [b'POST']>
... hangs here, never receives response
```
I put together a test script which shows the same behaviour: https://github.com/machinewrapped/Bugs/blob/master/OllamaTest/ollama_test.py
Is there something I need to do between requests to prepare the server? It doesn't seem to matter whether I use `generate` or `chat`; the result is the same.