I tried integrating Ollama into GPT-Subtrans this weekend.
It launches ollama in a subprocess and makes a series of requests to translate subtitles in batches. The first request works, but it hangs on the second request.
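For reference, the request pattern is roughly the following sketch. It talks to Ollama's HTTP `/api/chat` endpoint directly with `stream: False`; the model name, wait time, and `main()` driver are placeholders, and a request timeout is set so the hang surfaces as an error instead of blocking forever:

```python
import json
import subprocess
import time
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port


def build_chat_payload(model, user_text):
    # Non-streaming request body for Ollama's /api/chat endpoint
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,
    }


def chat(payload, timeout=120):
    # POST the payload and decode the JSON response; the timeout turns
    # a server-side hang into an exception rather than an indefinite wait
    req = urllib.request.Request(
        OLLAMA_URL + "/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())


def main():
    # Launch the server in a subprocess, roughly as GPT-Subtrans does,
    # then issue two sequential requests; the second is where it hangs
    server = subprocess.Popen(["ollama", "serve"])
    time.sleep(2)  # crude wait for the server to come up
    try:
        for i, line in enumerate(["First batch", "Second batch"]):
            reply = chat(build_chat_payload("llama3", line))
            print(i, reply["message"]["content"][:80])
    finally:
        server.terminate()
```

Calling `main()` against a locally installed `ollama` binary reproduces the same first-request-succeeds, second-request-hangs behaviour.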
```
DEBUG - send_request_headers.started request=<Request [b'POST']>
DEBUG - send_request_headers.complete
DEBUG - send_request_body.started request=<Request [b'POST']>
DEBUG - send_request_body.complete
DEBUG - receive_response_headers.started request=<Request [b'POST']>
... hangs here, never receives response
```
I put together a test script which shows the same behaviour: https://github.com/machinewrapped/Bugs/blob/master/OllamaTest/ollama_test.py
Is there something I need to do between requests to prepare the server? It doesn't seem to matter whether I use `generate` or `chat`; the result is the same.