ChatArena (or Chat Arena) is a library of multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.
Apache License 2.0
1.35k stars · 130 forks
OpenAI latency in gradio app (5+ seconds to generate a response) #102
I'm not sure where the problem lies, but I'm posting it here in case anyone else has the same issue or any idea how to fix it.
When testing locally, I was able to get a response in maybe 10-20 seconds, which is slow but at least acceptable.
When testing in Hugging Face Spaces, I found it didn't generate a response at all and gave me this information in the logs:
2023-11-22 23:23:14,237:INFO - HTTP Request: POST http://localhost:7860/reset "HTTP/1.1 200 OK"
2023-11-22 23:23:16,315:INFO - HTTP Request: POST http://localhost:7860/api/predict "HTTP/1.1 200 OK"
2023-11-22 23:23:45,772:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:24:21,570:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:24:52,416:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:25:30,880:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:26:14,191:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:26:49,628:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:26:49,629:WARNING - Agent Lex Fridman failed to generate a response. Error: 'Choice' object is not subscriptable. Sending signal to end the conversation.
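For context on that last warning: "'Choice' object is not subscriptable" is the error you typically get when dict-style indexing written for the pre-1.0 openai client is run against openai>=1.0, where the response is a typed object that only supports attribute access. A minimal sketch of the difference, assuming openai>=1.0 is installed in the Space (the helper and model name here are just for illustration, not ChatArena code):

```python
# Minimal sketch, assuming openai>=1.0 (the client that returns typed objects).
# extract_reply is a hypothetical helper, not part of ChatArena.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def extract_reply(messages, model="gpt-3.5-turbo"):
    response = client.chat.completions.create(model=model, messages=messages)
    # Pre-1.0 style indexing raises "'Choice' object is not subscriptable":
    #   response.choices[0]["message"]["content"]
    # With openai>=1.0, read the reply via attribute access instead:
    return response.choices[0].message.content


print(extract_reply([{"role": "user", "content": "Say hello in one sentence."}]))
```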
It may be due to the Hugging Face Space using a different version of Gradio; for some reason it runs the following (note that the version specified by our pyproject.toml is gradio 3.34):
--> RUN pip install --no-cache-dir gradio[oauth]==3.23.0 "uvicorn>=0.14.0" spaces==0.18.0 gradio_client==0.0.2
I have no idea how to change this part of the Hugging Face build step. It looks like a Dockerfile, but I don't see one anywhere in the repo, so maybe I'm missing something.
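One thing I can do to double-check which versions the deployed Space actually ends up with is to log them from the app itself at startup. A minimal sketch, assuming it sits near the top of the Space's entry point (e.g. app.py):

```python
# Minimal version check, assuming it runs at Space startup (e.g. top of app.py).
import logging

import gradio
import openai

logging.basicConfig(level=logging.INFO)
logging.info("gradio version: %s", gradio.__version__)
logging.info("openai version: %s", openai.__version__)
```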