ChatArena (or Chat Arena) is a library of multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.
Apache License 2.0
1.35k stars · 130 forks
OpenAI latency in gradio app (5+ seconds to generate a response) #102
I'm not sure where the problem lies, but I'm posting it here in case anyone else has the same issue or any idea how to fix it.
When testing locally, I was able to get a response in maybe 10-20 seconds, which is slow but at least acceptable.
When testing in Hugging Face Spaces, I found it didn't generate a response at all and gave me this information in the logs:
2023-11-22 23:23:14,237:INFO - HTTP Request: POST http://localhost:7860/reset "HTTP/1.1 200 OK"
2023-11-22 23:23:16,315:INFO - HTTP Request: POST http://localhost:7860/api/predict "HTTP/1.1 200 OK"
2023-11-22 23:23:45,772:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:24:21,570:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:24:52,416:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:25:30,880:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:26:14,191:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:26:49,628:INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-11-22 23:26:49,629:WARNING - Agent Lex Fridman failed to generate a response. Error: 'Choice' object is not subscriptable. Sending signal to end the conversation.
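For context on that last warning: "'Choice' object is not subscriptable" is the error you typically get when dict-style indexing written for the pre-1.0 openai client is run against openai>=1.0, where the response is a typed object that only supports attribute access. A minimal sketch of the difference, assuming openai>=1.0 is installed in the Space (the helper and model name here are just for illustration, not ChatArena code):

```python
# Minimal sketch, assuming openai>=1.0 (the client that returns typed objects).
# extract_reply is a hypothetical helper, not part of ChatArena.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def extract_reply(messages, model="gpt-3.5-turbo"):
    response = client.chat.completions.create(model=model, messages=messages)
    # Pre-1.0 style indexing raises "'Choice' object is not subscriptable":
    #   response.choices[0]["message"]["content"]
    # With openai>=1.0, read the reply via attribute access instead:
    return response.choices[0].message.content


print(extract_reply([{"role": "user", "content": "Say hello in one sentence."}]))
```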
It may be due to the Hugging Face Space using a different version of Gradio; for some reason it runs the following (note that the version specified by our pyproject.toml is gradio 3.34):
--> RUN pip install --no-cache-dir gradio[oauth]==3.23.0 "uvicorn>=0.14.0" spaces==0.18.0 gradio_client==0.0.2
I have no idea how to change this part of the Hugging Face build step. It looks like a Dockerfile, but I don't see one anywhere in the repo, so maybe I'm missing something.
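One thing I can do to double-check which versions the deployed Space actually ends up with is to log them from the app itself at startup. A minimal sketch, assuming it sits near the top of the Space's entry point (e.g. app.py):

```python
# Minimal version check, assuming it runs at Space startup (e.g. top of app.py).
import logging

import gradio
import openai

logging.basicConfig(level=logging.INFO)
logging.info("gradio version: %s", gradio.__version__)
logging.info("openai version: %s", openai.__version__)
```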