Closed rjalexa closed 1 month ago
hi @rjalexa could you open a ticket against support with reproducible details like model / query? https://www.together.ai/contact
this doesn't seem like a library issue.
In alternative or additionally could the chat.completion function accept a timeout parameter ?
the client does accept a timeout parameter! https://github.com/togethercomputer/together-python/blob/c733d5bbc71ed22118c8a80d8a778292b0fca54e/src/together/client.py#L31
Sometimes unpredictably the python client hangs for long times without any clue. I think it might be because of rate limits.
Could the client immediately return with such error code if the rate limit has been hit so that we could implement throttling? Could it maybe also return a body with info such as time needed to wait or token limits being hit?
In alternative or additionally could the chat.completion function accept a timeout parameter ?