Closed psalin closed 2 years ago
@psalin Thanks. I don't see you on Slack. If you are there, can you ping me? I haven't heard back from Kenneth and was hoping we could talk. Or send me an email.
The fix idea in #91 is based on setting a timeout on the gen_statem:call. Not that pretty a solution, but it seems able to prevent this issue.
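A minimal sketch of the timeout idea, not the actual change in #91. It assumes the subchannel is a gen_statem that handles a `conn` call; the module name `conn_timeout_sketch`, the `fake_conn` reply, and the sleep that stands in for a blocking connect attempt are all hypothetical.

```erlang
-module(conn_timeout_sketch).
-behaviour(gen_statem).

-export([start_link/1, conn/2]).
-export([init/1, callback_mode/0, handle_event/4]).

%% Start a stand-in "subchannel" whose connect attempt blocks for ConnectMs.
start_link(ConnectMs) ->
    gen_statem:start_link(?MODULE, ConnectMs, []).

%% Bounded call: give up after TimeoutMs instead of waiting for the
%% connect attempt to finish. gen_statem:call/3 exits with
%% {timeout, _} when no reply arrives in time.
conn(Pid, TimeoutMs) ->
    try
        gen_statem:call(Pid, conn, TimeoutMs)
    catch
        exit:{timeout, _} -> {error, timeout}
    end.

callback_mode() -> handle_event_function.

init(ConnectMs) ->
    {ok, disconnected, ConnectMs}.

%% Already connected: reply immediately.
handle_event({call, From}, conn, connected, _ConnectMs) ->
    {keep_state_and_data, [{reply, From, {ok, fake_conn}}]};
%% Disconnected: simulate a blocking connect, then reply.
handle_event({call, From}, conn, disconnected, ConnectMs) ->
    timer:sleep(ConnectMs),
    {next_state, connected, ConnectMs, [{reply, From, {ok, fake_conn}}]}.
```

With this, `conn_timeout_sketch:conn(Pid, 50)` against a process started with `start_link(500)` returns `{error, timeout}` instead of waiting the full 500 ms. Note that the timeout only bounds how long the caller waits: the state machine itself stays busy with the connect attempt, so this frees callers to try something else but does not shorten the attempt.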
For unary requests, when the HTTP/2 connection disconnects, the next request makes grpcbox_subchannel:conn() try to reconnect. Meanwhile, all other requests sent to the same subchannel queue up behind that grpcbox_subchannel:conn() call, which blocks until the connection succeeds or connect_timeout expires. If the HTTP/2 connect keeps timing out, the queued requests can take a very long time to return.
In our case we would have alternative channels that could have been used if requests always timed out within a specified time. Any ideas on how requests getting stuck in this situation could be prevented?