Open botanegg opened 1 year ago
After a lot of testing of the PR https://github.com/mumble-voip/libmumble/pull/8
I found steps to reproduce (do everything on the same machine):
1) run a libmumble-based client and connect to the server
2) open mumble-client
3) open the server list and add the same server to favorites
4) close the server list
5) open the server list
6) repeat 4 and 5 over and over again (on my machine 5-6 times is enough)
7) the libmumble-based client returns the shutdown code at this line: https://github.com/mumble-voip/libmumble/blob/master/src/TLS.cpp#L188
P.S. Sometimes you need to connect to the server once via mumble-client before starting the open-close loop.
I think this is another bug, separate from the EAGAIN handling.
Also:
- High ping to the server reduces reproducibility
- Setting threads to 1 in code (client.startTCP(peerFeedback(), 1);) reduces reproducibility
Thank you very much for the detailed report!
I would expect OpenSSL to handle EAGAIN, but according to your findings that doesn't seem to be the case.
If your pull request effectively fixes the issue, I'm going to merge it right away.
PR #8 only reduces reproducibility; it does not completely solve the problem.
I think it can be merged, but it does not solve all the problems in this place.
OK
The next point is thread safety.
It seems the SSL objects can be used from multiple threads without any mutex.
When they were used this way we got "Requesting crypt-nonce resync" in the server logs, and then the connection was dropped.
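To illustrate the thread-safety point: OpenSSL does not allow a single SSL object to be used by several threads at the same time, so access has to be serialized by the caller. The following is only a minimal sketch of one way to do that, not code from libmumble. The `Locked` template and its `with` method are hypothetical names introduced here; in real code the guarded object would be whatever owns the `SSL *`, and every read, write, or shutdown call on it would go through `with`.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Hypothetical helper (not part of libmumble or OpenSSL): owns an object and
// only hands it out while an internal mutex is held, so two threads can never
// touch it at the same time.
template <typename Conn>
class Locked {
public:
    explicit Locked(Conn conn) : m_conn(std::move(conn)) {}

    // Runs fn(conn) under the lock and returns whatever fn returns.
    template <typename Fn>
    auto with(Fn &&fn) -> decltype(fn(std::declval<Conn &>())) {
        std::lock_guard<std::mutex> lock(m_mutex);
        return fn(m_conn);
    }

private:
    std::mutex m_mutex;
    Conn m_conn;
};
```

One caveat with this design: a call that blocks while holding the lock (for example a blocking read) also blocks every other thread that wants to write, so it fits non-blocking sockets better.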
Now that both #8 and #9 were merged, are there any issues left (that you discovered)?
I've tested the library with a mumble-server based on mumblevoip/mumble-server:v1.4.287
After connecting (and sending Ping messages to the server) I got an SSL_ERROR_SYSCALL in TLS.cpp, then got Shutdown and the TCP connection was closed. You need to stay connected for a long time (from 10 minutes to 10 hours) to reproduce: https://github.com/mumble-voip/libmumble/blob/master/src/TLS.cpp#L184-L190
top of stacktrace:
I checked: ERR_get_error() is 0 and errno is 11 inside the SSL_ERROR_SYSCALL case.
After my little patch I have no random disconnects anymore.
But I'm not sure this is enough, or that this is idiomatic code for handling SSL problems.
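For readers following along: errno 11 is EAGAIN on Linux, so the combination reported here (SSL_ERROR_SYSCALL, ERR_get_error() equal to 0, errno equal to EAGAIN) looks like a "try again later" condition rather than a real shutdown. The sketch below shows that decision in isolation. It is not the patch mentioned above and not libmumble code; `SyscallAction` and `classifySyscallError` are hypothetical names, and the caller is assumed to pass in the value of ERR_get_error() and the errno it saved right after the failed SSL call.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical result type for this sketch (not a libmumble or OpenSSL name).
enum class SyscallAction { Retry, Close };

// Hypothetical helper: decides what to do once SSL_get_error() has reported
// SSL_ERROR_SYSCALL.
//   queuedError: value returned by ERR_get_error()
//   savedErrno:  errno captured immediately after the failed SSL call
inline SyscallAction classifySyscallError(unsigned long queuedError, int savedErrno) {
    // OpenSSL queued a concrete error: treat it as fatal in this sketch.
    if (queuedError != 0) {
        return SyscallAction::Close;
    }
    // Transient conditions on a non-blocking socket: retry the call later.
    if (savedErrno == EAGAIN || savedErrno == EWOULDBLOCK || savedErrno == EINTR) {
        return SyscallAction::Retry;
    }
    // Anything else (connection reset, unexpected EOF, ...) closes the connection.
    return SyscallAction::Close;
}
```

With the values from this report, `classifySyscallError(0, EAGAIN)` yields `SyscallAction::Retry`, so the connection would stay open and the read or write would be attempted again once the socket is ready.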
Related links from Stack Overflow:
https://stackoverflow.com/questions/13686398/ssl-read-failing-with-ssl-error-syscall-error
https://stackoverflow.com/questions/13554691/errno-11-resource-temporarily-unavailable