Tribler / tribler

Privacy enhanced BitTorrent client with P2P content discovery
https://www.tribler.org
GNU General Public License v3.0

[7.13.3] Investigating reasons for CoreConnectTimeoutError #7956

Open kozlovsky opened 3 months ago

kozlovsky commented 3 months ago

CoreConnectTimeoutError is our most frequent error. According to Sentry, it has many different causes. We have already eliminated a number of them, but many remain. Eliminating, say, 90% of them would be a significant improvement. In this issue I want to list most of the possible causes; if necessary, we can then open separate sub-issues for some of them.

We can split CoreConnectTimeoutError cases into two categories: cases when Core continues to work until the timeout, and cases when Core completely "freezes".

We can remedy the problem in the first case by increasing the timeout. While it is preferable to connect with the GUI as quickly as possible, connecting with a delay is better than getting a CoreConnectTimeoutError. For that reason, I suggest increasing the timeout to a longer value, say, 240 seconds.
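To make the trade-off concrete, here is a minimal sketch of a connect loop with a configurable deadline. It is not Tribler's actual code: `wait_for_core`, `try_connect`, `CORE_CONNECT_TIMEOUT` and the local `CoreConnectTimeoutError` class are hypothetical names used only for illustration.

```python
import time

# Assumption: 240 s is the value suggested in this issue, not a current default.
CORE_CONNECT_TIMEOUT = 240.0


class CoreConnectTimeoutError(Exception):
    """Illustrative stand-in for the real exception class."""


def wait_for_core(try_connect, timeout=CORE_CONNECT_TIMEOUT, interval=1.0,
                  sleep=time.sleep):
    """Call try_connect() until it returns True or the deadline passes.

    Returns the number of attempts made, so a slow but successful start
    can still be reported instead of being treated as an error.
    """
    deadline = time.monotonic() + timeout
    attempts = 0
    while True:
        attempts += 1
        if try_connect():
            return attempts
        if time.monotonic() >= deadline:
            raise CoreConnectTimeoutError(
                f"Core did not respond within {timeout:.0f} seconds")
        sleep(interval)
```

With a longer deadline, a Core that is merely slow still connects, and the attempt count can be logged to see how often the old limit would have been exceeded.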

Cases when Core continues to work till the timeout

Cases when Core completely "freezes"


In my opinion, as a minimum change, we should significantly increase the timeout value and enable slow coroutine stack tracing by default for the binary build. Enabling stack tracing should slow down Core code a bit, but the slowdown is barely visible to users (the UI works at the same speed as before), and the traces are crucial to discovering the actual reason for freezes.
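As a rough illustration of what slow-step detection looks like, the sketch below uses asyncio's built-in debug mode, which logs a warning on the `asyncio` logger whenever a single step of a task blocks the event loop longer than `loop.slow_callback_duration`. This is a stand-in, not Tribler's own slow coroutine detection; `ListHandler`, `blocking_step` and `run_with_slow_detection` are hypothetical names.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects log messages in a list so they can be inspected later."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


async def blocking_step():
    # Simulates a coroutine that blocks the event loop instead of awaiting.
    time.sleep(0.3)


def run_with_slow_detection(coro_fn, threshold=0.1):
    """Run coro_fn() with asyncio debug mode on and return the warnings."""
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)

    async def main():
        # Steps that take longer than `threshold` seconds are reported.
        asyncio.get_running_loop().slow_callback_duration = threshold
        await coro_fn()

    try:
        asyncio.run(main(), debug=True)
    finally:
        logger.removeHandler(handler)
    return handler.messages


if __name__ == "__main__":
    for message in run_with_slow_detection(blocking_step):
        print(message)
```

Note that this built-in check reports a slow step only after it finishes, so it helps with the "slow Core" category; a Core that never returns from a step would need a separate watchdog thread to capture the stack.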