planetarium / libplanet

Blockchain in C#/.NET for on-chain, decentralized gaming
https://docs.libplanet.io/
GNU Lesser General Public License v2.1

Investigate `ConsumeBlockCandidates()` crashes #2816

Open longfin opened 1 year ago

longfin commented 1 year ago

Note: we already fixed the ConsumeBlockCandidates() crash in 0.46.1, but couldn't pinpoint the root cause, so I'm leaving this issue open for investigation and follow-up.

  1. In BlockCandidateDownload(), BlockCandidateTable.Add() is called with the blocks fetched by GetBlocksAsync(). https://github.com/planetarium/libplanet/blob/ed9ee092c883ba93365316ea520f7da624646932/Libplanet.Net/Swarm.BlockCandidate.cs#L383-L389

  2. GetBlocksAsync() simply returns an empty enumeration when it times out. https://github.com/planetarium/libplanet/blob/ed9ee092c883ba93365316ea520f7da624646932/Libplanet.Net/Swarm.cs#L819-L822

As far as I've checked, there were timeout exceptions in the logs of almost every node that crashed.
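
To make the suspected chain concrete, here is a minimal sketch of how a fetch that swallows its timeout could hand an empty batch to a candidate table, and how a guard at the add step would keep it out. This is an illustration of the hypothesis only, not a confirmed root cause. Every name in it (`CandidateSketch`, `FetchBlocksAsync`, `TryAddCandidate`, `ConsumeFirst`, and the use of `long` in place of a block type) is a hypothetical stand-in and not Libplanet's actual API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins for illustration; not Libplanet's real types.
public static class CandidateSketch
{
    // Mimics a fetch that swallows a timeout and returns an empty list
    // instead of surfacing the failure to the caller.
    public static async Task<List<long>> FetchBlocksAsync(TimeSpan work, TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            await Task.Delay(work, cts.Token); // simulated network round trip
            return new List<long> { 1, 2, 3 };
        }
        catch (OperationCanceledException)
        {
            return new List<long>(); // timed out: caller only sees "no blocks"
        }
    }

    // Guarded add: an empty batch is rejected instead of being enqueued.
    public static bool TryAddCandidate(Queue<List<long>> table, List<long> blocks)
    {
        if (blocks.Count == 0)
        {
            return false;
        }

        table.Enqueue(blocks);
        return true;
    }

    // A consumer that assumes every candidate is non-empty.
    // Enumerable.First() throws InvalidOperationException on an empty list.
    public static long ConsumeFirst(Queue<List<long>> table) => table.Dequeue().First();
}

public static class Program
{
    public static async Task Main()
    {
        var table = new Queue<List<long>>();

        // The simulated work takes far longer than the timeout, so the fetch times out.
        List<long> timedOut = await CandidateSketch.FetchBlocksAsync(
            TimeSpan.FromSeconds(5), TimeSpan.FromMilliseconds(50));
        Console.WriteLine($"blocks fetched: {timedOut.Count}");
        Console.WriteLine($"candidate added: {CandidateSketch.TryAddCandidate(table, timedOut)}");

        // A fetch that completes in time is added and consumed normally.
        List<long> ok = await CandidateSketch.FetchBlocksAsync(
            TimeSpan.Zero, TimeSpan.FromSeconds(5));
        CandidateSketch.TryAddCandidate(table, ok);
        Console.WriteLine($"first block consumed: {CandidateSketch.ConsumeFirst(table)}");
    }
}
```

Without the `Count == 0` check, the empty batch would be enqueued and `ConsumeFirst` would throw when it reached it. Whether the real crash followed this path is exactly what remains to be confirmed.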

longfin commented 1 year ago

In other logs from around the time the symptom occurred, I found that the NetMQRuntime had been blocked, causing a delay of several seconds. I'll cover that in a separate issue.

longfin commented 1 year ago

I opened #2817 to address blocking during hostname resolution.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.