Constellation-Labs / constellation

:milky_way::satellite: Decentralized Application Integration Platform
Apache License 2.0

Node cannot rejoin, even if seen as offline by other nodes #1453

Open tbekas opened 3 years ago

tbekas commented 3 years ago

This can be reproduced with DownloadSpec if we remove the following clause:

eventually(Timeout(1 hour)) {
  getClusterNodesState shouldBe empty
}

The test then fails occasionally (~10% of the time) because a node gets a 401 from some of its peers.
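
For context, the wait that the clause performs can be sketched roughly as a polling loop. This is only an illustration under assumptions, not the project's code: the `WaitForEmptyCluster` object, the `awaitEmpty` helper, and the stubbed `getClusterNodesState` (which here just drops one peer per call) are hypothetical stand-ins, and the real test relies on ScalaTest's `eventually`.

```scala
import scala.annotation.tailrec
import scala.concurrent.duration._

object WaitForEmptyCluster {
  // Hypothetical stand-in for the test's getClusterNodesState:
  // each call reports the current peers, then forgets one of them.
  private var remaining = List("peer-1", "peer-2", "peer-3")

  def getClusterNodesState: List[String] = {
    val snapshot = remaining
    remaining = remaining.drop(1)
    snapshot
  }

  // Polls `probe` every `interval` until it returns an empty list
  // or `timeout` elapses. Returns true if the list became empty in time.
  def awaitEmpty(
      probe: () => List[String],
      timeout: FiniteDuration,
      interval: FiniteDuration
  ): Boolean = {
    val deadline = timeout.fromNow

    @tailrec
    def loop(): Boolean =
      if (probe().isEmpty) true
      else if (deadline.isOverdue()) false
      else {
        Thread.sleep(interval.toMillis)
        loop()
      }

    loop()
  }

  def main(args: Array[String]): Unit = {
    // Only attempt the rejoin once every peer has dropped the node.
    val empty = awaitEmpty(() => getClusterNodesState, 5.seconds, 10.millis)
    println(s"cluster state empty before rejoin: $empty")
  }
}
```

Without this kind of wait, the rejoin can start while some peers still list the node, which is the window in which the 401 shows up.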

marcinwadon commented 3 years ago

What is the default timeout there? I remember testing this functionality, and we are currently using it on mainnet, so I'm pretty sure it works correctly. Does the code always wait until getClusterNodesState returns an empty list? You should see 401 Unauthorized only if the node didn't fully leave the cluster or cannot join again for some reason.

tbekas commented 3 years ago

Maybe there's a faster way of doing this than waiting for an empty list. Either way, it would still be a workaround, so I would suggest spending the time on fixing the bug rather than on finding a better workaround.