Closed adriankirchner closed 4 years ago
Damn, that's what a bug report should look like! Thank you, we will investigate!
Should be fixed with #282. Can you please try again with the latest dev branch, @adriankirchner?
If I recall correctly I've never experienced this problem with 0.3.0 running for weeks now. I will report if I see this behaviour in the upcoming 0.4.0.
Thanks!
Describe the bug
This bug was observed today on 4 hornet instances. All instances are neighbors to each other. Furthermore, all instances were neighbors to tcp://auto01.manapotion.io:15601 and tcp://auto02.manapotion.io:15602. After more than 11 hours of regular operation (monitoring shows a constant neighbor count of 5 on all instances), the manapotion instances experienced some congestion, resulting in hornet constantly trying to reconnect. After around 38 minutes hornet gave up and never tried to reconnect again. This may already be a bug; I'm not familiar enough with the hornet internals to judge. After 7 hours of missing these two neighbors I attempted to add tcp://auto01.manapotion.io:15601 again:
The neighbor wasn't added, so I checked the journal:
I suspected stale data somewhere in hornet, so I tried to remove that neighbor first:
The neighbor wasn't removed and the journal has no entry for that operation. I also tried different URI notations:
tcp://159.69.9.6:15601
159.69.9.6:15601
auto01.manapotion.io:15601
Furthermore, the getNeighbors API call doesn't list these two manapotion instances.
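For reference, the three notations tried above all name the same endpoint once a default scheme is assumed. The following is a minimal sketch of such a normalization; `normalize_neighbor` and its `tcp` default are hypothetical and are not taken from hornet's code.

```python
from typing import Tuple
from urllib.parse import urlsplit


def normalize_neighbor(uri: str, default_scheme: str = "tcp") -> Tuple[str, str, int]:
    """Return (scheme, host, port) for a neighbor address.

    Hypothetical helper: accepts addresses with or without a scheme
    and assumes `default_scheme` when none is given.
    """
    if "://" not in uri:
        uri = f"{default_scheme}://{uri}"
    parts = urlsplit(uri)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"neighbor address needs a host and a port: {uri!r}")
    return parts.scheme, parts.hostname, parts.port


if __name__ == "__main__":
    for notation in (
        "tcp://159.69.9.6:15601",
        "159.69.9.6:15601",
        "auto01.manapotion.io:15601",
    ):
        print(notation, "->", normalize_neighbor(notation))
```

The first two notations normalize to the same tuple; the third differs only in using the hostname instead of the IP address.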
Both hostnames auto01.manapotion.io and auto02.manapotion.io resolve to the IP address 159.69.9.6 and differ only in port. I could imagine a bug in origin address parsing or identity building. But maybe this is just a race condition in pool handling.
To Reproduce
See bug description
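To illustrate the identity-building guess from the description: if a neighbor's identity were derived from the resolved IP address alone, the two manapotion entries would collide, whereas an identity built from IP and port keeps them apart. This is only a sketch of that hypothesis. The `RESOLVED` table stands in for DNS, and `identity_ip_only`, `identity_ip_port` and `register` are made-up names; none of this is known to reflect how hornet actually builds identities.

```python
from typing import Callable, Dict, List, Tuple

# Stand-in for DNS resolution (assumption); the values come from the report.
RESOLVED: Dict[str, str] = {
    "auto01.manapotion.io": "159.69.9.6",
    "auto02.manapotion.io": "159.69.9.6",
}

Neighbor = Tuple[str, int]  # (host, port)


def identity_ip_only(host: str, port: int) -> str:
    """Hypothetical identity that ignores the port."""
    return RESOLVED.get(host, host)


def identity_ip_port(host: str, port: int) -> str:
    """Hypothetical identity that includes the port."""
    return f"{RESOLVED.get(host, host)}:{port}"


def register(neighbors: List[Neighbor],
             identity_fn: Callable[[str, int], str]) -> Dict[str, Neighbor]:
    """Store neighbors keyed by identity; a later entry overwrites an equal key."""
    table: Dict[str, Neighbor] = {}
    for host, port in neighbors:
        table[identity_fn(host, port)] = (host, port)
    return table


NEIGHBORS: List[Neighbor] = [
    ("auto01.manapotion.io", 15601),
    ("auto02.manapotion.io", 15602),
]

if __name__ == "__main__":
    print(register(NEIGHBORS, identity_ip_only))  # one entry: the second overwrote the first
    print(register(NEIGHBORS, identity_ip_port))  # two entries
```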
Expected behavior
First, I would expect hornet to try to reconnect after a certain grace period. Second, I would expect to be able to remove and add the neighbor in question again.
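The kind of reconnect behavior meant here could look like a capped backoff that never gives up entirely. The sketch below only illustrates that expectation; `reconnect_delays` and its 30 second base, factor of 2 and 600 second cap are assumed values, not hornet settings.

```python
from itertools import islice
from typing import Iterator


def reconnect_delays(base: float = 30.0,
                     factor: float = 2.0,
                     cap: float = 600.0) -> Iterator[float]:
    """Yield wait times in seconds between reconnect attempts, forever.

    The delay grows by `factor` after each attempt until it reaches `cap`,
    then stays at `cap`, so a lost neighbor keeps being retried.
    All parameter values are illustrative assumptions.
    """
    delay = base
    while True:
        yield delay
        delay = min(delay * factor, cap)


if __name__ == "__main__":
    # First seven waits: 30, 60, 120, 240, 480, 600, 600 seconds.
    print(list(islice(reconnect_delays(), 7)))
```

With such a schedule a neighbor that comes back after hours of congestion would be picked up again within one cap interval.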
Environment information:
Additional context
Here is the journal excerpt (congestion starts at 14:48 and reconnecting stops at 15:25), the config.json and the neighbors.json:
journalctl --since="2020-01-06 03:38:50" -o short-precise -u hornet | grep -E "(manapotion.io|159.69.9.6)"
config.json
neighbors.json