Open rkfg opened 1 week ago
[!IMPORTANT]
Review skipped
Auto reviews are limited to specific labels.
Labels to auto review (1)
* llm-reviewPlease check the settings in the CodeRabbit UI or the
.coderabbit.yaml
file in this repository. To trigger a single review, invoke the@coderabbitai review
command.You can disable this status message by setting the
reviews.review_status
tofalse
in the CodeRabbit configuration file.
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?
Approved CI run.
This is very interesting, some lock-up is triggered if this timer is set to a small value. Couldn't that be a sign of a bigger issue somewhere? If it's set to 10-60 minutes as it is by now then the tests work fine. If I set it to less than 10 minutes it hangs. Could be a race condition between the tests?
So the actual failing test is TestChannelLinkCancelFullCommitment
, the update there blocks it from progressing. If I change the initial 0 to time.Second
it starts running but since it takes a long time to settle all the HTLCs the fee update times out with failing link: unable to complete dance with error: remote unresponsive
(I enabled log output for this test to see what's wrong). Normally that shouldn't be a problem I suppose, if we ever need to settle ≈483 HTLCs and then need to update the fee and it fails because it took us longer than pending-commit-interval
to finish, the link would be dropped after that. Then we should reconnect and continue normally. So one solution would be to fix the test so that the update doesn't get in the way and break the link like this:
n.aliceChannelLink.updateFeeTimer.Reset(time.Minute * 10)
n.bobChannelLink.updateFeeTimer.Reset(time.Minute * 10)
as well as setting the original PR line to l.updateFeeTimer = time.NewTimer(time.Second * 1)
. In this case the test passes and doesn't hang indefinitely. I'll check the other unit tests, if there are too many that needs such a fix maybe it wasn't a good idea in the first place.
Many tests seem to rely on update fee never happening unless explicitly requested. It either breaks their expectations or the link. Since the tests usually complete in under 2 minutes and the default fee update interval is at least 10 minutes those tests have never broken before. Not sure how to fix it reliably, I'd like the fee update to happen in 1-10 seconds after the link is established but then too many tests break so I set it to 1 minute now.
Another way could be patching the mock links like this:
diff --git a/htlcswitch/test_utils.go b/htlcswitch/test_utils.go
index 9a72197ec..70e8e71f0 100644
--- a/htlcswitch/test_utils.go
+++ b/htlcswitch/test_utils.go
@@ -1185,7 +1185,7 @@ func (h *hopNetwork) createChannelLink(server, peer *mockServer,
}
}
}()
-
+ link.(*channelLink).updateFeeTimer.Reset(time.Minute * 10)
return link, nil
}
and then adding a separate test to check for this short interval working as expected. Please tell me what solution is better for the project.
Yes, I tried this before and the tests failed in the same way. Changing the timer period is better in this case because we can delay the fee update if needed. Currently the update period is 10-60 minutes, all tests are faster than that, no test accounts for a random fee update in the middle of the run (in production it can happen of course in a similar scenario). If we make a fee update soon after a channel becomes active, the test gets an unexpected update and misbehaves. Either many tests should be fixed (other issues could be discovered during that but idk tho) or we pretend nothing changed and test the short update period separately.
Change Description
See #8790, PR recreated for several reasons. The fee update timer is initially set to zero to trigger the fee update check right after the channel becomes active, the next update interval will be random (between 10 and 60 minutes) as usual.
Steps to Test
Have a low commitment fee, make the estimated fee high and reconnect the peer. The commitment fee should update.
Pull Request Checklist
Testing
Code Style and Documentation
[skip ci]
in the commit message for small changes.📝 Please see our Contribution Guidelines for further guidance.