lightninglabs / lightning-terminal

Lightning Terminal: Your Home for Lightning Liquidity
MIT License
488 stars 82 forks source link

multi: restart LND Client on setup failure #694

Closed ViktorTigerstrom closed 5 months ago

ViktorTigerstrom commented 6 months ago

This PR adds the functionality that in case we fail to setup the LND client(s) when starting litd, we will continuously retry to connect the LND clients until we succeed to do so. This partly addresses #683, but currently only restarts the LND clients when failing to setting them up, but not in case we fail to start lnd. The lit sub-server will only be set to status Running after the lnd clients have successfully been set up. This hopefully addresses most of the issues we've had from users that have experienced issues with the lnd sub-server erroring on the start up of litd, but likely not all of the errors.

A note to reviewers:

  1. Currently when failing to set up and therefore now retrying to create the lnd client(s), the litcli getinfo + the litcli stop commands are blocked and are not allowed, as we currently require macaroon authentication for those calls. Should I update this PR to allow those calls by adding them to the LiT Whitelist?
  2. To simplify the local testing of this PR, I've created a branch that very simply mocks 3 failures to set up the basic lnd client as well as 3 failures to set up the full lnd client. The full lnd client set up attempts will also fail after 30 seconds, as that's what users experience when their set up fails due to the lnrpc.GetInfo call context times out. You can use that branch to test this PR locally: https://github.com/ViktorTigerstrom/lightning-terminal/tree/2023-12-restart-lndclient-on-failure-mocked-lndclient-fails

Happy to address any feedback!

ViktorTigerstrom commented 6 months ago

Thanks a lot for the feedback and for reviewing this so quickly @ellemouton! I've addressed your feedback with the latest push.

ViktorTigerstrom commented 6 months ago

Thanks again for reviewing this PR so quickly @ellemouton :rocket::fire:!!! I've addressed your latest feedback with the latest push :)

lightninglabs-deploy commented 6 months ago

@ellemouton: review reminder @viktortigerstrom, remember to re-request review from reviewers when ready

ViktorTigerstrom commented 6 months ago

Thanks a lot for the review @ellemouton :tada:!!

I've addressed your feedback with the latest push!

Btw - think you need to update your lndclient test branch as the current redirect commit doesn't compile.

I've now updated the branch so that it builds and can be used for testing! One thing to note though, Is that I couldn't get it to work with lndclient v0.17.0-4 for some reason. There seems to be some conflict with taproot-assets that I just couldn't figure out how to solve within a reasonable time. So In the test branch I've downgraded to v0.17.0-3, which subsequently downgraded taproot-assets as well. But it can be used to test the functionality of this PR!

saw this getting logged: Setting the lit sub-server as errored: Error %d when creating LND Services client: %v - gotta make sure that the values are populated in the log.

Thanks! This was a pre-existing error that I've now addressed with 3b83d6e1022ce1ee5e99a929d43c88430dec1de6 :)

this is probably not related to this PR but I get this error for taproot-assets:

I hadn't seen this, but as I started downgrading the lndclient and therefore the taproot-assets version to make the test branch compile, and then upgraded again and such, I actually managed to trigger this as well! Did you do something similar before the error occurred for you as well? However, after I removed testnet folder (my configured netwok) in the taproot-assets Application data, and restarted litd again, the error disappeared. I suspect it's some migration thing in taproot-assets going on. But as my way of making the error get triggered is probably nothing a normal user will do, I'm not sure if this is something we should actually report. Though it would be really good to get @guggero's opinion on this as well!

ViktorTigerstrom commented 5 months ago

Thanks a lot for the review @guggero :tada::fire:!

Not exactly sure what this is referring to? Because the current version of the PR doesn't modify the lndclient version in the go.mod file, so it's currently at v0.17.0-4. What test branch of lndclient were you talking about?

Ah sorry, I should have been more clear what I was referring to. This is referring to the test branch to test the restart functionality locally, that I mentioned in the original description message of this PR above:
https://github.com/ViktorTigerstrom/lightning-terminal/tree/2023-12-restart-lndclient-on-failure-mocked-lndclient-fails

Yes, that's very likely it. If the version of taproot assets was changed between versions of PRs, a downgrade would explain the error. So all good I think.

Ok great, sounds good :rocket:!