Closed nonfungible-human closed 6 months ago
Hi @nonfungible-human, thank you so much for the PR! We'll take a look at it and get back to you early next week with next steps. We're evaluating whether it makes sense to do a broader fix here :) It could make sense for us to loop you in on that if you're interested. We'll have more details to share early next week!
@levmi Okay. Earlier, I changed this PR to a draft because one of the ci tests fails as it is checking for the "incorrect password" message. The test that says it tests for an incorrect password really just returns an error if the node is not available. If the decision is made to move forward with this, that test would need to check for a different message. Also, in integrated mode a different message is returned when the node is locked.
So, we discussed this internally. First off, thank you so much for your desire and ability to contribute! We’re really excited to have an external contributor pushing on some of these things.
Our ideal path forward on this issue would be to include not only the changes that you’ve created here. But also, to take advantage of the new SubServer Status rpc to return even more verbose error messages to help the user and us debug any issues. To do that, you would hit the RPC call used in this function to attain the subserver status (i.e. (this._store.api.lit.listSubServerStatus()
), https://github.com/lightninglabs/lightning-terminal/blob/f2cfca981c893152243d22a568689b624d324709/app/src/store/stores/subServerStore.ts#L39
Specifically, here is an example code snippet of what you’ll receive from the RPC call:
"sub_servers": {
"accounts": {
"disabled": false,
"running": true,
"error": ""
},
"faraday": {
"disabled": false,
"running": true,
"error": ""
},
"lit": {
"disabled": false,
"running": true,
"error": ""
},
"lnd": {
"disabled": false,
"running": true,
"error": ""
},
"loop": {
"disabled": false,
"running": true,
"error": ""
},
"pool": {
"disabled": false,
"running": true,
"error": ""
},
"taproot-assets": {
"disabled": false,
"running": true,
"error": ""
}
}
}
The above snippet can return things that are relevant to the user’s inability to login or other unrelated issues. We believe that the flow we’d want is to return errors from the subserver status function if they are related to either the lnd
subserver not starting or the lit
service itself not starting. So, specifically, we’d return errors if the output within lit
or lnd
is either an error message, in which case we would return that error message. Or, if Running: false
, in either of those buckets, then we’d return an error of “LND [lit] is not running.” If the RPC calls fails entirely, then we should return an error of “Unable to connect to LiT. Please restart litd and try again.”
Other error messages from that function (like poold
not starting for instance) would not be relevant to user being unable to log in. So, in the case that there are no errors from lnd
or litd
, we could return the error messages that you’ve already outlined in this existing PR.
Does that make sense? Do you feel comfortable with implementing that type of flow?
Yes, that seems pretty straight forward, and a good way to proceed. I will change it to use the listSubServerStatus function and then modify this PR, possibly by tomorrow but more likely early next week.
Awesome, thank you so much! Don't hesitate to reach out with any questions!
@levmi I do have a question. I ran some initial tests for this and some of the error messages it returns are pretty long and seem to contain multiple messages. Should the entire message be display on the user interface in all of these cases?
I tested 3 common error conditions (wrong password, node locked, node not running) in both remote and integrated mode. The following results are what I receive for "lit" and "lnd" when calling listSubServerStatus
.
The error message for lit could be used here, but it is rather long and somewhat ambiguous. For example, the only part of the message that indicates that the problem is an incorrect password is could not start macaroon service
.
"lit"
{
disabled: false
error: "could not start Lit: could not start litd sub-servers: could not start macaroon service: unable to derive a shared secret with LND: rpc error: code = Unknown desc = permission denied"
running: false
}
"lnd"
{
disabled: false
error: ""
running: true
}
Neither one has an error message, but lit.running == false
. Therefore, we would use the generic error message LiT is not running
.
"lit"
{
disabled: false
error: ""
running: false
}
"lnd"
{
disabled: false
error: ""
running: true
}
Both lit and lnd have an error message. The message for lnd is pretty long. The message for lit could be used in this case.
"lit"
{
disabled: false
error: "could not start Lit: could not start LND"
running: false
}
"lnd"
{
disabled: false
error: "could not set up LND clients: could not create LND Services client: error subscribing to lnd wallet state: lnd version incompatible, need at least v0.13.0-beta, got error on state subscription: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:10009: connect: connection refused\""
running: false
}
In this case, we would use the generic message LiT is not running
.
"lit"
{
disabled: false
error: ""
running: false
}
"lnd"
{
disabled: false
error: ""
running: true
}
In this case, we would use the generic message LiT is not running
.
"lit"
{
disabled: false
error: ""
running: false
}
"lnd"
{
disabled: false
error: ""
running: true
}
This case was not tested.
Hiya - doing some testing on our side and then will get back to you with a response on the above questions! I think what we want to make sure of is that if we're going to change the error messages in the UX to be more descriptive of the problem from an end user perspective that they aren't incorrect in certain cases. For instance, in the first case, with returning could not start macaroon service
, it would obviously be better to actually return an error of wrong password
or something along those lines. But, we wouldn't want to return that in the case where there's a separate error from wrong password
that also returns could not start macaroon service
. We're doing some tests on our side and then can validate or make recommendations. Sorry for the delay, it's a busy week for us with roundtables where we gather customer feedback. Reverting either tomorrow or Friday :)
@levmi Okay, sounds great. Those were exactly my concerns too, that we always use a reliable message to display something the user can easily understand. I will wait until I hear back from you.
Hi @nonfungible-human! First of all, once again, thanks a lot for taking the time to contributing to this repo, it's really appreciated :pray::fire:!!!
I'll detail an answer to your questions one by one below.
First of all, I think I need to clarify that we would like to use both the error messages that you've currently added, plus the response from the subserver status
rpc response. I think it would make sense create a combination of both showing the error messages that this PR has already added, and then showing the response from the subserver status
rpc call separately. The response from the status
rpc call should be seen as an "advanced error" message view, and would probably make sense to hide under a "More details" dropdown that can be expanded underneath the error message that's currently shown. For the average user, the expanded set of error messages that this PR has already added, will likely be good enough, but for advanced users the output from the status
rpc response will also provide helpful troubleshooting info!
Too see what I mean by a dropdown, look at how extra information is shown when clicking the "Additional Options" dropdown on: https://terminal.lightning.engineering/connect/pair/
I'm also cc:ing @jamaljsr here, as UI:s isn't my specialty :sweat_smile:, so I'd like to get his approval if this sounds like a good idea before you actually implement it.
Either way, I definitely think we should create a combination of both the current error messages and the status
rpc response, as neither of them can replace the need for the other.
The error message for lit could be used here, but it is rather long and somewhat ambiguous. For example, the only part of the message that indicates that the problem is an incorrect password is could not start macaroon service.
If we create the "advanced error message view" I think think it's ok that you include the full error message here.
Just to clarify once more though, if we create a combination of both the current error messages and the status
rpc response, the user will still get an "Incorrect password" in the main error message view to show the user what's wrong.
Just for future reference, as it might be helpful for you to understand our error messages more easily, when you see long nested error messages like this, it's actually most often the last part of the error message that most accurate. In this case it's permission denied
, which correlates well to the password being incorrect.
Yes, in cases like this it'd make sense to show the LiT is not running
in the "advanced error message" view. Actually though, just to make the LiT is not running
more helpful when we have no message in the error field, let's modify it to:
LiT is not running - Ensure that the wallet is unlocked
Finally, in cases like this, you should also hit the walletLockedErr
as the main error message right? That should help the user figure out what's wrong.
If we add the "advanced error message" view, I actually think it makes sense to show the full error messages for both lit
& lnd
for this and other cases when there are multiple error messages. Just make sure that it's clear which error comes from which system.
This case actually shows a case where the "advanced error message" view will be very helpful for users, as I believe you'll hit the "noConnectionErr" for the main error message view, right?
See the answer for "remote mode, with lnd node running and unlocked, wrong password entered" :).
See the answer for "remote mode, with lnd node running but locked, correct password entered".
This will lead to the status
rpc call response containing "could not start Lit: could not start LND" for the lit
error field.
Let me know if something is still unclear!
@ViktorTigerstrom thanks for all of this great feedback. There's really not much to add.
I agree that the status message should be shown alongside the custom RPC error. The status error just provides more context into what may be causing the login to fail.
Another approach for placement of the status message is to use an icon with a tooltip on hover.
@ViktorTigerstrom and @jamaljsr, I have pushed a new commit to this branch and it works for me in all of the above cases. The css styles and error message text can easily be modified if necessary.
I did notice, during my final testing, that I was getting some "unable to fetch subserver status" error messages. I don't know if these messages are specific to my setup, because I am not using any of the other subservers, or if they are caused by something else. One of them said it could not find lit.macaroon, which is an error I thought I had seen in the past. None of the code I changed should affect the way subserver statuses are retrieved, so I am assuming the messages I am seeing are a normal condition. But we should make sure that, in addition to displaying accurate error messages, all functionality still works after a successful login, and that is hard for me to do because I am currently testing this login code on an unfunded node with no channels. Specifically, the messages I am talking about are the red boxes that appear on the right side of the screen immediately after a successful login. I see several of them and most of them say "unable to fetch subserver status" but they disappear faster than I can read all of them.
I made the suggested changes. I wasn't able to reproduce the empty detail message when an incorrect password is entered, but I put a check in place to prevent the button from being displayed if both detail messages are empty. However, at least one of them should contain text if there is an error.
Also, I removed the redundant prefixes Lit:
and LND:
. @ViktorTigerstrom had said to make sure that it's clear which error comes from which system
, which is why I put them in there in the first place. Right now, if there is a message from both LiT and LND, it will display them both and it seems fairly easy to tell them apart. The case I know of where it display a message from both subsystems is when LiT is running in remote mode and LND is not running at all.
This PR is related to Issue 655.
When the user is logging into the web interface, an exception is caught if the login fails. However, the exception message is ignored and the "incorrect password" error message is displayed to the user in all cases. The result is that, unless the user really did enter the wrong password, the error message will be incorrect.
This error can be caused by attempting to connect to a node that is not running or is in the locked state. Even if the correct password is entered, the error message says it is incorrect. These conditions can be identified more accurately than they currently are and should be reported to the user more accurately as well.
This PR will add a error detail message for both LiT and LND, which can be toggled. So there will now be as many as three error message: a main error message (always displayed) and LiT and LND detail error messages, at least one of which will be displayed.
The login error messages will be determined as follows:
If the subserver statuses could not be obtained, it will report a connection failure with detail in the LiT detail message.
If none of the above conditions are met, but the login still failed, it will do the following:
This will result in more accurate error messages being displayed on the web interface, which would make it easier for some users to diagnose login problems on their own. In cases where they still need assistance in diagnosing a problem, the more accurate error message will help to narrow down the cause earlier in the process.