Open dstadulis opened 3 weeks ago
Thanks for the issue report! Should we allow any RPC requests to tapd before the universe sub-service has been started?
If they should be separated, so that we would want to allow some requests before the universe sub-service is up, would an OK first version be to just disallow any tapd RPCs until the universe sub-service is up, and then create the separation at a later stage?
I don't think they should be separated, or that there would be any benefit to that.
IIUC the desired fix is to disallow all tapd RPCs until all tapd sub-services are up.
LiT today should already take care of this. We only mark a subserver as "ready for calls" once the Start method of that subserver (for tap, this is the (s *Server) StartAsSubserver method) returns without an error. We want to keep LiT as generic as possible in terms of handling of subservers, so each subserver is expected to be ready to handle calls once that start method has returned. We really should not be doing extra calls to check whether various subserver specifics are ready: the SLA here is that things should be ready once that method has returned.
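To make that contract concrete, here is a minimal, standard-library-only sketch of the gating described above. The `subserver` type, `Call`, and `ErrNotReady` are hypothetical names for illustration and are not LiT's actual code; the only point is that the ready flag flips when, and only when, the start function returns nil.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotReady is a hypothetical error for calls that arrive before Start
// has returned successfully.
var ErrNotReady = errors.New("subserver not ready")

// subserver is a hypothetical stand-in for a daemon managed by a parent
// process. It is NOT LiT's real type.
type subserver struct {
	mu    sync.RWMutex
	ready bool
	start func() error // plays the role of StartAsSubserver
}

// Start runs the start function and marks the subserver as ready only if
// that function returns without an error.
func (s *subserver) Start() error {
	if err := s.start(); err != nil {
		return err
	}
	s.mu.Lock()
	s.ready = true
	s.mu.Unlock()
	return nil
}

// Call forwards a request to the handler only once the subserver is ready.
func (s *subserver) Call(handler func() (string, error)) (string, error) {
	s.mu.RLock()
	ready := s.ready
	s.mu.RUnlock()
	if !ready {
		return "", ErrNotReady
	}
	return handler()
}

func main() {
	s := &subserver{start: func() error { return nil }}
	ping := func() (string, error) { return "pong", nil }

	_, err := s.Call(ping)
	fmt.Println(err) // subserver not ready

	if err := s.Start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	resp, err := s.Call(ping)
	fmt.Println(resp, err) // pong <nil>
}
```

Note that this gate only protects callers that go through the parent process. A client that talks to the subserver's own port bypasses it entirely.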
If y'all don't want to permanently change the behaviour, then I suggest adding a "BlockTillUniverseStart" functional option or something on that call.
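For reference, a rough sketch of what such a functional option could look like. Everything here is hypothetical: `startAsSubserver` is a stand-in function rather than tapd's real method, and `universeReady` is an assumed channel that gets closed once the universe sub-service is up.

```go
package main

import "fmt"

// startConfig holds hypothetical knobs for the start call.
type startConfig struct {
	blockTillUniverseStart bool
}

// StartOption is a functional option that modifies startConfig.
type StartOption func(*startConfig)

// BlockTillUniverseStart is the hypothetical option named in the comment
// above. It makes the start call wait for the universe sub-service.
func BlockTillUniverseStart() StartOption {
	return func(c *startConfig) { c.blockTillUniverseStart = true }
}

// startAsSubserver is a stand-in for the real start method. It reports
// whether the universe sub-service was up by the time it returned.
func startAsSubserver(universeReady <-chan struct{}, opts ...StartOption) bool {
	cfg := &startConfig{}
	for _, opt := range opts {
		opt(cfg)
	}

	if cfg.blockTillUniverseStart {
		// Opt-in behaviour: do not return until the universe is up.
		<-universeReady
		return true
	}

	// Default behaviour: return immediately, whatever the universe state.
	select {
	case <-universeReady:
		return true
	default:
		return false
	}
}

func main() {
	ready := make(chan struct{})

	fmt.Println(startAsSubserver(ready)) // false

	// Simulate the universe sub-service coming up in the background.
	go func() { close(ready) }()
	fmt.Println(startAsSubserver(ready, BlockTillUniverseStart())) // true
}
```

The default path keeps the existing behaviour for every caller that does not pass the option, which is the reason to reach for a functional option here.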
Hmm, I see. That is the correct behavior as far as I can see. We only have a screenshot of a stack trace to go from and it looked like an RPC request did come through before the internal gRPC server in tapd was started.
But perhaps tapd stopped with an error and the request came through after the gRPC server was stopped?
Though from just the code it looks like that's handled correctly as well.
Need more info from the user then, preferably actual logs.
Reposting the stack trace as text:
So today we have a middleware interceptor that'll bounce all calls until the server is fully started: https://github.com/lightninglabs/taproot-assets/blob/72b93f84a0afa08e01c99d582357672133fc9b20/rpcperms/interceptor.go#L380-L408
We then set to active after all the sub-systems have started here: https://github.com/lightninglabs/taproot-assets/blob/72b93f84a0afa08e01c99d582357672133fc9b20/server.go#L374-L378
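As a rough illustration of that pattern (not the code behind the links above), here is a standard-library-only sketch. The `handler` type only mimics the shape of a gRPC unary handler, and `gate`, `SetActive`, `Intercept`, `echoHandler`, and `ErrServerStarting` are all hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
)

// ErrServerStarting is a hypothetical error returned while sub-systems
// are still starting.
var ErrServerStarting = errors.New("server is still starting up")

// handler mimics the shape of a unary RPC handler. It is a simplified
// stand-in, not the real gRPC type.
type handler func(ctx context.Context, req string) (string, error)

// gate tracks whether the server has been marked active.
type gate struct {
	active int32
}

// SetActive is called once all sub-systems have started.
func (g *gate) SetActive() { atomic.StoreInt32(&g.active, 1) }

// Intercept wraps next so that every call is bounced until the gate has
// been set to active.
func (g *gate) Intercept(next handler) handler {
	return func(ctx context.Context, req string) (string, error) {
		if atomic.LoadInt32(&g.active) == 0 {
			return "", ErrServerStarting
		}
		return next(ctx, req)
	}
}

// echoHandler is a trivial handler used for the demonstration.
func echoHandler(_ context.Context, req string) (string, error) {
	return "handled " + req, nil
}

func main() {
	g := &gate{}
	call := g.Intercept(echoHandler)

	_, err := call(context.Background(), "ListFederationServers")
	fmt.Println(err) // server is still starting up

	g.SetActive()
	resp, _ := call(context.Background(), "ListFederationServers")
	fmt.Println(resp) // handled ListFederationServers
}
```

The check only helps if the wrapped handlers are the ones actually registered on the listening server. A startup path that registers the bare handlers skips the gate.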
From the trace, either the rpcserver or FederationDB was nil at that point, which is puzzling (all the sub-system structs have already been initialized at that point). So perhaps some mutation occurred somewhere?
This issue is coming from our team. When the terminal is restarted while lnd and tapd are still in the process of syncing, the terminal crashes continuously due to receiving various requests from tapd (such as assets list, subscribeRfqEvents, universe).
@lukegao209 could you send us a full log (including the stack trace) from the beginning of a startup sequence until it panics?
@lukegao209 also, what port are you issuing the ListFederationServers call on? The main litd port (8443) or the integrated tapd port (10029)?
port 10029
@lukegao209 - ok cool thanks 🙏 This makes sense then - rather point your requests to Lit's port 8443 as then LiT will block calls to tapd until it is ready.
@Roasbeef re
So today we have a middleware interceptor that'll bounce all calls until the server is fully started:
This is actually not the case for the Subserver startup of Tap.
RunUntilShutdown sets up the interceptor chain for Tap, which does this blocking,
but for StartAsSubserver, which LiT uses, the interceptor chain in Tap is not set up.
In conclusion, for the terminal, should requests to lnd go to port 10009, while requests to the other services (tapd, loop, pool, etc.) all go to port 8443?
you can also point LND requests to 8443 👍
thanks
btw, will terminal’s accounts support taproot asset channels?
not out of the box, no. we'll need to update the litd account system once taproot asset channels are fully implemented.
If Taproot Assets are implemented through parsing the invoice’s custom_data, maybe I can start working on supporting it now.
@ellemouton
but for StartAsSubserver which LiT uses, the interceptor chain in Tap is not set up
Gotcha, ok that seems to be the core issue here. We should retain the interceptor chain for tapd, either using the same system, or a more explicit check.
@guggero - with our latest offline discussion, do you rate we can close this and move it to the Tap repo?
Yes. TODOs are:
Transferring to tapd repo.
In a recent user report, litd malfunctioned because a universe RPC call ran before the universe sub-service was online.
See TODO summary here: https://github.com/lightninglabs/taproot-assets/issues/1122#issuecomment-2345928938