Closed okjodom closed 1 year ago
Why have 2 separate binaries for lnd / cln? Why not just have 1 binary that can speak both CLN & LND RPC?
Also, it doesn't seem ideal to require LND gateways to run 2 binaries. Christian Decker suggested we not run the actual gateway code in the plugin for the following reasons but I don't think this applies to LND at all.
I think what we want regarding the GW are these components (modulo names being different):
* `gatewayd`: The actual GW business logic, connecting to either `cln-gateway-plugin` or `lnd` via RPC
* `cln-gateway-plugin`: Exposing the RPC interface necessary to intercept HTLCs and make payments. Possibly a subset of the `lnd` RPC interface.
* `gateway-cli`: Management tool for `gatewayd`
`cln-gateway-plugin`: Exposing the RPC interface necessary to intercept HTLCs and make payments. Possibly a subset of the lnd RPC interface.
Why should this know how to make payments? Why can't `gatewayd` just have a `ClnRpc` reference and make the payments itself?
From my POV, having this component do the RPC with CLN just adds more indirection when trying to understand the code, adds another server address to the configuration that can be wrong, and means outgoing payments stop working if the CLN plugin somehow stops running while lightningd keeps running. If `gatewayd` paid invoices itself, outgoing payments would still work under these circumstances.
But I guess the benefit of having `cln-rpc` do the RPC is that you can run `gatewayd` and `lightningd` on separate hosts? Because IIRC CLN RPC happens over a unix socket ... but this seems like an extremely minor benefit to me.
Also, `cln-rpc` is a slightly confusing name for the plugin because we have a dependency with the same name
Why have 2 separate binaries for lnd / cln? Why not just have 1 binary that can speak both CLN & LND RPC?
This poses a scaling and upgrade challenge in that binary codebase. By your proposal, it should be able to speak CLN, LND, Eclair and so on. If we needed to bump the version in service of a CLN feature upgrade / bug fix, suddenly all other node runners are impacted
In contrast, multiple binaries with a known RPC interface (defined as a proto spec, `GatewayLightning`) mean someone could cobble together a binary of their own, maybe it is a plugin, and run it on their node. Then connect this to our formal `gatewayd` to serve a federation
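To make the "known RPC interface" point concrete, here is a rough sketch with an in-process Rust trait standing in for the proto-defined `GatewayLightning` service. Everything in it is an assumption for illustration: the method names, `PayResponse`, `ClnExtension`, `LndExtension` and `Gateway` are hypothetical and not taken from the actual spec or the POC. The only point is that the gateway logic depends on the interface and not on a node implementation.

```rust
// Hypothetical sketch: the real GatewayLightning interface is a proto3 RPC
// service. Every name and signature below is an assumption for illustration.

/// Result of an outgoing payment, reduced to a preimage string.
#[derive(Debug, PartialEq)]
pub struct PayResponse {
    pub preimage: String,
}

/// Stand-in for the node-agnostic interface each extension would serve.
pub trait GatewayLightning {
    /// Name of the node implementation behind this extension.
    fn node_type(&self) -> &'static str;
    /// Pay an invoice through the underlying node.
    fn pay_invoice(&self, invoice: &str) -> Result<PayResponse, String>;
}

pub struct ClnExtension;
pub struct LndExtension;

impl GatewayLightning for ClnExtension {
    fn node_type(&self) -> &'static str {
        "cln"
    }
    fn pay_invoice(&self, invoice: &str) -> Result<PayResponse, String> {
        // A real extension would ask lightningd to pay; this only fakes a result.
        Ok(PayResponse { preimage: format!("cln-preimage-for-{}", invoice) })
    }
}

impl GatewayLightning for LndExtension {
    fn node_type(&self) -> &'static str {
        "lnd"
    }
    fn pay_invoice(&self, invoice: &str) -> Result<PayResponse, String> {
        // A real extension would call the lnd RPC; this only fakes a result.
        Ok(PayResponse { preimage: format!("lnd-preimage-for-{}", invoice) })
    }
}

/// Gateway-side logic that only knows about the trait, never the node type.
pub struct Gateway {
    lightning: Box<dyn GatewayLightning>,
}

impl Gateway {
    pub fn new(lightning: Box<dyn GatewayLightning>) -> Self {
        Gateway { lightning }
    }

    pub fn node_type(&self) -> &'static str {
        self.lightning.node_type()
    }

    pub fn pay(&self, invoice: &str) -> Result<PayResponse, String> {
        // Node-agnostic validation lives here, once, for every backend.
        if invoice.is_empty() {
            return Err("empty invoice".to_string());
        }
        self.lightning.pay_invoice(invoice)
    }
}
```

A third party's extension would be one more `impl GatewayLightning`, with no change to `Gateway`.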
Also, `cln-rpc` is a slightly confusing name for the plugin because we have a dependency with the same name
Ref #1224. I like the `cln-gateway-plugin` name @elsirion has just proposed. An alternative name could be `gateway-cln-extension`, versus `gateway-lnd-extension`, etc.
Also, it doesn't seem ideal to require LND gateways to run 2 binaries. Christian Decker suggested we not run the actual gateway code in the plugin for the following reasons but I don't think this applies to LND at all.
I think what we want regarding the GW are these components (modulo names being different):
* `gatewayd`: The actual GW business logic, connecting to either `cln-gateway-plugin` or `lnd` via RPC
* `cln-gateway-plugin`: Exposing the RPC interface necessary to intercept HTLCs and make payments. Possibly a subset of the `lnd` RPC interface.
* `gateway-cli`: Management tool for `gatewayd`
If my current proposal flies, LND gateway operators need to think about three binaries: `gatewayd`, `gateway-lnd-extension` and the well-known `lnd` binaries. They need to think about two binaries at minimum, since LND doesn't have a direct plugin model and thus requires extensions with RPC interfaces into `lnd`.
The justification for a `gatewayd` separate from `gateway-lnd-extension` or `gateway-cln-extension` is so that `gatewayd` can remain node implementation agnostic.
In CLN case, we get some resiliency benefits highlighted by this reasoning
This poses a scaling and upgrade challenge in that binary codebase. By your proposal, it should be able to speak CLN, LND, Eclair and so on. If we needed to bump the version in service of a CLN feature upgrade / bug fix, suddenly all other node runners are impacted
If a CLN issue is fixed and released but a given user is running LND I think they can just ignore the release, right?
Also, we might be able to have `cln` / `lnd` / `eclair` etc features which could compile out any un-needed dependencies.
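As a rough sketch of what such features could look like in a single binary, assuming Cargo features literally named `cln`, `lnd` and `eclair` (hypothetical names, declared under `[features]` in `Cargo.toml`), each backend can be gated with `#[cfg(feature = ...)]` so that a disabled backend and its dependencies are never compiled. The function `compiled_backends` is made up for this example.

```rust
// Hypothetical sketch: the feature names are assumptions, not the project's
// actual Cargo features.

/// Lists the node backends that were compiled into this binary.
pub fn compiled_backends() -> Vec<&'static str> {
    #[allow(unused_mut)] // stays unmodified when no backend feature is enabled
    let mut backends = Vec::new();

    // Each statement below only exists in the build if its feature is on,
    // e.g. `cargo build --features cln`.
    #[cfg(feature = "cln")]
    backends.push("cln");

    #[cfg(feature = "lnd")]
    backends.push("lnd");

    #[cfg(feature = "eclair")]
    backends.push("eclair");

    backends
}
```

The same attribute can gate whole modules and optional dependencies, so an LND-only build would not pull in CLN client code at all.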
In contrast, multiple binaries with known RPC interface (defined as proto spec as GatewayLightning) mean someone could cobble together a binary of their own, maybe it is a plugin, and run it on their node. Then connect this to our formal gatewayd to serve a federation
I'm also not sure this scenario will arise very often. New lightning node implementations (which we wouldn't have support for) don't show up very often.
Another thing your proposal supports which mine doesn't is having a `gatewayd` instance using CLN & LND at the same time! Also you could have different versions of a node running concurrently by using different versions of `gateway-<impl>-extension`. That would be very useful for upgrading multi-node setups with no downtime.
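A minimal sketch of that idea, assuming `gatewayd` keeps a table of extension endpoints it can talk to. `ExtensionRegistry`, the extension ids and the addresses are all hypothetical; nothing like this exists in the POC. A rolling upgrade then amounts to registering the new extension before deregistering the old one, so there is always at least one endpoint to pick.

```rust
use std::collections::BTreeMap;

// Hypothetical sketch: a table of extension endpoints held by gatewayd.
// Ids and addresses are made up for illustration.

/// Maps an extension id (e.g. "cln-v1") to the address of its RPC server.
pub struct ExtensionRegistry {
    endpoints: BTreeMap<String, String>,
}

impl ExtensionRegistry {
    pub fn new() -> Self {
        ExtensionRegistry { endpoints: BTreeMap::new() }
    }

    /// Add or replace the address for an extension id.
    pub fn register(&mut self, id: &str, address: &str) {
        self.endpoints.insert(id.to_string(), address.to_string());
    }

    /// Remove an extension; returns true if it was registered.
    pub fn deregister(&mut self, id: &str) -> bool {
        self.endpoints.remove(id).is_some()
    }

    /// Number of extensions currently registered.
    pub fn len(&self) -> usize {
        self.endpoints.len()
    }

    /// Pick an endpoint to use: here simply the first one in id order.
    pub fn pick(&self) -> Option<&str> {
        self.endpoints.values().next().map(|address| address.as_str())
    }
}
```

A real selection policy would presumably look at node health or liquidity; taking the first entry just keeps the sketch deterministic.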
Idea: @justinmoon, @elsirion, perhaps this design naturally opens up a progression for `gatewayd` to move closer to consensus. As a future plugin, could it implement new features like fulfillment of bolt 12 offers with some sort of custody scheme backed by federation consensus?
For now, we can focus on the concerns raised about having multiple binaries, and in my opinion we should mitigate those via deployment mechanisms like
* `gatewayd` and `gateway-<impl> extension`
* `gatewayd` for fedimint guardians who want to operate gateways, so they only think about `gateway-<impl> extension` and beyond
cc @dpc, @m1sterc001guy, @jkitman. The discussion thread starting here might help build full context
perhaps this design naturally opens up a progression for gatewayd to move closer to consensus. As a future plugin, could it implement new features like fulfillment of bolt 12 offers with some sort of custody scheme backed by federation consensus?
My view of things is that we want to have as little running in consensus as possible. The `ln` module is that, and `gateway` is the lightning node software that interacts with that module.
For BOLT12 (https://github.com/fedimint/fedimint/issues/559), the `ln` module would need a way to create or upload a BOLT12 offer and have threshold-encrypted preimages uploaded to it. It would also need a way to schnorr sign invoices (assuming federation signs invoice and not gateway). IIRC BOLT12 invoices are fetched from lightning nodes via lightning p2p network, so the gateway would need a way to intercept these p2p messages (like we do with HTLCs currently) and fetch invoice from federation if needed. But I don't think there's any benefit of having the gateway run as a federation module if it doesn't require consensus. When it does need consensus, just communicate with the `ln` module via its API.
Some more questions:
* `extension` restarts?
* `gatewayd` restarts?
* `lnd` restarts?
Thanks for the early reviews and discussions on this POC. Since we agreed on taking this design further, I will cherry-pick useful parts and make atomic PRs
A proof-of-concept refactor of the gateway to adopt RPC interfacing between the gateway webserver binary (gatewayd) and its lightning node
Proves spec proposal in #1143 by:
* `GatewayLightning` for LND from the proto3 spec. This is packaged as `lnd-rpc` binary
* `GatewayLightning` from the proto3 spec. This is packaged as `cln-rpc` binary
* `LnGateway`. This can call into rpc servers for either node implementation
Illustrations
Gateway Pay invoice Flow
Gateway Receive Payment Flow
Component: Gateway Lightning RPC Client #1198
Component: Gateway Lightning RPC Server (CLN extension service) #1224
WIP