fedimint / fedimint

Federated E-Cash Mint
https://fedimint.org/
MIT License
536 stars 209 forks

LN Gateways - The christmas version #1168

Closed okjodom closed 1 year ago

okjodom commented 1 year ago

A proof-of-concept refactor of the gateway to adopt RPC interfacing between the gateway webserver binary (gatewayd) and its lightning node

Proves spec proposal in #1143 by:

Illustrations

  1. Gateway Pay Invoice Flow (screenshot)

  2. Gateway Receive Payment Flow (screenshot)

  3. Component: Gateway Lightning RPC Client #1198 (screenshot)

  4. Component: Gateway Lightning RPC Server (CLN extension service) #1224 (screenshot)

WIP

justinmoon commented 1 year ago

Why have 2 separate binaries for lnd / cln? Why not just have 1 binary that can speak both CLN & LND RPC?

Also, it doesn't seem ideal to require LND gateways to run 2 binaries. Christian Decker suggested we not run the actual gateway code in the plugin for the following reasons but I don't think this applies to LND at all.

elsirion commented 1 year ago

> Also, it doesn't seem ideal to require LND gateways to run 2 binaries. Christian Decker suggested we not run the actual gateway code in the plugin for the following reasons but I don't think this applies to LND at all.

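I think what we want regarding the GW are these components (modulo names being different):

* `gatewayd`: The actual GW business logic, connecting to either `cln-gateway-plugin` or `lnd` via RPC

* `cln-gateway-plugin`: Exposing the RPC interface necessary to intercept HTLCs and make payments. Possibly a subset of the `lnd` RPC interface.

* `gateway-cli`: Management tool for `gatewayd`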

justinmoon commented 1 year ago

> cln-gateway-plugin: Exposing the RPC interface necessary to intercept HTLCs and make payments. Possibly a subset of the lnd RPC interface.

Why should this know how to make payments? Why can't gatewayd just have a ClnRpc reference and make the payments itself?

From my POV, having this component do the RPC with CLN just adds more indirection when trying to understand the code, adds another server address in the configuration that can be wrong, and means outgoing payments stop working if the CLN plugin somehow stops running while lightningd keeps running. If gatewayd paid invoices itself, outgoing payments would still work under these circumstances.

But I guess the benefit of having cln-rpc do the RPC is that you can run gatewayd and lightningd on separate hosts? Because IIRC CLN RPC happens over a unix socket ... but this seems like an extremely minor benefit to me.

justinmoon commented 1 year ago

Also, cln-rpc is a slightly confusing name for the plugin because we have a dependency with the same name.

okjodom commented 1 year ago

> Why have 2 separate binaries for lnd / cln? Why not just have 1 binary that can speak both CLN & LND RPC?

This poses a scaling and upgrade challenge for that binary's codebase. By your proposal, it should be able to speak CLN, LND, Eclair and so on. If we needed to bump its version to ship a CLN feature upgrade or bug fix, suddenly all other node runners would be impacted.

In contrast, multiple binaries with a known RPC interface (defined as a proto spec, GatewayLightning) mean someone could cobble together a binary of their own, maybe as a plugin, and run it on their node, then connect it to our formal gatewayd to serve a federation.
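
To make the "known RPC interface" idea concrete, here is a rough sketch of how gatewayd could stay agnostic of the node implementation. Only the name `GatewayLightning` comes from this thread; the method names, the `MockExtension` type and the string-based invoice and preimage types are assumptions made for illustration, and the real interface is a proto spec served over RPC rather than an in-process trait.

```rust
/// Hypothetical, simplified stand-in for the `GatewayLightning` interface.
/// The method names below are assumptions, not the actual proto spec.
trait GatewayLightning {
    /// Pay an invoice and return the payment preimage on success.
    fn pay_invoice(&mut self, invoice: &str) -> Result<String, String>;
    /// Name of the node implementation behind this extension.
    fn node_kind(&self) -> &'static str;
}

/// Toy extension standing in for a CLN plugin, an LND extension,
/// or a binary somebody cobbled together for their own node.
struct MockExtension {
    kind: &'static str,
    paid: Vec<String>,
}

impl GatewayLightning for MockExtension {
    fn pay_invoice(&mut self, invoice: &str) -> Result<String, String> {
        if invoice.is_empty() {
            return Err("empty invoice".to_string());
        }
        // Record the payment and hand back a fake preimage.
        self.paid.push(invoice.to_string());
        Ok(format!("preimage-for-{}", invoice))
    }

    fn node_kind(&self) -> &'static str {
        self.kind
    }
}

/// gatewayd only depends on the interface, never on a concrete node.
struct Gatewayd {
    ln: Box<dyn GatewayLightning>,
}

impl Gatewayd {
    fn pay(&mut self, invoice: &str) -> Result<String, String> {
        self.ln.pay_invoice(invoice)
    }

    fn node_kind(&self) -> &'static str {
        self.ln.node_kind()
    }
}
```

Constructing `Gatewayd { ln: Box::new(MockExtension { kind: "cln", paid: Vec::new() }) }` and calling `pay` goes through the trait only, so swapping in a different extension would not touch the gatewayd code.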

okjodom commented 1 year ago

> Also cln-rpc is slightly confusing name for plugin because we have a dependency with same name

Ref #1224. I like the cln-gateway-plugin name @elsirion has just proposed. An alternative name could be gateway-cln-extension, alongside gateway-lnd-extension etc.

okjodom commented 1 year ago

> > Also, it doesn't seem ideal to require LND gateways to run 2 binaries. Christian Decker suggested we not run the actual gateway code in the plugin for the following reasons but I don't think this applies to LND at all.
>
> I think what we want regarding the GW are these components (modulo names being different):
>
> * `gatewayd`: The actual GW business logic, connecting to either `cln-gateway-plugin` or `lnd` via RPC
>
> * `cln-gateway-plugin`: Exposing the RPC interface necessary to intercept HTLCs and make payments. Possibly a subset of the `lnd` RPC interface.
>
> * `gateway-cli`: Management tool for `gatewayd`

If my current proposal flies, LND gateway operators need to think about three binaries: gatewayd, gateway-lnd-extension and the well-known lnd binaries. They need to think about two binaries at minimum, since LND doesn't have a direct plugin model and thus requires extensions with RPC interfaces into lnd.

The justification for keeping gatewayd separate from gateway-lnd-extension or gateway-cln-extension is so gatewayd can remain node implementation agnostic. In the CLN case, we get some resiliency benefits highlighted by this reasoning

justinmoon commented 1 year ago

> This poses a scaling and upgrade challenge for that binary's codebase. By your proposal, it should be able to speak CLN, LND, Eclair and so on. If we needed to bump its version to ship a CLN feature upgrade or bug fix, suddenly all other node runners would be impacted.

If a CLN issue is fixed and released but a given user is running LND I think they can just ignore the release, right?

Also, we might be able to have cln / lnd / eclair etc features which could compile out any un-needed dependencies.
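
As a rough illustration of the feature idea, Cargo features can gate whole modules so that unused node support is not compiled in. The feature names `cln`, `lnd` and `eclair`, the module layout and the `compiled_backends` helper are all hypothetical; a real crate would declare the features in the `[features]` table of its `Cargo.toml` and mark the matching dependencies as optional.

```rust
// Each module exists only when its (hypothetical) Cargo feature is enabled,
// so code and dependencies for unused node implementations are compiled out.
#[cfg(feature = "cln")]
mod cln {
    pub const NAME: &str = "cln";
}

#[cfg(feature = "lnd")]
mod lnd {
    pub const NAME: &str = "lnd";
}

#[cfg(feature = "eclair")]
mod eclair {
    pub const NAME: &str = "eclair";
}

/// Lists the node backends that were compiled into this binary.
#[allow(unused_mut)]
fn compiled_backends() -> Vec<&'static str> {
    let mut backends = Vec::new();
    #[cfg(feature = "cln")]
    backends.push(cln::NAME);
    #[cfg(feature = "lnd")]
    backends.push(lnd::NAME);
    #[cfg(feature = "eclair")]
    backends.push(eclair::NAME);
    backends
}
```

Building with something like `cargo build --no-default-features --features lnd` would then produce a binary whose `compiled_backends()` contains only `"lnd"`, assuming the features are declared as described.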

justinmoon commented 1 year ago

> In contrast, multiple binaries with a known RPC interface (defined as a proto spec, GatewayLightning) mean someone could cobble together a binary of their own, maybe as a plugin, and run it on their node, then connect it to our formal gatewayd to serve a federation.

I'm also not sure this scenario will arise very often. New lightning node implementations (which we wouldn't have support for) don't show up very often.

justinmoon commented 1 year ago

Another thing your proposal supports which mine doesn't is having a gatewayd instance using CLN & LND at the same time! Also you could have different versions of a node running concurrently by using different versions of gateway-<impl>-extension. That would be very useful for upgrading multi-node setups with no downtime.
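
A minimal sketch of that multi-node setup, assuming gatewayd keeps a registry of connected extensions. All names here (`LightningBackend`, `Extension`, `connect`, `disconnect`, the ids and version strings) are made up for illustration and do not reflect the actual gatewayd code.

```rust
use std::collections::BTreeMap;

/// Hypothetical handle to one running gateway-<impl>-extension.
trait LightningBackend {
    fn describe(&self) -> String;
}

struct Extension {
    implementation: &'static str,
    version: &'static str,
}

impl LightningBackend for Extension {
    fn describe(&self) -> String {
        format!("{} {}", self.implementation, self.version)
    }
}

/// gatewayd holding several backends at once, keyed by an operator-chosen id.
struct Gatewayd {
    backends: BTreeMap<String, Box<dyn LightningBackend>>,
}

impl Gatewayd {
    fn new() -> Self {
        Gatewayd { backends: BTreeMap::new() }
    }

    /// Attach a backend; others keep running untouched.
    fn connect(&mut self, id: &str, backend: Box<dyn LightningBackend>) {
        self.backends.insert(id.to_string(), backend);
    }

    /// Detach a backend; returns false if the id was unknown.
    fn disconnect(&mut self, id: &str) -> bool {
        self.backends.remove(id).is_some()
    }

    /// Descriptions of all connected backends, ordered by id.
    fn describe_all(&self) -> Vec<String> {
        self.backends.values().map(|b| b.describe()).collect()
    }
}
```

An upgrade with no downtime would then look like: connect the new extension version under a new id, confirm it works, and only afterwards disconnect the old one, while the LND backend stays connected throughout.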

okjodom commented 1 year ago

Idea: @justinmoon, @elsirion, perhaps this design naturally opens up a progression for gatewayd to move closer to consensus. As a future plugin, could it implement new features like fulfillment of bolt 12 offers with some sort of custody scheme backed by federation consensus?

For now, we can focus on the concerns raised about having multiple binaries, and in my opinion we should mitigate those via deployment mechanisms like

cc @dpc, @m1sterc001guy, @jkitman. The discussion thread starting here might help build full context

justinmoon commented 1 year ago

> perhaps this design naturally opens up a progression for gatewayd to move closer to consensus. As a future plugin, could it implement new features like fulfillment of bolt 12 offers with some sort of custody scheme backed by federation consensus?

My view of things is that we want to have as little running in consensus as possible. The ln module is that part, and the gateway is the lightning node software that interacts with that module.

For BOLT12 (https://github.com/fedimint/fedimint/issues/559), the ln module would need a way to create or upload a BOLT12 offer and have threshold-encrypted preimages uploaded to it. It would also need a way to schnorr sign invoices (assuming federation signs invoice and not gateway). IIRC BOLT12 invoices are fetched from lightning nodes via lightning p2p network, so the gateway would need a way to intercept these p2p messages (like we do with HTLCs currently) and fetch invoice from federation if needed. But I don't think there's any benefit of having the gateway run as a federation module if it doesn't require consensus. When it does need consensus, just communicate with the ln module via its API.

justinmoon commented 1 year ago

Some more questions:

okjodom commented 1 year ago

Thanks for the early reviews and discussions on this POC. Since we agreed on taking this design further, I will cherry-pick the useful parts and make atomic PRs.