Closed okjodom closed 1 year ago
- Use complete_htlcs RPC to send a request to cancel the HTLC if HTLC processing failed, timed-out or was rejected for some other reason // example let cancel = CompleteHtlcsRequest { action: Some(Action::Cancel(Cancel { reason: "" })) }; let stream = client.complete_htlcs(Request::new(stream::iter(vec![cancel])).await?;
Did you mean cancel_htlcs
instead of complete_htlcs
?
I haven't looked at this PR in particular yet, but could we implement something compatible with LND?
That would avoid needing different proto specs and thus client impls in the GW for LND and CLN.
Even if not possible, we should at least think about how we unify the two proto specs into one rust trait.
I haven't looked at this PR in particular yet, but could we implement something compatible with LND?
Yes, we actually abstract over this LND HtlcInterceptor
RPC, adding some filtering feature so which allows us to subscribe to htlcs for multiple federations.
That would avoid needing different proto specs and thus client impls in the GW for LND and CLN. Even if not possible, we should at least think about how we unify the two proto specs into one rust trait.
Filtering happens server side within the gateway-<lnrpc>-extension
so each implementation does it's own mapping into our own RPC spec defined in this PR
As for the contents of this PR vs previous proposal in #1218 , I'm convinced we need one subscribe_intercept_htlcs
RPC method for intercepting and a corresponding complete_htlcs
RPC method for responding to the node. This separation or intercept and respond RPC methods is because gateway-<lnrpc>-extension
delegates processing of intercepted HTLC over the wire to gatewayd via lnrpc-client
, and it should not block on waiting for a response from the client
In contrast, LND HtlcInterceptor
is a bi-directional streaming RPC, because we can serially manipulate each intercepted htlc and send back responses to LND over a channel. No filtering on the raw interceptor.
Here is an example of intercept and respond to HTLCs built with inspiration from prior art by @justinmoon
- Use complete_htlcs RPC to send a request to cancel the HTLC if HTLC processing failed, timed-out or was rejected for some other reason // example let cancel = CompleteHtlcsRequest { action: Some(Action::Cancel(Cancel { reason: "" })) }; let stream = client.complete_htlcs(Request::new(stream::iter(vec![cancel])).await?;
Did you mean
cancel_htlcs
instead ofcomplete_htlcs
?
complete_htlcs
is the rpc method name. You can send a cancel
or settle
action request over it, specifying the htlc to cancel or settle
Here is a complete implementation of the client: https://github.com/fedimint/fedimint/pull/1198/commits. These two commits for how the lnrpc client uses complete_htlcs
:
This separation or intercept and respond RPC methods is because gateway-
-extension delegates processing of intercepted HTLC over the wire to gatewayd via lnrpc-client, and it should not block on waiting for a response from the client
For CLN with cln_plugin
we have to block, right? We're just inside a function and need to return a value. I guess we could just println
the JSON-RPC response to CLN if we can get our hands on the JSON-RPC request ID. But blocking isn't a problem if we do it in a tokio task so it doesn't block anything else.
Filtering happens server side within the gateway-
-extension so each implementation does it's own mapping into our own RPC spec defined in this PR
Filtering could also happen in gatewayd
, right? Not sure which is better, but strictly speaking it can happen in either one ...
This separation or intercept and respond RPC methods is because gateway--extension delegates processing of intercepted HTLC over the wire to gatewayd via lnrpc-client, and it should not block on waiting for a response from the client
For CLN with
cln_plugin
we have to block, right? We're just inside a function and need to return a value.
Ah yes. CLN implementation has to block! It'd be interesting to implement and interceptor for CLN which can handle this situation:
1. Intercept HTLC A for federation A. Send over to gatewayd for processing
2. Intercept HTLC B for federation B. Send over to gatewayd for processing
3. Receive outcome of HTLC A from gatewayd and respond to CLN pay
4. Receive outcome of HTLC B from gatewayd and respond to CLN pay
Desirable properties of a good interceptor might include:
I guess we could just println the JSON-RPC response to CLN if we can get our hands on the JSON-RPC request ID. But blocking isn't a problem if we do it in a tokio task so it doesn't block anything else.
Trying this out here: https://github.com/fedimint/fedimint/pull/1224/commits
Filtering happens server side within the gateway--extension so each implementation does it's own mapping into our own RPC spec defined in this PR
Filtering could also happen in
gatewayd
, right? Not sure which is better, but strictly speaking it can happen in either one ...
Filtering within gateway-<lnrpc>-extension
allows gatewayd
to be dumb about the HTLCs type definition, which vary per LN implementation. Additionally, since we already map HTLCs into some Fedimint gateway known types, we might as well pre-filter within the interceptor and only send relevant HTLCs to gatewayd
for processing, while immediately continuing [or canceling] irrelevant HTLCs
Filtering within gateway-
-extension allows gatewayd to be dumb about the HTLCs type definition, which vary per LN implementation
You could just have the subscription event include the short channel id. Wouldn't need to know anything else AFAIK.
If we used the same proto definitions as LND (and had the gateway filter HTLCs) would there be any need for the LND extension? Couldn't gatewayd
just connect directly to LND?
But having LND define this interface this might not work for very long. For instance, the extension will eventually need a subscription to intercept p2p messages requesting a bolt12 invoice from offer. I imagine CLN will support such a hook much sooner than LND will ...
Having a service run between LND and gatewayd seems like a lot of hassle (additional config) and another point of failure. It's bad enough we already need it for CLN.
So what if we had a trait in gatewayd that specifies the interface we need and then implement that trait both for the CLN protoc client and the LND protoc client? That way we'd avoid the additional service in the LND case. That rust trait impl could still do filtering etc.
Filtering within gateway--extension allows gatewayd to be dumb about the HTLCs type definition, which vary per LN implementation
You could just have the subscription event include the short channel id. Wouldn't need to know anything else AFAIK.
For CLN, you'd get HtlcAccepted
data on intercept, append short short channel id to this, and then send it over network to gatewayd. For LND, you'd get ForwardHtlcInterceptResponse
data on intercept, append short short channel id to this, and then send it over network to gatewayd.
HtlcAccepted
or ForwardHtlcInterceptResponse
types over network wire to gatewayd. Doesn't scale well across LN implementations.gatewayd
to understand HtlcAccepted
or ForwardHtlcInterceptResponse
or other type X defined by a specific node implementationsettle
, cancel
or continue
as the outcome of the round tripPre-filtering HTLCs within the gateway-<lnrpc>-extension
service fixes this :)
But having LND define this interface this might not work for very long. For instance, the extension will eventually need a subscription to intercept p2p messages requesting a bolt12 invoice from offer. I imagine CLN will support such a hook much sooner than LND will ...
Yeah it's best we as Fedimint own the rpc spec. We then should aim to stabilize it so any great implementation of the gateway-<lnrpc>-extension
can be relatively stable and evenually take on a slower upgrade cadence to gatewayd and the rest of Fedimint.
gateway-<lnrpc>-extension
Example, a node runner can opt to deploy their gateway-<lnrpc>-extension
implementation with extremely efficient htlc intercept and filter on behalf of federaions. Eclair experts can implement gateway-eclair-extension
independent of our work at improving gatewayd
, the contracts, features and security or providing gateway services
Having a service run between LND and gatewayd seems like a lot of hassle (additional config) and another point of failure.
Is this not the regular pain of microservice architecture? Except in this case we get tremendous benefits from having gatewayd
and gateway-<lnrpc>-extension
be two independent, interoperating microservices
It's bad enough we already need it for CLN.
I don't see any 'additional badness' for LND other than the pain we were already going to meet even with . The architecture of LND requires a service running parallel and outside of lnd
binary process.
So what if we had a trait in gatewayd that specifies the interface we need and then implement that trait both for the CLN protoc client and the LND protoc client?...
Not sure I understand this first part of the new suggestion
That way we'd avoid the additional service in the LND case. That rust trait impl could still do filtering etc.
and so the second part is lost to me as well...
I don't see any 'additional badness' for LND other than the pain we were already going to meet even with . The architecture of LND requires a service running parallel and outside of lnd binary process.
LND requires us to run 1 additional process. We're planning to run 2 additional processes (gatewayd and extension).
Not sure I understand this first part of the new suggestion
The suggestion is to have gatewayd
do direct RPC with CLN and LND. To accomplish this, have a trait (like our current LnRpc
) which we can impl for CLN and LND. Instead of doing it over the network.
I think I might have missed some context. Is the main purpose of this new extension process to filter HTLCs before forwarding to the gateway? And this is required for both CLN and LND right?
More generally, maybe we should
I don't see any 'additional badness' for LND other than the pain we were already going to meet even with . The architecture of LND requires a service running parallel and outside of lnd binary process.
LND requires us to run 1 additional process. We're planning to run 2 additional processes (gatewayd and extension).
Not sure I understand this first part of the new suggestion
The suggestion is to have
gatewayd
do direct RPC with CLN and LND. To accomplish this, have a trait (like our currentLnRpc
) which we can impl for CLN and LND. Instead of doing it over the network.
Okay, this is the design approach we were on before #1103 . I would not recommend designing LND intergration any different from CLN integration pattern
Formally, here are the two designs we have:
gatewayd
with embedded LnRpc
trait calling into a node
LnRpc
trait implementation for each LN implementation supportedgatewayd
to embed an appropriate LnRpc
trait implementationgatewayd
is the extension to a node.gatewayd
with unified rpc client calling into a gateway-<lnrpc>-extension
gateway-<lnrpc>-extension
for each LN implementation supportedgateway-<lnrpc>-extension
is the extension to a node.gatewayd
is a standalone service that calls into any such gateway-<lnrpc>-extension
I think I might have missed some context. Is the main purpose of this new extension process to filter HTLCs before forwarding to the gateway? And this is required for both CLN and LND right?
The main purpose is to make it so that the gateway is agnostic to the specific flavor on RPC it is speaking to the lightning node (e.g. which node implementation, which version of that implementation). The interface between the gateway and the extension would be defined by us, and all node implementation details would live in the extension. Filtering HTLCs in the extension microservice is an optimization. This approach would work without doing that.
The debate is really around where we translate operations fedimint gateway wants a lightning node to do -- pay invoice, get node pubkey, subscribe to HTLCs -- into lightning node-specific RPC requests -- JSON-RPC for CLN, gRPC for LND.
One option is to do it in a Rust trait in gatewayd
. It would look pretty similar to our existing LnRpc
trait, but would need a way to establish subscriptions.
The other option would be to do it in a microservice. This is more complex to deploy and makes the code a little harder to understand, but would have better separation of concerns (gatewayd
doesn't need to know anything about lightning node RPC interfaces) and more deployment flexibility.
We just did a call and decided to proceed with the microservice option.
Improve upon GatewayLightning
subscribe_intercept_htlcs
by defining a correspondingcomplete-htlc
client streaming rpc for responding to intercepted and processed htlcsFor every new federation connected, the gateway sends a
subscribe_intercept_htlcs
RPC to request for HTLC interception. This opens a stream over which intercepted HTLCs are sent to the client:For every HTLC intercepted and processed by the gateway:
complete_htlcs
RPC to send a request to settle the HTLC if HTLC processing was successfulcomplete_htlcs
RPC to send a request to cancel the HTLC if HTLC processing failed, timed-out or was rejected for some other reasonUse
complete_htlcs
RPC in batch mode