Open kernelkind opened 2 months ago
We should be able to override this. We sometimes needed to leave+rejoin as a quick fix to some bug. Since we keep the client directory around there is little risk for loss of funds.
Before merging this we should make sure that gateway-cli leave-federation
is updated to support this change. Might be good to have coverage for this command in devimint tests.
Might be good to have coverage for this command in devimint tests.
It would be good to also add a rust test for this.
Another idea: what happens if gateway only has dust? I think we should still be able to leave, otherwise they'd need to deposit and withdraw to get back to zero balance?
I've got a bunch of changes to add for this MR including a devimint test and backend test, but I've got a question before I push any changes.
I see that if I were to throw an error in handle_leave_federation
, the client has no way of receiving it, since the router panics on error. I need to somehow let the client know that the request to leave the federation was denied (because the client didn't use the --forced
flag & has remaining funds).
There are a couple ways I'm thinking of making changes. Here's the first:
async fn leave_fed(
Extension(mut gateway): Extension<Gateway>,
Json(payload): Json<LeaveFedPayload>,
) -> Result<impl IntoResponse, GatewayError> {
let fed = gateway.handle_leave_federation(payload).await;
Ok(Json(json!(fed.ok())))
}
Return an Option<FederationInfo>
instead of returning FederationInfo
. This way, if the client sees that the result is None
, it knows the federation didn't leave successfully. This isn't a great implementation in my opinion, the client doesn't see what error was returned.
Another, could be:
In gateway/ln-gateway/src/rpc/rpc_server.rs
:
async fn leave_fed(
Extension(mut gateway): Extension<Gateway>,
Json(payload): Json<LeaveFedPayload>,
) -> Result<impl IntoResponse, GatewayError> {
let fed = gateway.handle_leave_federation(payload).await;
let res = match fed {
Ok(fed) => FederationLeaveResult { info: Some(fed), error: None },
Err(e) => FederationLeaveResult { info: None, error: Some(e.to_string()) },
};
Ok(Json(json!(res)))
}
and in gateway/ln-gateway/src/rpc/mod.rs
:
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct FederationLeaveResult {
pub info: Option<crate::FederationInfo>,
pub error: Option<String>,
}
Let me know your thoughts on how the server can inform the client that leaving was unsuccessful.
I see that if I were to throw an error in handle_leave_federation, the client has no way of receiving it, since the router panics on error.
I'm confused, where are you seeing this? The gateway should not panic on any error, if it does, that's a bug.
GatewayError
is mapped to a StatusCode
in IntoResponse
impl IntoResponse for GatewayError {
fn into_response(self) -> Response {
// For privacy reasons, we do not return too many details about the failure of
// the request back to the client to prevent malicious clients from
// deducing state about the gateway/lightning node.
let (error_message, status_code) = match self {
GatewayError::OutgoingPaymentError(_) => (
"Error while paying lightning invoice. Outgoing contract will be refunded."
.to_string(),
StatusCode::BAD_REQUEST,
),
GatewayError::Disconnected => (
"The gateway is disconnected from the Lightning Node".to_string(),
StatusCode::NOT_FOUND,
),
_ => (
"An internal gateway error occurred".to_string(),
StatusCode::INTERNAL_SERVER_ERROR,
),
};
let mut err = Cow::<'static, str>::Owned(error_message).into_response();
*err.status_mut() = status_code;
err
}
}
You should be able to just return a GatewayError
and check for that on the client side.
I'm confused, where are you seeing this? The gateway should not panic on any error, if it does, that's a bug.
Ah I was mistaken. I saw this line: https://github.com/fedimint/fedimint/blob/348c4d3a7a56c2e89236383664d609e8f3ef3399/gateway/ln-gateway/src/rpc/rpc_server.rs#L70
And the implementation of the route method:
pub fn route(mut self, path: &str, method_router: MethodRouter<S, B>) -> Self {
panic_on_err!(self.path_router.route(path, method_router));
self
}
and assumed that was causing the panicking in my tests. I see now that's not the case
I can remove the scripts/tests/backend-leave-fed-test.sh
and scripts/tests/gateway-leave-fed-test.sh
, I just used them for easier testing
Another idea: what happens if gateway only has dust? I think we should still be able to leave, otherwise they'd need to deposit and withdraw to get back to zero balance?
In this situation, the gateway would run the command gateway-cli leave-fed --federation-id X --forced
I think you need to make the forced
flag optional in the CLI, otherwise the backwards compatibility tests will fail. Otherwise its looking good to me.
I made it optional, but the backwards compatibility test still fails. It's weird, why would it fail before supplying the --forced=true
flag? I would understand if it failed because the test was supplying the forced flag to an older version of the gateway-cli that didn't support it
Looks like the endpoint is returning null instead of FederationInfo
like expected.
00:00:11 Error: error decoding response body: invalid type: null, expected struct FederationInfo at line 1 column 4
00:00:11
00:00:11 Caused by:
00:00:11 invalid type: null, expected struct FederationInfo at line 1 column 4
The logs in CI only show the output of the devimint test, but not the logs of each daemon (like the gateway). I'd recommend running the test locally or trying to repro locally. Something is probably wrong in handle_leave_federation
A less elegant solution that I think wouldn't break the backwards compatibility tests is to revert the LeaveFed
command and create a new one called LeaveFedForced
. And LeaveFedPayload
would be reverted and a new payload LeaveFedForcedPayload
would also be created. Would this be acceptable?
Can you try to make force_leave
in LeaveFedPayload
an Option<bool>
instead of just bool
? I believe Deserialize
will default to None if no value is present in the serialized struct, which should fix the backwards compatibility issues. A value of None
can just be interpreted as false.
This method in v0.2.1
returns Result<()>
instead of Result<FederationInfo>
:
https://github.com/fedimint/fedimint/blob/a8422b84102ab5fc768307215d5b20d807143f27/gateway/ln-gateway/src/lib.rs#L874-L885
so in the test gateway_leave_fed_test FM: current CLI: current GW: v0.2.1
, the v0.2.1
rpc_server for gateway sends Ok(())
https://github.com/fedimint/fedimint/blob/a8422b84102ab5fc768307215d5b20d807143f27/gateway/ln-gateway/src/rpc/rpc_server.rs#L143-L149
But current rpc_client
leave_federation
expects a GatewayRpcResult<FederationInfo>
, not GatewayRpcResult<()>
:
So now the error makes sense:
00:00:15 invalid type: null, expected struct FederationInfo at line 1 column 4
How should I proceed?
@okjodom added versioning for the gateway API awhile back. @okjodom could you advise on how to modify leave_federation
so it is backwards compatible?
@okjodom added versioning for the gateway API awhile back. @okjodom could you advise on how to modify
leave_federation
so it is backwards compatible?
I suppose we can cherry pick https://github.com/fedimint/fedimint/pull/4275/commits/4aa64957d9faf0fe204f7d36cd3f44a0d4151284 to v0.2.1
so the whole test matrix is compatible. Otherwise we should only consider v0.2.2
and later, in the tests covering this api
[EDIT] adding more context, the commit above breaks leave_federation
API in order to test it. The tests are independent of the broader changes in the PR this commit was part of. Ref: https://github.com/fedimint/fedimint/pull/4275
If we wanted to use v0.2.2
it looks like 4aa6495 would need to be chery-picked here as well since it doesn't have the Result<FederationInfo>
return type for handle_leave_federation
etc. It doesn't matter to me which we use, but it would be nice to get this PR across the finish line at some point
If we wanted to use
v0.2.2
it looks like 4aa6495 would need to be chery-picked here as well
Not worth it, let's only run this for current and future versions, it's a new feature.
Turns out I should have read the backlog, but I think my fix should get this unblocked.
@kernelkind I don't think the test failure is due to your code, we just didn't test leaving the federation at all earlier and apparently broke compatibility in #4275: before the PR the API returned ()
while after the PR it returned FederationInfo
. Luckily, just decoding it as Option<T>
should solve the problem.
When a gateway sends the command to leave a federation without the forced flag, it will receive an error
GatewayError::UnexpectedState
. When a gateway sends the command to leave the federation with the forced flag, it will be able to leave and no error will be returned.Closes: https://github.com/fedimint/fedimint/issues/3935