Closing a ledger channel too early silently fails

lalexgap commented 1 year ago

Currently if we attempt to start a DirectDefund objective but there are running guarantees in the ledger channel we return an error.

However we swallow that error in the engine. This means if a someone calls client.CloseLedgerChannel too early the objective silently fails. To the caller of client.CloseLedgerChannel it looks like the API succeeded (since the API returns an objective id) but in reality the objective doesn't even get created!

lalexgap commented 1 year ago

This is similar to https://github.com/statechannels/go-nitro/issues/744 where we receive a message to start a new objective before we're ready. In this particular case we also swallowed the error so it wasn't apparent that the objective had failed.

geoknee commented 1 year ago

What do you think is the correct behaviour of go-nitro when the "user" requests to directly defund a "non-empty" ledger channel?

Here's one suggestion:

Option 1

Add a new waiting point to the start of directdefund called WaitingForCleanOutcome or WaitingForEmptyOutcome etc.
Do not return an error when constructing a direct defund objective under this condition. Instead, construct it and let it just hit that first pause point.
Ensure no new guarantees are added into the ledger channel (perhaps install a policy to reject any requests to do so)
Ensure that when the events arrive which will cause the ledger channel to empty out, the directdefund objective is cranked so that it can make progress.

This way, we can still detect inactivity on the directdefund objective and use that to trigger some on-chain recovery operations when we support that.

Option 2

Stick with the existing behaviour but surface the error instead of swallowing it. Then the "user" needs to keep trying until they successfully spawn the objective.

This means more boilerplate for the consuming application to write, and I think this option is probably just avoiding the bigger problem.

lalexgap commented 1 year ago

What do you think is the correct behaviour of go-nitro when the "user" requests to directly defund a "non-empty" ledger channel?

For the short term (and maybe medium term?) solution I think the client reporting an error is sufficient enough. I think the client would only run into this if they're attempting to close a ledger channel before they're done using it so it's easy to avoid. In the testground test we weren't properly waiting for virtual defund objectives so that explains why we ran into this.

Longer term I think something like your option 1 makes sense. It would be nice to apply this to virtual defunding as well to address #744. My preference would be create a new issue to implement option 1 and tackle that later.

Ensure no new guarantees are added into the ledger channel (perhaps install a policy to reject any requests to do so)

I wonder the best way to tackle this. Maybe something similar to channel ownership we already do in the store.

geoknee commented 1 year ago

Decision in standup today:

For now we will generate an error when this happens (we don't expect it often but we can't rule it out).

We will save the longer -term solutions for a future time.

statechannels / go-nitro

Closing a ledger channel too early silently fails #883

Option 1