ExchangeUnion / xud

Exchange Union Daemon 🔁 ⚡️
https://exchangeunion.com
GNU Affero General Public License v3.0
115 stars 44 forks source link

ETH funds aren't recoverable after connext client crash #1728

Closed LePremierHomme closed 4 years ago

LePremierHomme commented 4 years ago

Background

In a normal swap scenario, once the taker claims his incoming payment and release the preimage, the maker can use it to claim his incoming payment as well. When simulating a situation where the maker's connext client crashes right before he claim his incoming payment, once re-started and got connected again, it seems that the funds aren't available, both on /hashlock-resolve (used by xud) and /hashlock-status endpoints.

xud

09/07/2020 15:50:11.873 [CONNEXT] trace: sending request to /hashlock-resolve: {"assetId":"0x0000000000000000000000000000000000000000","preImage":"0x22b29b56d5418921ae18bd7d7d8abc7a00c3f96d70cce889d50d0a60558ffa4c"}
09/07/2020 15:50:11.902 [CONNEXT] error: connext server error 500: Internal Server Error
09/07/2020 15:50:11.903 [SWAPS-RECOVERY] error: could not settle ETH invoice for payment e2fc35e7ab970b7bddba3c3f7a64a7fc22930dc7da526324c254cab0f2f04e4d: 8.5 - connext server error 500: Internal Server Error
09/07/2020 15:50:11.903 [SWAPS-RECOVERY] alert: incoming ETH payment with hash e2fc35e7ab970b7bddba3c3f7a64a7fc22930dc7da526324c254cab0f2f04e4d could not be settled with preimage 22b29b56d5418921ae18bd7d7d8abc7a00c3f96d70cce889d50d0a60558ffa4c, **funds may be lost and this must be investigated manually**

connext client

[1594309811901] ERROR (5183 on 6ea29479327a): Receiver app has not been installed
    Error: Receiver app has not been installed
        at ResolveTransferController.<anonymous> (/connext-vol/rest-api-client/node_modules/@connext/client/dist/controllers/ResolveTransferController.js:50:31)
        at Generator.next (<anonymous>)
        at fulfilled (/connext-vol/rest-api-client/node_modules/@connext/client/dist/controllers/ResolveTransferController.js:5:58)
[1594309811902] INFO  (5183 on 6ea29479327a): request completed
    reqId: 31
    url: "/hashlock-resolve"
    statusCode: 500

Steps to reproduce

testMakerConnextClientCrashedBeforeSettlement test is implemented in the connext_instability branch. All other instability tests are disabled.

To run locally:

$ npm run test:sim:clean:xud
$ npm run test:sim:clean:custom-xud
$ npm run test:sim:clean:connext
$ npm run test:sim:build:xud
$ npm run test:sim:build:connext
$ npm run test:sim:build:test
$ npm run test:sim:run:instability

Latest CI run:

https://github.com/ExchangeUnion/xud/runs/855419939

Expected behaviour

ETH funds should be recovered.

Actual behaviour

Connext client doesn't recognize the swap hash.

ghost commented 4 years ago

@moshababo debugging this further it looks like Receiver app has not been installed error is working as intended. Somehow the connext client that got the receiver app installed is no the same client we're trying to unlock/resolve the payment from.

When running this locally I'm getting 2 log files for connext-client-Alice which seem to be initialized with a different seed. Notice the difference in the client's public key (indra7o39EX... vs indra7ygY4K...)

connext-client-Alice-ABC

2020-07-14T17:53:40.142Z [NodeApiClient] Node responded to indra7o39EXpKMiuVFQcvMkU3C6JGwZSK9omg12xHUXDPW7xPLxBHSV.indra8AXWmo3dFpK1drnjeWPyi9KTy9Fy3SkCydWx8waQrxhnW4KPmR.channel.get request in 55 ms

connext-client-Alice-CBA

2020-07-14T17:54:16.522Z [NodeApiClient] Node responded to indra7ygY4KgTSKGKePdR7XUH2SaCvNxto2uX8ydrWmb5EZn2yA8nx5.indra8AXWmo3dFpK1drnjeWPyi9KTy9Fy3SkCydWx8waQrxhnW4KPmR.client.check-in request in 7 ms
LePremierHomme commented 4 years ago

What's the mechanism for the seed initialization? Isn't it via xud calls (/connect, /mnemonic)?

ghost commented 4 years ago

What's the mechanism for the seed initialization? Isn't it via xud calls (/connect, /mnemonic)?

That's correct. It is /mnemonic -> /connect. Perhaps we can print out the seed being sent to the rest-api-client in the tests?

ghost commented 4 years ago

Looking at both Alice's xud.logs (before and after crash) also indicates that connext client is initialized with a different seed.

14/07/2020 18:43:50.824 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"become phrase soup chief caught cart card desk sibling excite width code vacant diesel seek mango aunt enjoy crane fragile quote bronze above next"}

and

14/07/2020 18:43:55.082 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"repair sound lawsuit recall include actual supreme dad direct issue account glad total error syrup moral lady jazz exit sister spend fragile grid verb"}
ghost commented 4 years ago

The code that derives connext seed from xud's private key is here: https://github.com/ExchangeUnion/xud/blob/master/lib/nodekey/NodeKey.ts#L125

LePremierHomme commented 4 years ago

Looking at both Alice's xud.logs (before and after crash) also indicates that connext client is initialized with a different seed.

Are you sure about this? Looking in both my original local run, and on latest CI run, it looks like the same seed is being used.

From CI:

09/07/2020 20:11:28.988 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"wide lawsuit battle voice pave try camera lab radar priority school debate license private wrestle plug grit genuine elephant asthma six impulse dish aerobic"}
.
.
09/07/2020 20:11:48.075 [SWAPS] info: CUSTOM_SCENARIO: INSTABILITY::MAKER_CLIENT_CRASHED_BEFORE_SETTLE
09/07/2020 20:11:48.076 [SWAPS] info: CLIENT_TYPE ConnextClient, CLIENT_PID: 3042
.
.
09/07/2020 20:11:49.080 [SWAPS] error: could not settle invoice for deal 6b3784611518b6ebc1430f1c9a1354460fc4005d3a9908552da3bff10b8a48c9: Error: connect ECONNREFUSED 0.0.0.0:45671
09/07/2020 20:11:49.080 [SWAPS] alert: incoming ETH payment with hash 6b3784611518b6ebc1430f1c9a1354460fc4005d3a9908552da3bff10b8a48c9 could not be settled with preimage 90595150be5c7423db71bd881fd9a1b2147d1ec236afc6333f195bec883bdb6f, this is not expected and funds may be at risk
.
.
09/07/2020 20:11:58.989 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"wide lawsuit battle voice pave try camera lab radar priority school debate license private wrestle plug grit genuine elephant asthma six impulse dish aerobic"}
.
.
09/07/2020 20:12:03.988 [SWAPS-RECOVERY] alert: incoming ETH payment with hash 6b3784611518b6ebc1430f1c9a1354460fc4005d3a9908552da3bff10b8a48c9 could not be settled with preimage 90595150be5c7423db71bd881fd9a1b2147d1ec236afc6333f195bec883bdb6f, **funds may be lost and this must be investigated manually**

And even if there's a mismatch in the seed, isn't that a problem with xud behaviour?

ghost commented 4 years ago

Looking at both Alice's xud.logs (before and after crash) also indicates that connext client is initialized with a different seed.

Are you sure about this? Looking in both my original local run, and on latest CI run, it looks like the same seed is being used.

From CI:

09/07/2020 20:11:28.988 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"wide lawsuit battle voice pave try camera lab radar priority school debate license private wrestle plug grit genuine elephant asthma six impulse dish aerobic"}
.
.
09/07/2020 20:11:48.075 [SWAPS] info: CUSTOM_SCENARIO: INSTABILITY::MAKER_CLIENT_CRASHED_BEFORE_SETTLE
09/07/2020 20:11:48.076 [SWAPS] info: CLIENT_TYPE ConnextClient, CLIENT_PID: 3042
.
.
09/07/2020 20:11:49.080 [SWAPS] error: could not settle invoice for deal 6b3784611518b6ebc1430f1c9a1354460fc4005d3a9908552da3bff10b8a48c9: Error: connect ECONNREFUSED 0.0.0.0:45671
09/07/2020 20:11:49.080 [SWAPS] alert: incoming ETH payment with hash 6b3784611518b6ebc1430f1c9a1354460fc4005d3a9908552da3bff10b8a48c9 could not be settled with preimage 90595150be5c7423db71bd881fd9a1b2147d1ec236afc6333f195bec883bdb6f, this is not expected and funds may be at risk
.
.
09/07/2020 20:11:58.989 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"wide lawsuit battle voice pave try camera lab radar priority school debate license private wrestle plug grit genuine elephant asthma six impulse dish aerobic"}
.
.
09/07/2020 20:12:03.988 [SWAPS-RECOVERY] alert: incoming ETH payment with hash 6b3784611518b6ebc1430f1c9a1354460fc4005d3a9908552da3bff10b8a48c9 could not be settled with preimage 90595150be5c7423db71bd881fd9a1b2147d1ec236afc6333f195bec883bdb6f, **funds may be lost and this must be investigated manually**

And even if there's a mismatch in the seed, isn't that a problem with xud behaviour?

Hmm, you're right. The seed from xud seems correct on CI. However, even on CI the connext client logs indicate different indra pubkey when installing the receiver app vs when trying to resolve the conditional transfer. This is super confusing. I'll take another look soon.

ghost commented 4 years ago

I rebased connext_instability branch against master. Latest run: https://github.com/ExchangeUnion/xud/runs/872498799

All of the xud runs for Alice had a different private key (please note that not all of them are related to the instability test):

2020-07-15T08:24:48.5513085Z xud-Alice-1594801346.log
2020-07-15T08:24:48.5559742Z 15/07/2020 08:23:20.926 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"media festival staff insect drop spider pause universe improve domain mutual unfair labor truth seek hire dawn mass analyst rail alien tree stadium blue"}

2020-07-15T08:36:22.4933552Z xud-Alice-1594801706.log
2020-07-15T08:36:22.4978738Z 15/07/2020 08:29:09.024 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"december blanket volume scare deliver aim verb napkin tragic pave pioneer arrive dune mango ice angle vacant grape wonder payment avocado mutual nation version"}

2020-07-15T08:36:22.5375936Z xud-Alice-1594802059.log
2020-07-15T08:36:22.5439524Z 15/07/2020 08:34:20.328 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"fashion trial glance afraid silk name benefit loud antique state reason volume mountain drop error snake clarify engage thrive orphan olive outer cart scheme"}

2020-07-15T08:36:22.5537966Z xud-Alice-1594801569.log
2020-07-15T08:36:22.5570972Z 15/07/2020 08:27:00.537 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"goose satoshi dumb fun eight globe deer asset raise twin when domain brand poem spin small slight online brand outer guide mix winner maple"}

2020-07-15T08:36:22.5601336Z xud-Alice-1594801879.log
2020-07-15T08:36:22.5635509Z 15/07/2020 08:31:20.262 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"measure render frozen evidence true parade have six frog food oblige maple dove decorate lizard seat pitch afford ridge ivory mammal stay inspire chapter"}

2020-07-15T08:36:22.5774951Z xud-Alice-1594801651.log
2020-07-15T08:36:22.5807057Z 15/07/2020 08:27:32.776 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"coil divide obvious answer maple predict addict pride magnet ozone obey advance mass ability fall ordinary giraffe junk ankle company lounge exist rebuild cool"}

2020-07-15T08:36:22.6003372Z xud-Alice-1594801698.log
2020-07-15T08:36:22.6034928Z 15/07/2020 08:28:20.206 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"alley enough rally early wife empty boil traffic empty tennis attract spray ethics despair spread like traffic soap sock caution edit confirm act strong"}

2020-07-15T08:36:22.6644918Z xud-Alice-1594801754.log
2020-07-15T08:36:22.6674273Z 15/07/2020 08:29:15.228 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"weekend warfare obscure million evolve hat vacant weasel coast notable spot thing custom sound hurt unveil exotic fetch life please siege people goat flag"}

2020-07-15T08:36:22.6746530Z xud-Alice-1594801633.log
2020-07-15T08:36:22.6776547Z 15/07/2020 08:27:15.033 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"brother captain depend armor hero oxygen father bless giggle task sausage sample fly blue special bulb ship actual any candy frog fix farm basic"}

2020-07-15T08:39:52.2768188Z xud-Alice-1594802260.log
2020-07-15T08:39:52.2817649Z 15/07/2020 08:38:24.993 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"unusual quarter also evil deputy magic walk arrow matrix mind attitude happy lonely raw pill quarter indoor arctic coral blood duck pass impact start"}

2020-07-15T08:39:52.3177703Z xud-Alice-1594802311.log
2020-07-15T08:39:52.3210686Z 15/07/2020 08:38:32.524 [CONNEXT] trace: sending request to /mnemonic: {"mnemonic":"wisdom warrior swarm equip crowd blush square swift sort region scare bullet cinnamon deal talent eight pond amused margin napkin document own basket prevent"}

All of the xud logs started with:

2020-07-15T08:39:52.3178582Z 15/07/2020 08:38:32.067 [RPC] debug: Could not load gRPC TLS certificate. Generating new one

which also suggests that the private key was also created. How is the data directory of xuds handled within the test suite?

LePremierHomme commented 4 years ago

On every network launch (btcd, ltcd, lnd-btc nodes, lnd-ltc nodes, connect-client nodes, xud nodes), each node instance gets it's own datadir, which get deleted on the network teardown. This happens also when we're switching xud with the custom-xud version. So what you described is the expected behaviour.

In the case of the test scenario, the xud node instance lives throughout the connext-client crash/re-start, so it's expected to have the same private key and to use the same mnemonic.

LePremierHomme commented 4 years ago

The problem is due to the re-init of the same connext client with different seeds during its lifetime, even if the secondary init happened before the HTLC was activated. It's currently not supported by connext client.