stacks-network / stacks-core

The Stacks blockchain implementation
https://docs.stacks.co
GNU General Public License v3.0
3.01k stars 664 forks source link

Mempool transaction propagation leads to slow-to-confirm transactions #2365

Closed kantai closed 3 years ago

kantai commented 3 years ago

Users and integrators have been reporting long-pending transactions (several hours to confirm) -- this may be a symptom of the lack of #2193, but it may be something else. More investigation is required to figure out the underlying issues, but the effect here is pretty clear: transactions (for seemingly random reasons) spend a long time in the "pending" state.

jcnelson commented 3 years ago

Do you have txids for such transactions? I can see if they made it to the Foundation's follower or miner mempools.

kantai commented 3 years ago

Not specifically, but you can see what the explorer thinks are pending transactions here:

https://stacks-node-api.stacks.co/extended/v1/tx/mempool?limit=200&offset=200

Which is sorted by arrival time

jcnelson commented 3 years ago

I checked the first five of those "pending" transactions in seed-0's logs. They were all rejected by the mempool. Errors reproduced below:

6a566ac26dbe0d7997247fef9f34f5f69f614d7023659d1db6dd2d7edbec2f8e:

520145218dceb11554de6dc3acea29e08079bb4e123e40c56b7f47b26a4faee8:

9165ba0511d737e29a4d8006f1c6dbff98cbb12376d9276742f8b08c3964816c:

93f4bd66d6505ddf35ca270e897be00150cf83ceea795bedc629ebfb269c2c5d:

f0fa682af7742c44e7ba2c78045ecf230985d6591a8b89a6aab3bc55504d4d59:

diwakergupta commented 3 years ago

Do you have txids for such transactions? I can see if they made it to the Foundation's follower or miner mempools.

@jcnelson here are two I could find:

0x8e4ac474364883f6fe5d4eadb720b4e5030bc1dd295ffb87fb7d79cdbbd9b169 from https://github.com/blockstack/explorer/issues/338. I can see in seed-0's logs that it was received and relayed. I could not find a corresponding rejection message. What does the "Did not handle" indicate?

convo:id=262,outbound=false,peer=18000000+00000001://161.97.89.169:36636: Received message Transaction(8e4ac474364883f6fe5d4eadb720b4e5030bc1dd295ffb87fb7d79cdbbd9b169), relayed by [RelayData { peer: 35fbbfdd2cf84725c61ccb41df271f8515d325b3://0.0.0.0:20444, seq: 0 }, RelayData { peer: 144ff0394aa650a15ac4a15c5f779099c8f28672://0.0.0.0:20444, seq: 0 }]

convo:id=262,outbound=false,peer=18000000+00000001://161.97.89.169:36636: Did not handle message (type Transaction(8e4ac474364883f6fe5d4eadb720b4e5030bc1dd295ffb87fb7d79cdbbd9b169) seq 191907726); passing upstream

0x95a2ae0051193b93dc08a984422c4ededd61c51124ef76dbbe3cf4bfbb693d75 from this forum post. Similar pattern -- no rejection in mempool message, did not handle and relayed:

local.1://(bind=35.199.6.155:20444)(pub=Some(172.25.9.48:20444)): Will forward message Transaction(95a2ae0051193b93dc08a984422c4ededd61c51124ef76dbbe3cf4bfbb693d75) from 18000000+00000001://47.90.29.31:36712 to relayer

convo:id=108,outbound=false,peer=18000000+00000001://47.90.29.31:36712: Did not handle message (type Transaction(95a2ae0051193b93dc08a984422c4ededd61c51124ef76dbbe3cf4bfbb693d75) seq 1143833861); passing upstream

convo:id=108,outbound=false,peer=18000000+00000001://47.90.29.31:36712: Received message Transaction(95a2ae0051193b93dc08a984422c4ededd61c51124ef76dbbe3cf4bfbb693d75), relayed by [RelayData { peer: 35fbbfdd2cf84725c61ccb41df271f8515d325b3://0.0.0.0:20444, seq: 0 }]
jcnelson commented 3 years ago

Okay, so Did not handle message ... only indicates that the ConversationP2P struct could not automatically reply to the incoming message, so it needs to pass it to the global P2P state-machine for subsequent processing. Normally, the ConversationP2P struct will handle ancillary responsibilities like handling Ping and Nack messages, and will pass data-plane messages "upstream" to the P2P state machine. The P2P state machine will include these data-plane message in its NetworkResult struct, which in turn will be passed to the relayer thread where the transactions will be stored and queued for broadcasting.

Looking at 8e4ac474364883f6fe5d4eadb720b4e5030bc1dd295ffb87fb7d79cdbbd9b169, I see a lot of the following messages in the logs:

Will forward message Transaction(8e4ac474364883f6fe5d4eadb720b4e5030bc1dd295ffb87fb7d79cdbbd9b169) from 18000000+00000001://24.193.118.186:51377 to relayer

I see this message at least a dozen times, for a dozen different IP addresses. I also see this transaction getting re-relayed -- I see several messages indicating that the node already has the transaction. So, there's a good chance this transaction is widely-replicated. In fact, I found the raw transaction in the Foundation's public follower node:

00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003700000000000000C80101C1484BEF1B6B3357F3999D322876201D81BA473F3BD4436CD5F7D0460319F5AE59C77047094C2F50D837A0E85FAE9B91677BB4632AEAC2CAEAFD34E163BF536F0302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD2700000000005E69AD831383637373731363235000000000000000000000000000000000000000000000000

It decodes to:

{"version":"Mainnet","chain_id":1,"auth":{"Standard":{"Singlesig":{"hash_mode":"P2PKH","signer":"e3d94a92b80d0aabe8ef1b50de84449cd61ad637","nonce":55,"fee_rate":200,"key_encoding":"Uncompressed","signature":"01c1484bef1b6b3357f3999d322876201d81ba473f3bd4436cd5f7d0460319f5ae59c77047094c2f50d837a0e85fae9b91677bb4632aeac2caeafd34e163bf536f"}}},"anchor_mode":"Any","post_condition_mode":"Deny","post_conditions":[],"payload":{"TokenTransfer":[{"Standard":[22,[58,141,0,206,67,229,113,141,157,71,183,132,177,134,106,91,157,60,210,112]]},98999000,"31383637373731363235000000000000000000000000000000000000000000000000"]}}

Extracting its public key hash (e3d94a92b80d0aabe8ef1b50de84449cd61ad637) and signing mode (P2PKH), the sender's address is SP3HXJJMJQ06GNAZ8XWDN1QM48JEDC6PP6W3YZPZJ, and its nonce is 55. I got this via the following:

$ node
> c32 = require('c32check')
> c32.c32address(22, "e3d94a92b80d0aabe8ef1b50de84449cd61ad637")
'SP3HXJJMJQ06GNAZ8XWDN1QM48JEDC6PP6W3YZPZJ'

(Note that 22 is the version byte for single-signature (i.e. P2PKH) mainnet addresses).

Looking up this address in the system, we have:

$ curl http://seed-0.mainnet.stacks.co:20443/v2/accounts/SP3HXJJMJQ06GNAZ8XWDN1QM48JEDC6PP6W3YZPZJ?proof=0
{"balance":"0x0000000000000000000004eb67926931","locked":"0x00000000000000000000000000000000","unlock_height":0,"nonce":48}

As you can see, the last transaction that got confirmed from this address had nonce 48. Something's fishy here -- where are transactions 49 through 54?

I notice that your other transaction -- 95a2ae0051193b93dc08a984422c4ededd61c51124ef76dbbe3cf4bfbb693d75 -- is also from SP3HXJJMJQ06GNAZ8XWDN1QM48JEDC6PP6W3YZPZJ, but with nonce 67 (?!?).

Looking again at the mempool, I see that the Foundation follower does indeed have all transactions from SP3HXJJMJQ06GNAZ8XWDN1QM48JEDC6PP6W3YZPZJ from nonces 49 through 68:

$ sqlite3 /path/to/mempool.db 'select origin_nonce,hex(tx) from mempool where origin_address = "SP3HXJJMJQ06GNAZ8XWDN1QM48JEDC6PP6W3YZPZJ" and origin_nonce > 48'
49|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003100000000000000C80101C3243BA61D4600FD1322B9883E79809847C94D983551D170E5EFFB009D81FB497E87AD1D4EE2627746AC6B575B2AF31A81D3C7B852F76E1ED4D0975164BCA6BF0302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD270000000000609651831383637363635313832000000000000000000000000000000000000000000000000
50|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003200000000000000C80100F480A2129F3BEA6E7526E74A0CE562BBAEE58D042147EE6944F200D4E316432E566EF12FF12F5263A6D5E54213835D9CBB316A5DB6492EF61ED2086DC65F7F420302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD2700000000001AF1B4831383637373632353936000000000000000000000000000000000000000000000000
51|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003300000000000000C801009759D4B268871E5EFD7B4DD8299AF5F9B0D333E23199490E1CD41F523373BFBC136E6BFFFBEB3E98D84C25C039789DEAF863E0CBB654E59AA5E724FEEDBF4C4803020000000000051678B1597796C17F3F6257B63CCF72287D203D9D790000000054C1CCB84F4B636F696E00000000000000000000000000000000000000000000000000000000
52|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003400000000000000C801007034FF4D833D3509C4C7ECC380C5D6C54D6ABB89F1DAECB04AF4914A8B8608296859831B8A750737EB2DE318DE3F8BCDA5F10A0915203FB1AEBE149759B754A70302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD2700000000011294F8031383636303337323831000000000000000000000000000000000000000000000000
53|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003500000000000000C80101F7C78193A9E6419F93BD3D48AE04028E078F4B3D0309235A2554010ECA73391C7914F92557E9A9D88AD93847610E5473C7F138E0949967B3134EC933BE9838380302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD27000000000038440D831383637363537393733000000000000000000000000000000000000000000000000
54|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003600000000000000C801014439F7D8C9FBDD8D485BC79CFF73F8957DA806318DA2DD593D14F4B1A23B6EEB75B80B7C6E61DEAA048CD97589F962EBD9FADC6E3C3B2E700E98C46BF44739AA0302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD2700000000005D7589831383637373632353936000000000000000000000000000000000000000000000000
55|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003700000000000000C80101C1484BEF1B6B3357F3999D322876201D81BA473F3BD4436CD5F7D0460319F5AE59C77047094C2F50D837A0E85FAE9B91677BB4632AEAC2CAEAFD34E163BF536F0302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD2700000000005E69AD831383637373731363235000000000000000000000000000000000000000000000000
56|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003800000000000000C80101504DC31D967C3BA3078F837FCB7D8388B067A0137C859C479C55FCF4E58B307364115F5E914C1AC4C1310E742604126DF8D2FA871EAFD57CB7DD355524F6B1B60302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD270000000003FED287831383636343735323533000000000000000000000000000000000000000000000000
57|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003900000000000000C801007B184EBBF738AB13F116707A7302D5FDFEF42305F19CE48E81E94E6C5687044D0C83586441D51E8BD2BA5A1CF183ED399B486C236D870BCB448593488C1706110302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD27000000000004C204831383636383631353030000000000000000000000000000000000000000000000000
58|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003A00000000000000C801000EA749DCBD037F3F39AF78367DBC05A1A4ADC3D3E34C6816A56143048E8C9D2C0771B1994C866871E7482BF2416D20FB8B56C2EF8381736792660EB3B1D105320302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD270000000000BEDCD7031383637373330303833000000000000000000000000000000000000000000000000
59|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003B00000000000000C80100B6526939699821A4C8D2663770F807B1E135D34E08CD107013D5EAAF99A364747707A68F6A671356E71438538D111C17E9F7F5A9A5A1063846E3D62914C0D59D030200000000000516408C6559B57C60ED86EE7B1EEEF969F26624562600000002540BE40054414D43470000000000000000000000000000000000000000000000000000000000
60|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003C00000000000000C801001BAF42505EC8B550421B25C8647901F470FCEB8F5C91ED8EB905C8697BEA29C41DB16D5948B3DD7C3E138E20B5E1A20EF64D00A80CDCFAE36E789A51F866964F0302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD27000000000068E601031383637363433343934000000000000000000000000000000000000000000000000
61|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003D00000000000000C80101A7C3B8D99755675C9543D529D7C261F93A76B5E0E84797F12B39E99089AA26201347E313A6BEF153524856C1185DC1F889F989A98BBF869D2B6E96B02F19DCDD030200000000000516408C6559B57C60ED86EE7B1EEEF969F26624562600000002540BE40054414D43470000000000000000000000000000000000000000000000000000000000
62|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003E00000000000000C801009DFCFA2E80FDE8FE36AFBE675667ACE1D764C43E2DC2AF76B3D2B296C7BBA7266223927BA1D70C7506917395793DDF94DBB0790B1173BF682B78A0FB2A346C490302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD27000000000063084B831383637373639333130000000000000000000000000000000000000000000000000
63|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000003F00000000000000C80101BA24F7909B9576DF93F0ABC6D0509C69895ED6FFD7363335F848AD684E61778E17774E5D53E24668D9A3209232BAED3F9F49517C07A9B18C50C90C56E28A744D0302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD27000000000036FF58831383637363737353630000000000000000000000000000000000000000000000000
64|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000004000000000000000C80100732D24EBFC65BAC975B82211041B55C66A1C39FEFD7A70C735574C362F8623CA3E04CD56E9E5ED7309C35DE56787A84D372DA362C2C9FFB5E5BFE6DECDBBF1E10302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD2700000000005DF6B0031383637373735313336000000000000000000000000000000000000000000000000
65|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000004100000000000000C8010010A2C30B4B4ECE617F16523B8D6616ED3DF00E5FE80B77446799AADB289A691B3B085D82B15C9A8AA6EC87BD5B8549C4C4531AA88C5F23286CD465949F2D19690302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD27000000000489ABB2031383637363339313037000000000000000000000000000000000000000000000000
66|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000004200000000000000C80101B2283FF4FD37F064DCD34B574D533C0B4EA40CFD725215E185BD705F15CFAD7057448628A06E8883468B38B8AE3BEA64087C46296B3AD545385C50C40722E65F0302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD2700000000029A5414031383637373830323134000000000000000000000000000000000000000000000000
67|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000004300000000000000C80101703553A5C1CD179AE5EF88ED45CAAFBA1E808C4562C1765C120F74293421E2A725991B3973665537FF8AF59592275CC478C4CDD570AF497BC4EFF4D12C95A0450302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD2700000000006DA41D831383637363439323630000000000000000000000000000000000000000000000000
68|00000000010400E3D94A92B80D0AABE8EF1B50DE84449CD61AD637000000000000004400000000000000C80101E6120652498648D261A9149DAA02C18C0BE0875E8CF6566258B64667EF4438550A2A19265A85062AAD1018C6FF8D3BAD3AD4E1D60D8A3EFD7E4542688BF27CBE0302000000000005163A8D00CE43E5718D9D47B784B1866A5B9D3CD27000000000005520F831383637373831303230000000000000000000000000000000000000000000000000

Moreover, the Foundation miner has all the same transactions. I'll bet that other nodes have them too.

I manually re-pushed these transactions to all the Foundation follower's neighbors. Let's see if this makes a difference.

jcnelson commented 3 years ago

Looks like it worked:

$ curl 'http://seed-0.mainnet.stacks.co:20443/v2/accounts/SP3HXJJMJQ06GNAZ8XWDN1QM48JEDC6PP6W3YZPZJ?proof=0'
{"balance":"0x0000000000000000000004e6c987c799","locked":"0x00000000000000000000000000000000","unlock_height":0,"nonce":69}
psq commented 3 years ago

SP3HXJJMJQ06GNAZ8XWDN1QM48JEDC6PP6W3YZPZJ is an OKCoin address, and earlier today, they had 2 transactions missing with nonce 47 and 48 (from memory, could be off by 1 or 2), with other transactions with higher nonces to be stalled (withdrawals, it appears from one discussion on discord)

wileyj commented 3 years ago

We're seeing a similar issue with the following:

txid : da6dc3287416ba14830cc3950b2a04db805aefb2db0c636d7e4304a3c3b894ad
hex : 00000000010400e026d205e7196475e1bff2eb87ae38fa61a54bfb000000000000000100000000000007d000001f666a7d9c08beb61cf1775d9b0ea524dce98ea152f7a534aafdcc3c3c63291d019420282401742ac2828f08ad2ba101fea699d713fe07bebeb68cdb95a2164c0302000000000005164fc1fe50e817aa2d093ad269bf916b816790ab0500000000000186a030000000000000000000000000000000000000000000000000000000000000000000

i've verified what @jcnelson mentioned in his reply for this tx, but i'm not sure how to decode or what else is involved to rebroadcast

jcnelson commented 3 years ago

We solved several issues related to this (e.g. mempool walking bugs), so I'm going to go ahead and close this. Let's reopen when we have more specific problems.