nanocurrency / nano-node

Nano is digital currency. Its ticker is: XNO and its currency symbol is: Ӿ
https://nano.org
BSD 3-Clause "New" or "Revised" License
3.49k stars 787 forks source link

Delays before pending blocks show up #3382

Closed ApolloNano closed 3 years ago

ApolloNano commented 3 years ago

Summary

On my node I sometimes get pending blocks very delayed. I couldn't pinpoint what exactly causes it.

Sometimes it takes 8-10 hours before a confirmed pending block shows up for the accounts I hold for users. On another site pickupnano.com using the same node I can see a delay of 5-10 minutes quite often.

Sync status says 100%. 291 Peers are connected. https://apollonano.com/nodemonitor/ The current block compared to the height of the checked blocks on nanocrawler is about a hundred behind. Sometimes I saw it around ~6-8k behind.

I moved everything to the internal hard drive instead of an attached volume, turned off voting and restarted the node but neither appear to have had an impact on the delays.

Sending transactions from the node appear without any issues instantly.

I also noticed newer pending blocks would show up before older ones that wouldn't show up until 8 hours later. That was after restarting the node though, not sure the same would have happened if I didn't restart the node.

Node version

V22.1

Build details

Using the docker image. RocksDB is enabled.

Config changes made but probably unrelated to the problem. work_threads = 0 work_peers = ["xx:4545"]

OS and version

Ubuntu 20.04.2 LTS

Steps to reproduce the behavior

  1. Send a transaction to an account
  2. Check with RPC "pending"
  3. ...

Expected behavior

Someone sends NANO to an account and the pending block show up right away on the node.

Actual behavior

Pending blocks don't show up.

All I get is an empty result for the account: { "blocks": "" }

Possible solution

Problem may not show up if I were using LMDB instead of RocksDB. Not sure about that though. I have not switched back yet to test it.

Supporting files

Log file

zhyatt commented 3 years ago

@ApolloNano Are there specific accounts that you see repeated issues with? Or is it more widespread across a bunch of accounts?

ApolloNano commented 3 years ago

The receiving accounts can be new or older ones without any transactions yet. I also tried different sending accounts. A delay appears in each case.

zhyatt commented 3 years ago

Have you tried running the node with a different ledger? That was unclear from your description.

qwahzi commented 3 years ago

Not sure if my own issue is related, but I figured I'd post it just in case (RocksDB, non-PR, experimental pruning, CentOS 8, docker node). @ApolloNano - Do you happen to have experimental pruning enabled too?

NanoTicker median conf block count: ~123,839,620 My block count: "123,839,608":

# curl -d '{"action": "block_count"}' 127.0.0.1:7076
{
    "count": "123839608",
    "unchecked": "8693",
    "cemented": "123839563",
    "full": "25794554",
    "pruned": "98045054"
}

My recent test transactions (sending to the faucet) seem to be working ok: https://mynano.ninja/block/0502F7E822E80F874653126BA528F7BA79F32D01B4F78F8D5D31E23FD2FE4CE0 https://mynano.ninja/block/B48B735455124867556E76E443BFC91C49BA33636E5CB34B697BBB49B40EF9D5

But here is my node's pending transaction list (even though none of the block explorers show my faucet address as having any pending transactions, and block_info shows these blocks as confirmed or pruned):

{
    "blocks":
        [
        "720CA9D5C47B6D9987B72051A658F1057997F2E3B7AA9C56FADEE6D61EFFF16C",
        "C11065472E8ABE1DE2172B90FBC29CBA33EF8719FA692E12C5F7B6DBD9455C76",
        "D200B0074D8B6F6925CD641E4CB05D1DF087EEA87775703659FB5D5EF3C62094"
        ]
}

My first two test transactions from this evening were automatically received without ever showing up in pending (or were so quick I missed them), but the above list of pending blocks just sits there until I manually receive them? Block_info shows two of them as already confirmed, and one as missing (due to pruning?). Maybe some kind of race condition between the pruning, pending, and auto-receiving logic?

Block_info for the 3 pending blocks:

curl -d '{"action": "block_info", "json_block":"true", "hash": "720CA9D5C47B6D9987B72051A658F1057997F2E3B7AA9C56FADEE6D61EFFF16C"}' 127.0.0.1:7076
{
    "block_account": "nano_1xtkneddhn179breektrk54tb19m6nxdk6m96hru5nqyyk4dz7e8qwy8mk5r",
    "balance": "47999989990000000000000000",
    "height": "6",
    "local_timestamp": "1624978255",
    "confirmed": "true",
    "contents": {
        "type": "state",
        "account": "nano_1xtkneddhn179breektrk54tb19m6nxdk6m96hru5nqyyk4dz7e8qwy8mk5r",
        "previous": "C11065472E8ABE1DE2172B90FBC29CBA33EF8719FA692E12C5F7B6DBD9455C76",
        "representative": "nano_1stofnrxuz3cai7ze75o174bpm7scwj9jn3nxsn8ntzg784jf1gzn1jjdkou",
        "balance": "47999989990000000000000000",
        "link": "CB96B762A73BF920DB1794A053C8B4EE0A0F74FCD07CCB89018C258A15C2CC47",
        "link_as_account": "nano_3kwppxjcggzs65fjh771ch6dbuic3xthsn5wsg6i5537jacw7m493ra8574x",
        "signature": "503234C43EA98A221293DEDBDF3B80F2186041D67D76DC4169B74824CBC3141103E8AFBAD646EA212486EB10A01122F6883EFA5C3141E72AB0928D56BC15D309",
        "work": "18dc4bd74d4bd346"
    },
    "subtype": "send"
}
curl -d '{"action": "block_info", "json_block":"true", "hash": "C11065472E8ABE1DE2172B90FBC29CBA33EF8719FA692E12C5F7B6DBD9455C76"}' 127.0.0.1:7076
{
    "error": "Block not found"
}
curl -d '{"action": "block_info", "json_block":"true", "hash": "D200B0074D8B6F6925CD641E4CB05D1DF087EEA87775703659FB5D5EF3C62094"}' 127.0.0.1:7076
{
    "block_account": "nano_3pcutj718gk4q1htjrg6oqemyewcjnzs1padt1muj4eadfihimibne5pumti",
    "balance": "0",
    "height": "60",
    "local_timestamp": "1623837169",
    "confirmed": "true",
    "contents": {
        "type": "state",
        "account": "nano_3pcutj718gk4q1htjrg6oqemyewcjnzs1padt1muj4eadfihimibne5pumti",
        "previous": "C1F427E89762419995B9DA161AAC5BFAB5A2138C3455C7DC536B52E288351AF3",
        "representative": "nano_3power3gwb43rs7u9ky3rsjp6fojftejceexfkf845sfczyue4q3r1hfpr3o",
        "balance": "0",
        "link": "CB96B762A73BF920DB1794A053C8B4EE0A0F74FCD07CCB89018C258A15C2CC47",
        "link_as_account": "nano_3kwppxjcggzs65fjh771ch6dbuic3xthsn5wsg6i5537jacw7m493ra8574x",
        "signature": "E82DFC7EFD654B4EC135438885CA49AD66874CB2F7B223B6F5CA90FA22C173527822FF514B5117E4BDABFEC9D7C784FBB1A1CCC356C5D31BBA4D7E2E3924C80E",
        "work": "d839cc39abfeb517"
    },
    "subtype": "send"
}

EDIT:

Did a 3rd transaction, which still didn't show up in my pending list (or it was so brief I missed it), but it did show up in MyNanoNinja's pending list for the faucet account, and local block_info showed it as unconfirmed until it was auto-received a few seconds later:

# curl -d '{"action": "block_info", "hash": "45F5947C3FD3C16766E541C1442963D7113E9CC04295BE0169D81D326434825C"}' 127.0.0.1:7076
{
    "block_account": "nano_1f64xpnc3y3c43f61aoi439o7uceugmgm9kgdukxzmffy6niz6katyp4bky9",
    "amount": "10000000000000000000000000000",
    "balance": "100199467334588940789972980306980",
    "height": "641",
    "local_timestamp": "1626921097",
    "confirmed": "false",
    "contents": "{\n    \"type\": \"state\",\n    \"account\": \"nano_1f64xpnc3y3c43f61aoi439o7uceugmgm9kgdukxzmffy6niz6katyp4bky9\",\n    \"previous\": \"B48B735455124867556E76E443BFC91C49BA33636E5CB34B697BBB49B40EF9D5\",\n    \"representative\": \"nano_3ekb6tp8ixtkibimyygepgkwckzhds9basxd5zfue4efjnxaan77gsnanick\",\n    \"balance\": \"100199467334588940789972980306980\",\n    \"link\": \"CB96B762A73BF920DB1794A053C8B4EE0A0F74FCD07CCB89018C258A15C2CC47\",\n    \"link_as_account\": \"nano_3kwppxjcggzs65fjh771ch6dbuic3xthsn5wsg6i5537jacw7m493ra8574x\",\n    \"signature\": \"2BA69DA8BDBEA72969F1CA92CE8DCCE5530C8278A6E2779139A2F6967D3C61E8AE33ED96F6ED3956CDB69F708A02A39BA66573BEC6A589D27CE0ED8E4DA5A00A\",\n    \"work\": \"f042ea60504cddd1\"\n}\n",
    "subtype": "send"
}
# curl -d '{ "action": "pending", "account": "nano_3kwppxjcggzs65fjh771ch6dbuic3xthsn5wsg6i5537jacw7m493ra8574x" }' 127.0.0.1:11338

{
    "blocks":
    [
    "720CA9D5C47B6D9987B72051A658F1057997F2E3B7AA9C56FADEE6D61EFFF16C",
    "C11065472E8ABE1DE2172B90FBC29CBA33EF8719FA692E12C5F7B6DBD9455C76",
    "D200B0074D8B6F6925CD641E4CB05D1DF087EEA87775703659FB5D5EF3C62094"
    ]
}
# curl -d '{ "action": "block_info", "hash": "45F5947C3FD3C16766E541C1442963D7113E9CC04295BE0169D81D326434825C"}' 127.0.0.1:7076
{
    "block_account": "nano_1f64xpnc3y3c43f61aoi439o7uceugmgm9kgdukxzmffy6niz6katyp4bky9",
    "amount": "10000000000000000000000000000",
    "balance": "100199467334588940789972980306980",
    "height": "641",
    "local_timestamp": "1626921097",
    "confirmed": "true",
    "contents": "{\n    \"type\": \"state\",\n    \"account\": \"nano_1f64xpnc3y3c43f61aoi439o7uceugmgm9kgdukxzmffy6niz6katyp4bky9\",\n    \"previous\": \"B48B735455124867556E76E443BFC91C49BA33636E5CB34B697BBB49B40EF9D5\",\n    \"representative\": \"nano_3ekb6tp8ixtkibimyygepgkwckzhds9basxd5zfue4efjnxaan77gsnanick\",\n    \"balance\": \"100199467334588940789972980306980\",\n    \"link\": \"CB96B762A73BF920DB1794A053C8B4EE0A0F74FCD07CCB89018C258A15C2CC47\",\n    \"link_as_account\": \"nano_3kwppxjcggzs65fjh771ch6dbuic3xthsn5wsg6i5537jacw7m493ra8574x\",\n    \"signature\": \"2BA69DA8BDBEA72969F1CA92CE8DCCE5530C8278A6E2779139A2F6967D3C61E8AE33ED96F6ED3956CDB69F708A02A39BA66573BEC6A589D27CE0ED8E4DA5A00A\",\n    \"work\": \"f042ea60504cddd1\"\n}\n",
    "subtype": "send"
}
# curl -d '{ "action": "pending", "account": "nano_3kwppxjcggzs65fjh771ch6dbuic3xthsn5wsg6i5537jacw7m493ra8574x" }' 127.0.0.1:11338
{"blocks":
    [
    "720CA9D5C47B6D9987B72051A658F1057997F2E3B7AA9C56FADEE6D61EFFF16C",
    "C11065472E8ABE1DE2172B90FBC29CBA33EF8719FA692E12C5F7B6DBD9455C76",
    "D200B0074D8B6F6925CD641E4CB05D1DF087EEA87775703659FB5D5EF3C62094"
    ]
}
ApolloNano commented 3 years ago

@zhyatt Thanks. I'll give a different ledger a try. @qwahzi pruning is not enabled on the node

qwahzi commented 3 years ago

Also just saw this PR from Colin. Maybe related?

https://github.com/nanocurrency/nano-node/pull/3387

clemahieu commented 3 years ago

My change is only related to empty results and how they're represented in json.

zhyatt commented 3 years ago

@ApolloNano Any luck with further testing on this?

ApolloNano commented 3 years ago

I have downloaded a new ledger snapshot from mynanoninja and kept it as LMDB.

The ledger got synced up in about an hour and got pretty much to the last block count of nanolooker but the result appear to be the same.

Turned off every service that would have put any extra load on the node and just had the node run on it's own.

The node is still staying about 1k blocks behind what the latest block is on nanolooker. Not sure if that's the problem.

Edit: I did a couple of transactions from about 6 different accounts at the same time and I'm getting mixed results. Some were instant, some took about a minute others would not show up until maybe 30 minutes. Used 0.001 for the test. Some of the the instant ones arrived with others still going through a delay.

zhyatt commented 3 years ago

Another similar issue was reported in Discord here. The case is: they send a block from a wallet (not their node) which is immediately seen on other nodes on the network, but it takes 3 minutes for their node to see the send block in account_history, then it takes another 2-3 minutes before the send block is confirmed on their node (at which point it will show in the pending RPC with include_only_confirmed option). This doesn't happen 100% of the time but apparently a majority of the time.

qwahzi commented 3 years ago

Good find! I'm guessing the new issue you just opened is related to this? Or something separate?

https://github.com/nanocurrency/nano-node/issues/3419

zhyatt commented 3 years ago

Good find! I'm guessing the new issue you just opened is related to this? Or something separate?

3419

We believe there is a good chance that is related, although some additional testing and validation must still be done to be sure.

playnano commented 3 years ago

In the meantime, is there any workaround one can do to get PRs or other nodes to broadcast those missing transactions?

I've known about this issue for a while (mostly from user reports, using PlayNANO roulette, complaining that their deposits took too long to get accepted) but only now had the time to dig further into it. Made a few tests and pretty much the same is happening to me. Just had a transaction sent from Natrium take about 9 minutes to get discovered on my node.

Maybe my workaround/solution is to actually be a PR? x)

zhyatt commented 3 years ago

In the meantime, is there any workaround one can do to get PRs or other nodes to broadcast those missing transactions?

I've known about this issue for a while (mostly from user reports, using PlayNANO roulette, complaining that their deposits took too long to get accepted) but only now had the time to dig further into it. Made a few tests and pretty much the same is happening to me. Just had a transaction sent from Natrium take about 9 minutes to get discovered on my node.

Maybe my workaround/solution is to actually be a PR? x)

We have begun temporarily republishing blocks in an attempt to improve the situation until a fix can be pushed out. Are things looking better on your end?

playnano commented 3 years ago

Hi Zach, that's a great idea, thank you for helping :)

However, it didn't fix the issue. I tried sending one transaction from Natrium and it was pretty much instant, then the next two transactions took both 2+ minutes, and both were detected pretty much at the same time by my node (although one was sent one minute after the other). I tried one last transaction from a different address and it also took 2+ minutes to get delivered.

How often are you republishing those blocks?

What if I use the bootstrap_any RPC command every few seconds? Would that do anything or not really since my node is not behind?

zhyatt commented 3 years ago

@playnano Thanks for reporting. So it sounds like you are still experiencing delays with block propagation as well as confirmations. The blocks are republished pretty much immediately. The bootstrap_any isn't recommended, it likely wouldn't make a difference here. We are continuing to investigate this.

playnano commented 3 years ago

So it sounds like you are still experiencing delays with block propagation as well as confirmations.

I don't think the issue is with confirmations. The last time I checked (before making my 1st comment here, 14d ago) my node was confirming blocks as soon as they were discovered. The problem was that it was taking too long to discover the blocks. I didn't check this yesterday, but I'm assuming it is still the same problem, block propagation/discovery, not confirmations.

zhyatt commented 3 years ago

3507 has resolved block propagation issues previously reported. It is available with the V23.0DB3 build on beta and will be included in the V23.0 release.