input-output-hk / jormungandr

privacy voting blockchain node
https://input-output-hk.github.io/jormungandr/
Apache License 2.0

A successfully produced block did not show up in the Shelley Testnet explorer, nor was it credited at the end of the epoch #1608

Open stakehodlers opened 4 years ago

stakehodlers commented 4 years ago

Describe the bug

(Originally posted as a question to IOHK Support via their ticket system; they redirected me to post on GitHub.)

Not sure if this is expected behavior or a defect. I am a stakepool operator (ticker STHDL) running jormungandr v0.8.6 on Ubuntu 18.04 LTS on a Raspberry Pi 4 Model B, with 4GB RAM and a 1Gb/s pipe out to the Internet. The STHDL pool was scheduled to produce block 36.7056, which it did. Here are the leader logs shortly after minting the block that I was responsible for:


When I look at the Shelley Testnet Explorer, I see the following (note the absence of 36.7056, between 36.7107 and 36.7041):

36 7114 18:10:45, January 18, 2020 c87a...dd2 1 0.500000 12408.078664
36 7107 18:10:31, January 18, 2020 5189...2e6 0 0.000000 0.000000
36 7041 18:08:19, January 18, 2020 79ed...f9a 0 0.000000 0.000000
36 7029 18:07:55, January 18, 2020 fdbe...fb4 0 0.000000 0.000000
36 7021 18:07:39, January 18, 2020 26c1...0f7 2 1.600000 17437.154365
36 7014 18:07:25, January 18, 2020 607b...604 0 0.000000 0.000000
36 7001 18:06:59, January 18, 2020 5a1b...644 0 0.000000 0.000000
36 6978 18:06:13, January 18, 2020 967a...350 0 0.000000 0.000000

With 2-second slots, 36.7056 falls exactly 30 seconds after 36.7041 (15 slots × 2 s). 36.7041 was produced at 18:08:19-05:00 (= 23:08:19 UTC), and 36.7056 was produced at 23:08:49.001531413 UTC, within 2 ms of exactly 30 seconds after 36.7041. Likewise, the math for the time between 36.7056 and 36.7107 (51 slots, 102 seconds) checks out to within some small tolerance. Also, at the end of the epoch, no rewards were credited to the pool or to its stakers.
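The slot arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation in the report: the helper name `slot_time` is made up for illustration, the 2-second slot duration and the timestamps are the ones quoted in this issue, and the explorer's 18:08:19 is taken to be 23:08:19 UTC as stated above.

```python
from datetime import datetime, timedelta, timezone

SLOT_DURATION = timedelta(seconds=2)  # 2-second slots, as stated in the report


def slot_time(ref_slot, ref_time, slot):
    """Scheduled start of `slot`, given the time of a reference slot in the same epoch."""
    return ref_time + (slot - ref_slot) * SLOT_DURATION


# Block 36.7041 as listed in the explorer, expressed in UTC.
t_7041 = datetime(2020, 1, 18, 23, 8, 19, tzinfo=timezone.utc)

t_7056 = slot_time(7041, t_7041, 7056)  # the slot assigned to STHDL
t_7107 = slot_time(7041, t_7041, 7107)  # the next block shown in the explorer

# Production time reported in the leader logs, truncated to microseconds.
produced = datetime(2020, 1, 18, 23, 8, 49, 1531, tzinfo=timezone.utc)
offset_ms = (produced - t_7056) / timedelta(milliseconds=1)

print(t_7056.time(), t_7107.time())  # 23:08:49 23:10:31
print(f"block produced {offset_ms:.3f} ms after its slot started")
```

Both computed times agree with the explorer listing (23:10:31 UTC is 18:10:31 local), so the block was produced on schedule as far as the timestamps can tell.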

I am running chrony to synchronize time on the node, and when I observe several hours of node stats and immediately compare them against the latest block height advertised on pooltool.io, the STHDL node is either in lock-step with the pooltool.io advertisement, or ahead of it.

Can you help me understand why this produced block is not reflected in the explorer and was not credited with rewards? The pool, after all, successfully produced the block it was assigned. Is there any additional documentation that you can point me (and others) to, so that I can better understand the details of block production, which blocks get rewarded, and so on? The STHDL pool also produced several other blocks in earlier epochs that, for some reason, the pool never got credit for. I originally chalked this up to the earlier testnet instability, but the stability has been extremely good since the release of v0.8.6, so I don't believe that the network is/was a factor with the block at 36.7056.

If there is any additional technical detail I can attempt to supply, please let me know.

Mandatory Information

  1. jcli --full-version output; jcli Version jcli 0.8.6 (heads/v0.8.6-76863977, release, linux [aarch64]) - [rustc 1.40.0 (73528e339 2019-12-16)]
  2. jormungandr --full-version output; Jormungandr Version jormungandr 0.8.6 (heads/v0.8.6-76863977, release, linux [aarch64]) - [rustc 1.40.0 (73528e339 2019-12-16)]

To Reproduce

Steps to reproduce the behavior:

  1. Run a jormungandr leader node
  2. Successfully produce a block
  3. See no evidence of block in Shelley testnet explorer
  4. See no rewards credited for a successfully produced block

Expected behavior

  1. Successfully produced blocks are reflected in Shelley testnet explorer
  2. Rewards are given for blocks produced.


stakehodlers commented 4 years ago

Q: If a block is scheduled to be created, is it automatically in the 10% of slots for the epoch that will become blocks? Assuming that the active slots coefficient is 0.1, another way of asking this question is: will every slot scheduled to have a leader be considered an active slot? Or will some of the leader slots for a given epoch go unrewarded after successful block production because they are not part of the collection of active slots for that epoch?
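One way to reason about this question is the leader-election rule published for Ouroboros Praos, where a pool with relative stake σ wins each slot independently with probability 1 − (1 − f)^σ and f is the active slots coefficient. The sketch below assumes jormungandr's Genesis Praos mode follows that rule; the stake distribution and the 43,200 slots per epoch are hypothetical values chosen for illustration, not figures from this thread.

```python
def phi(sigma: float, f: float = 0.1) -> float:
    """Praos-style chance that a pool holding relative stake `sigma` leads one slot."""
    return 1.0 - (1.0 - f) ** sigma


# Hypothetical inputs, not taken from the testnet.
ACTIVE_SLOTS_COEFF = 0.1          # the 0.1 assumed in the question above
SLOTS_PER_EPOCH = 43_200          # assumption: one-day epochs of 2-second slots
stakes = [0.5, 0.3, 0.15, 0.05]   # hypothetical relative stakes, summing to 1

# A slot stays empty only if every pool loses its own independent draw.
p_empty = 1.0
for sigma in stakes:
    p_empty *= 1.0 - phi(sigma, ACTIVE_SLOTS_COEFF)
p_active = 1.0 - p_empty

expected_active_slots = p_active * SLOTS_PER_EPOCH
expected_leader_slots = [phi(s, ACTIVE_SLOTS_COEFF) * SLOTS_PER_EPOCH for s in stakes]

print(f"P(slot has at least one leader) = {p_active:.4f}")
print(f"expected active slots per epoch = {expected_active_slots:.0f}")
print(f"sum of per-pool leader slots    = {sum(expected_leader_slots):.0f}")
```

Under this model the coefficient is simply the expected fraction of slots that get at least one leader, so a slot that appears in a pool's leader schedule is by definition one of the active slots; there is no second 10% filter. The per-pool leader slots add up to slightly more than the active slots because some slots are won by more than one pool.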

Straightpool commented 4 years ago

I had the exact same issue today with the 2nd block I ever successfully produced. From my understanding this is called a vaporized block: you had a multi-leader slot, meaning more than one leader was assigned to it, and the other leader won the race. These types of blocks are rare.

It is my understanding this is just bad luck and by design. Keep trying is all we can do. Better error feedback would be nice no doubt.
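To get a feel for how rare multi-leader slots would be, the probability can be computed directly under the published Praos election rule (each pool wins a slot independently with probability 1 − (1 − f)^σ). This is a rough sketch, not a description of jormungandr internals: the function name `slot_leader_distribution` and the 100 equally sized pools are hypothetical, and only the coefficient 0.1 comes from the discussion above.

```python
def phi(sigma, f=0.1):
    """Praos-style chance that a pool holding relative stake `sigma` leads one slot."""
    return 1.0 - (1.0 - f) ** sigma


def slot_leader_distribution(stakes, f=0.1):
    """Return (P(no leader), P(exactly one leader), P(two or more leaders)) for one slot."""
    probs = [phi(s, f) for s in stakes]
    p_none = 1.0
    for p in probs:
        p_none *= 1.0 - p
    # Exactly one winner: pool i wins while every other pool loses, summed over i.
    p_one = p_none * sum(p / (1.0 - p) for p in probs)
    return p_none, p_one, 1.0 - p_none - p_one


# Hypothetical network: 100 pools with equal stake.
stakes = [1.0 / 100] * 100
p_none, p_one, p_multi = slot_leader_distribution(stakes)

# Share of active slots (at least one leader) in which leaders compete.
contested_share = p_multi / (1.0 - p_none)
print(f"P(two or more leaders in a slot) = {p_multi:.4f}")
print(f"share of active slots contested   = {contested_share:.1%}")
```

With these made-up inputs roughly one active slot in twenty is contested. The real figure depends on the actual stake distribution, so it should be treated as an order-of-magnitude estimate only.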

stakehodlers commented 4 years ago

Another to add to the list... https://github.com/input-output-hk/jormungandr/issues/1596

MarcelKlammer commented 4 years ago

> I had the exact same issue today with the 2nd block I ever successfully produced. From my understanding this is called a vaporized block: you had a multi-leader slot, meaning more than one leader was assigned to it, and the other leader won the race. These types of blocks are rare.
>
> It is my understanding this is just bad luck and by design. Keep trying is all we can do. Better error feedback would be nice no doubt.

Actually, that's not the case with most "totally in sync, block produced within 0.003 seconds of slot, cannot process produced block" blocks, because the next block that gets accepted by the network is most often in a later slot.

Current log observations:

Both bugs are annoying for smaller pools, and my guess is that once those two are fixed, we will see produced blocks per epoch go way up from 2800 to 3800 or even 4000.

Straightpool commented 4 years ago

Yes, I have since been able to rule out that I lost my 2nd block to a competitive fork, see https://github.com/input-output-hk/jormungandr/issues/1651