JayDDee / cpuminer-opt

Optimized multi algo CPU miner

Stale shares & Resolved blocks not confirmed by pool #347

Closed bonifacio123 closed 2 years ago

bonifacio123 commented 2 years ago

Hello, I'm not sure if this is a pool issue or something else. Running on an Intel i7-11700K. Seeing lots of stale shares and also seeing blocks solved that the pool isn't showing as solved.

Startup parameters

cpuminer-avx512-sha-vaes.exe -t 14 -a scryptn2

[2021-11-14 08:48:41] Scrypt paramaters: N= 1048576, R= 1
[2021-11-14 08:48:41] Throughput 16/thr, Buffer 384 MiB/thr, Total 5376 MiB

CPU: 11th Gen Intel(R) Core(TM) i7-11700K @ 3.60GHz
SW built on Nov 10 2021 with GCC 9.3.0
CPU features:  AVX512 VAES SHA
SW features:   AVX512 VAES SHA
Algo features: AVX512

Starting miner with AVX512...

Here's an example of a solved block that the pool didn't show as solved; see the 6427 Submitted log entry:

[2021-11-14 08:38:13] 6427 Submitted Diff 0.00045304, Block 1262909, Job 2a5
[2021-11-14 08:38:13] 6426 Accepted 6367 S59 R0 B6, 15.080 sec (1241ms)
[2021-11-14 08:38:13] scrypt: scryptn2.na.mine.zergpool.com:3435
                      Periodic Report     5m05s        23h04m
                      Share rate        4.32/min     4.64/min
                      Hash rate         87.08h/s     87.21h/s   (83.55h/s)
                      Lost hash rate     8.71h/s       0.96h/s
                      Submitted            22         6427
                      Accepted             20         6367       99.1%
                      Stale                 2           59        0.9%
                      Blocks Solved         0            6
                      Stratum errors                     1
                      Hi/Lo Share Diff  0.00081001 /  3.1233e-007
                      Count mismatch: 1, stats may be inaccurate
[2021-11-14 08:38:15] 6427 A6368 S59 R0 BLOCK SOLVED 7, 0.371 sec (2189ms)
                      Diff 0.00045304, Block 1262909, Job 2a5
[2021-11-14 08:38:23] New Block 1262910, Net diff 0.00040271, Job 2a6
                      Diff: Net 0.00040271, Stratum 0.020316, Target 3.1e-007
                      TTF @ 91.66 h/s: Block 5h14m, Share 0m14s
                      Net hash rate (est) 28.83 kh/s
[2021-11-14 08:38:31] 6428 Submitted Diff 9.4162e-007, Block 1262910, Job 2a6
[2021-11-14 08:38:32] 6428 Accepted 6369 S59 R0 B7, 18.033 sec (1546ms)

An example of stale shares - is this a share from a previous block?

[2021-11-14 00:24:52] 3711 Accepted 3677 S34 R0 B3, 5.020 sec (1211ms)
[2021-11-14 00:25:09] 3712 Submitted Diff 2.669e-007, Block 1262418, Job ec0
[2021-11-14 00:25:09] New Work: Block 1262418, Net diff 0.0003921, Job ec1
[2021-11-14 00:25:09] 3712 Submitted share pending, maybe stale
[2021-11-14 00:25:09] 3712 A3677 Stale 35 R0 B3, 17.321 sec (394ms)
                      Diff 2.669e-007, Block 1262418, Job ec0

What might cause a loss of hash rate?

 Lost hash rate     8.71h/s       0.96h/s

Thank you

JayDDee commented 2 years ago

The stale shares are real. You can tell from the sequence of log entries: new work was received after the share was submitted but before the pool replied. That's just bad luck, although high latency can increase stales. The share was stale because it came from an old job; note the job ID.
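As a rough illustration of that log sequence (not part of cpuminer-opt), this Python sketch scans miner output for shares whose job was replaced before the pool replied. The regular expressions are assumptions based only on the log lines quoted in this issue and may not cover every message the miner prints.

```python
import re

# Patterns inferred from the log excerpts in this issue (assumed, not exhaustive).
NEW_JOB = re.compile(r"\] New (?:Work|Block).*Job (\w+)")
SUBMIT = re.compile(r"\] (\d+) Submitted Diff \S+ Block \d+, Job (\w+)")
REPLY = re.compile(r"\] (\d+) (?:Accepted|A\d+) ")


def job_changed_before_reply(lines):
    """Return (share, share_job, newer_job) for shares overtaken by new work."""
    in_flight = {}      # share number -> job id the share was mined on
    current_job = None  # most recent job id announced by the pool
    overtaken = []
    for line in lines:
        m = NEW_JOB.search(line)
        if m:
            current_job = m.group(1)
            continue
        m = SUBMIT.search(line)
        if m:
            in_flight[int(m.group(1))] = m.group(2)
            continue
        m = REPLY.search(line)
        if m:
            share = int(m.group(1))
            job = in_flight.pop(share, None)
            if job is not None and current_job is not None and job != current_job:
                overtaken.append((share, job, current_job))
    return overtaken


log = [
    "[2021-11-14 00:25:09] 3712 Submitted Diff 2.669e-007, Block 1262418, Job ec0",
    "[2021-11-14 00:25:09] New Work: Block 1262418, Net diff 0.0003921, Job ec1",
    "[2021-11-14 00:25:09] 3712 A3677 Stale 35 R0 B3, 17.321 sec (394ms)",
]
print(job_changed_before_reply(log))  # [(3712, 'ec0', 'ec1')]
```

A share flagged this way was overtaken by a newer job; whether the pool actually counts it as stale is still the pool's decision.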

The block solved issue is more complicated. The share diff is shown to be higher than the net diff, but not by much. It could be a math precision issue. In stratum the 256-bit hash target is calculated from an encoded 32-bit value called nbits. This results in imprecise hash targets.

There's no way to verify this without knowing the actual hash target the pool used to calculate nbits for stratum, which the miner then converts back to a hash target.
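To make the precision loss concrete, here is a minimal Python sketch of the compact encoding, assuming the Bitcoin-style nbits layout (one exponent byte, three mantissa bytes) and the Bitcoin difficulty-1 target. Whether XBTX or the pool applies an extra scaling factor for scrypt is not stated in this thread, so difficulty values from this sketch should not be compared directly with the logs above. The function names are illustrative, not cpuminer-opt's.

```python
# Assumed convention: Bitcoin difficulty-1 target (nbits 0x1d00ffff).
DIFF1_TARGET = 0xFFFF * 2 ** 208


def nbits_to_target(nbits: int) -> int:
    """Expand a 32-bit compact value into a 256-bit integer target."""
    exponent = nbits >> 24
    mantissa = nbits & 0x007FFFFF  # bit 23 is a sign bit in this format
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def target_to_nbits(target: int) -> int:
    """Compress a target to compact form; low-order bits are truncated."""
    size = (target.bit_length() + 7) // 8
    if size <= 3:
        mantissa = target << (8 * (3 - size))
    else:
        mantissa = target >> (8 * (size - 3))
    if mantissa & 0x00800000:  # keep the sign bit clear
        mantissa >>= 8
        size += 1
    return (size << 24) | mantissa


def target_to_diff(target: int) -> float:
    """Difficulty relative to the assumed difficulty-1 target."""
    return DIFF1_TARGET / target


precise = (1 << 240) + 12345   # hypothetical pool-side target
decoded = nbits_to_target(target_to_nbits(precise))
share_hash = (1 << 240) + 1    # hypothetical hash lying between the two
print(share_hash <= precise, share_hash <= decoded)  # True False
```

The round trip can only lower the target, so a hash that satisfies the precise target may fail the decoded one. That is one way a miner and a pool could disagree on a borderline share.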

You can add -D to get more data, including the actual share hash and the target used to verify it, but I don't see a lot of value since we already know the target is imprecise.

With enough samples of blocks solved you can quantify the error.

The same issue can occur with simple shares. Imprecise targeting could result in false positives (low diff reject) or false negatives (valid share silently discarded). These can be audited by checking for low diff rejects and the lowest accepted share diff, which should converge to the target diff over time.
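One way to run that audit from a saved log is sketched below in Python. It pairs each "Submitted" line with its later accept reply and tracks the lowest accepted share diff, which can then be compared with the "Target" value the miner prints. The patterns are assumptions taken from the log lines quoted in this issue; the periodic report's "Hi/Lo Share Diff" line already gives a similar figure.

```python
import re

# Patterns inferred from the log excerpts in this issue (assumed, not exhaustive).
SUBMITTED = re.compile(r"\] (\d+) Submitted Diff ([0-9.eE+-]+), Block (\d+), Job (\w+)")
ACCEPTED = re.compile(r"\] (\d+) (?:Accepted|A\d+ .*BLOCK SOLVED)")


def lowest_accepted_diff(lines):
    """Return the smallest share diff that the pool accepted, or None."""
    submitted = {}  # share number -> share diff
    lowest = None
    for line in lines:
        m = SUBMITTED.search(line)
        if m:
            submitted[int(m.group(1))] = float(m.group(2))
            continue
        m = ACCEPTED.search(line)
        if m and int(m.group(1)) in submitted:
            diff = submitted[int(m.group(1))]
            lowest = diff if lowest is None else min(lowest, diff)
    return lowest


log = [
    "[2021-11-14 08:38:13] 6427 Submitted Diff 0.00045304, Block 1262909, Job 2a5",
    "[2021-11-14 08:38:15] 6427 A6368 S59 R0 BLOCK SOLVED 7, 0.371 sec (2189ms)",
    "[2021-11-14 08:38:31] 6428 Submitted Diff 9.4162e-007, Block 1262910, Job 2a6",
    "[2021-11-14 08:38:32] 6428 Accepted 6369 S59 R0 B7, 18.033 sec (1546ms)",
]
target_diff = 3.1e-7  # "Target" value from the log above
print(lowest_accepted_diff(log) / target_diff)
```

Over a long log the ratio should approach 1 from above if targets are handled consistently. A ratio that stays well above 1 would hint at valid shares being discarded.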

Edit: The lost hash rate is because more shares were submitted than accepted during the sample period.
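The 5-minute column of the report above is consistent with a simple proportional split of the total hash rate between accepted and non-accepted shares. The Python sketch below checks that arithmetic; the proportional model is an assumption inferred from the numbers, not a statement of how cpuminer-opt computes the figure.

```python
def split_hash_rate(total_hs, submitted, accepted):
    """Split a total rate in proportion to accepted vs. non-accepted shares (assumed model)."""
    effective = total_hs * accepted / submitted
    lost = total_hs - effective
    return effective, lost


# 5m05s column: 87.08 h/s reported, 8.71 h/s lost, 22 submitted, 20 accepted.
total = 87.08 + 8.71
effective, lost = split_hash_rate(total, submitted=22, accepted=20)
print(round(effective, 2), round(lost, 2))  # 87.08 8.71
```

The 23h04m column does not fit as neatly: the same split gives about 0.82 h/s lost against the reported 0.96 h/s, so the real calculation probably involves more than a share count.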

JayDDee commented 2 years ago

I suggest changing the title to something more specific. It will help for others searching the issues.

JayDDee commented 2 years ago

I'm making a change to how net diff is calculated to use long double (float80) instead of double (float64). This should reduce accumulated error from serial rounding. The FP unit uses 80 bits internally but rounds to 64 bits for double. Using long double for all variables used in the calculation retains the full 80 bits with one rounding at the end. The net result is, hopefully, increased precision in the target calculation.

bonifacio123 commented 2 years ago

Thank you

JayDDee commented 2 years ago

I'm not sure it's a precision issue. I submitted a share in the same pool with a diff of .0006 against a net diff of .0004, and the pool didn't recognize it as solving a block.

It could be a pool issue or specific to that algo. It's going to be difficult to solve because it isn't easy to reproduce. I'm watching for any BLOCKS SOLVED to collect as much info as possible such as the share diff, block height, net_diff, and info from the pool about the block. Maybe a pattern will emerge in time. So far .0007 was accepted as a block. The net diff has gone up so that may make a difference also.

It's easy to spot from the miner: any time a block is solved it should be immediately followed by a new block. If not, the pool didn't consider the block solved.
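That check can be automated against a saved log. The Python sketch below reports the delay between each BLOCK SOLVED line and the next New Block line. How short a delay counts as "immediately" is not specified here, so the sketch only measures the gap and leaves the judgement to the reader. The line formats are assumed from the excerpts in this issue.

```python
import re
from datetime import datetime

STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.*)$")


def solve_to_new_block_delays(lines):
    """Return seconds from each BLOCK SOLVED line to the following New Block line."""
    pending = None  # timestamp of a solve still waiting for a new block
    delays = []
    for line in lines:
        m = STAMP.match(line)
        if not m:
            continue  # continuation lines carry no timestamp
        when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        text = m.group(2)
        if "BLOCK SOLVED" in text:
            pending = when
        elif text.startswith("New Block") and pending is not None:
            delays.append((when - pending).total_seconds())
            pending = None
    return delays


log = [
    "[2021-11-14 08:38:15] 6427 A6368 S59 R0 BLOCK SOLVED 7, 0.371 sec (2189ms)",
    "                      Diff 0.00045304, Block 1262909, Job 2a5",
    "[2021-11-14 08:38:23] New Block 1262910, Net diff 0.00040271, Job 2a6",
]
print(solve_to_new_block_delays(log))  # [8.0]
```

A delay close to the coin's normal block interval, rather than a second or two, would suggest the new block came from someone else.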

JayDDee commented 2 years ago

That share was way higher than the net diff, too much to be a precision error. I'm beginning to suspect something with the pool or coin. At Zergpool/XBTX the diff is often shown as zero. If you look at the coin history, the diff of the solution doesn't look right. I noticed the zero diff at Zpool too, so maybe it's an issue with the coin.

I'm not going to get too excited about it unless it's reported with another coin.

  Bitcoin Subsidium (scryptn2)  0      XBTX  25.228  1 266 245  1m ago          New
  Bitcoin Subsidium (scryptn2)  5      XBTX  0       1 266 244  3m ago          Immature (1/110)
  Bitcoin Subsidium (scryptn2)  5      XBTX  0       1 266 243  5m ago          Immature (1/110)
  Bitcoin Subsidium (scryptn2)  5      XBTX  0       1 266 239  7m ago   party  Immature (1/110)
  Bitcoin Subsidium (scryptn2)  5      XBTX  0       1 266 238  8m ago   party  Immature (1/110)
  Bitcoin Subsidium (scryptn2)  5      XBTX  0       1 266 236  11m ago         Immature (1/110)
  Bitcoin Subsidium (scryptn2)  5.001  XBTX  0       1 266 235  12m ago         Immature (2/110)
  Bitcoin Subsidium (scryptn2)  5      XBTX

  Bitcoin Subsidium (XBTX)  2m ago   1266245                                              25.228322       34.262528   3LhtD8Y...
  Bitcoin Subsidium (XBTX)  4m ago   1266244  5.00031049         226 %  Immature (1/110)  0.000385717527  101.546072  M8C1jQR...  91441f79632faa4b325887ed861cbe7dab6edde46e71ef30b507bda0221b94b1
  Bitcoin Subsidium (XBTX)  6m ago   1266243  5                  353 %  Immature (1/110)  0.000387084039  26.196269   17cUGVb...  3d28949eb1c943f416d3ca17366d7da970da0135eecf4f061e8e9177d069372a
  Bitcoin Subsidium (XBTX)  8m ago   1266239  5           party  49 %   Immature (1/110)  0.000385236915  54.178924   3DMhZ2X...  497f01d8a60f514ce86ab68f5eb7178f4e11fa8719d3a59744f0162d8f1e0893
  Bitcoin Subsidium (XBTX)  9m ago   1266238  5           party  111 %  Immature (1/110)  0.000381265992  344.571335  bc1qttu...  c4e3c6cd842c27873989c22b936e711c6b28d36350e21523fd112f20923973a0
  Bitcoin Subsidium (XBTX)  12m ago  1266236  5                  189 %  Immature (1/110)  0.000378147042  51.997509   ltc1qjx...  507c5148fc6442d75b87723cc33fae406fec0aa043a9617e250d9ed60bbdbad7
  Bitcoin Subsidium (XBTX)  13m ago  1266235  5.00092043         200 %  Immature (2/110)  0.000380489641  30.844153   bc1qqet...  6bf7d7615486543417bb5333f79f43993ddf29ec943b05f003cf054c3ae9f0e6

 

JayDDee commented 2 years ago

The underlying question is whether miners are getting cheated for solved blocks. This can be tested by comparing the block TTF with the actual rate blocks are found. The longer the time the more statistically valid the result. If there is a negative bias we may be losing blocks.
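A minimal Python sketch of that comparison follows, assuming block finds behave as a Poisson process (a common model for mining, not something established in this thread). It converts elapsed time and block TTF into an expected block count, then asks how likely it is to see a given number of confirmed blocks or fewer. The `confirmed` value is hypothetical, and the TTF is treated as constant even though it moves with net diff and hash rate.

```python
import math


def expected_blocks(elapsed_h, ttf_h):
    """Expected number of blocks found in elapsed_h hours at a fixed TTF."""
    return elapsed_h / ttf_h


def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson-distributed count with mean lam."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))


# Figures from the first post: 23h04m of mining, block TTF 5h14m.
lam = expected_blocks(23 + 4 / 60, 5 + 14 / 60)  # about 4.4 blocks expected
confirmed = 2  # hypothetical: blocks the pool actually credited
print(f"expected {lam:.2f}, P(X <= {confirmed}) = {poisson_cdf(confirmed, lam):.3f}")
```

A small probability sustained over a long period would support the negative bias described above. With only a few expected blocks the test has little power, which is why a longer window matters.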

bonifacio123 commented 2 years ago

Thanks for looking at this JayDDee - I will close this issue.