JayDDee / cpuminer-opt

Optimized multi algo CPU miner

AVX2 version of "anime" algo produces rejects #236

Closed YetAnotherRussian closed 4 years ago

YetAnotherRussian commented 4 years ago

[2020-02-05 10:51:00] New stratum diff 0.01, block 6205750, job b9cf anime: miningbase.tk:3033 Diff: Net 0.778, Stratum 0.01, Target 0.01
[2020-02-05 10:51:00] 8 miner threads started, using 'anime' algorithm.
[2020-02-05 10:51:10] 1 submitted by thread 0, lane 1, job b9cf
[2020-02-05 10:51:10] 1 Accepted 1 S0 R0 B0, 9.493 sec (3ms) Diff 0.0178 (2.29), Block 6205750, Job b9cf
[2020-02-05 10:51:20] 2 submitted by thread 5, lane 0, job b9cf
[2020-02-05 10:51:20] 2 A1 S0 Rejected 1 B0, 10.634 sec (4ms) Diff 0.0882 (11.3), Block 6205750, Job b9cf
[2020-02-05 10:51:20] Reject reason: Low difficulty share Share diff: 0.0882113, Hash: 0000000b5614178400000000... Target diff: 2.46075e-316, Targ: ...
[2020-02-05 10:51:27] New job b9d0

After an hour I got these stats on the pool side: Reject* 17.8%

That's a rather high percentage - I could be banned soon. My tests show that non-avx builds work properly. I've also checked my own linux builds and the rejects are there, too. Btw, AVX2 gives around a 1.5x hashrate improvement...

As far as I can see, these are not "stale" shares (which they could be, due to the very short block times).

Test data (wallet is not mine - test one provided by pool):

-a anime -o stratum+tcp://miningbase.tk:3033 -u AJ7q7fXheNfnt89bw4XQJekqUFaBoVf3c9 -p c=ANI

JayDDee commented 4 years ago

Here's a bit about low difficulty shares. You aren't losing anything, you're just submitting shares that you should have discarded. The only effects are the noise and the risk of the pool blacklisting you.

https://bitcointalk.org/index.php?topic=1326803.msg53705296#msg53705296

Is this a new problem or just newly discovered?

It should be simple to fix given that lesser architectures work. There must be a targeting mismatch between the 2 architectures' versions.

This line bothers me:

Target diff: 2.46075e-316, Targ: ...

Is that a copy/paste issue or is the target diff really 2.46075e-316?

You can verify the low difficulty by comparing the share diff with the target diff for the block. A good share should have a higher (or equal) share diff than the target diff.

After re-reading my own notes I remembered the connection with the stratum diff set by the pool. Please note the stratum diff when you get intermittent low-diff and when you get 100% valid. It may not be the CPU architecture making the difference, it may be a different stratum diff.

JayDDee commented 4 years ago

Actually this is excellent news. Anime does not use the new aggressive targeting I discussed in BCT. The AVX2 targeting and share verification code is the old version on both targets.

It means the low-difficulty shares are not caused by my code change and are most likely a pool issue. Is this pool yiimp based? The only times I've seen the problem is on yiimp based pools, although most of my mining is also on yiimp based pools.

Edit: you can also confirm it's a pool issue by trying a different miner. TPruvot doesn't support Anime but I think I got the code here... https://github.com/Animecointeam/Anime-CPUMiner

YetAnotherRussian commented 4 years ago

@JayDDee ofc I've tested this issue on non-avx2 build of cpuminer-opt. You can easily "Ctrl+F" on the attached file to find "rejected" - no results. stats.txt

It's a (rather) large period of time logged in that file without rejects. And if I start the avx2 build just right now (a few minutes after the end of the log), I see:

[image: rej_new]

Oops, 2 of 11. Same settings, "-t 8", nothing else changed in bat file.

I can provide more tests if I could identify what to test more :D

Pool is a Yiimp one, but no issues with 2 other miners (CUDA & OpenCL).

Update: testing the generic miner (which is very slow btw): minerd

Will report the results later.

JayDDee commented 4 years ago

It's looking like the issue I've seen.

Pay close attention to the stratum diff changes. If you get low-diff rejects try setting a different stratum diff. You may find the low-diff always occurs with a specific stratum diff as described in the BCT link.

Since it's working with Cuda and OCL it's likely a miner issue and not a pool issue. There must be a bug in my share validation, but as I said I'd rather err with submitting low-diff than discard potentially valid shares. Fixing this will require a lot of care.

I've seen it on 3 algos, anime makes 4, and each has its own poisoned stratum diff, with all others working fine.

It's been an interesting day, two bizarre problems.

Something else you can do is check if the hashrate reported at the pool is reduced when you are submitting low-diff shares.

JayDDee commented 4 years ago

I have a new release almost ready to go so this issue will probably have to wait until the next one.

JayDDee commented 4 years ago

The next release will be out shortly and will contain some extra data gathering to help with this issue.

When a low difficulty share is reported the debug code will be activated. On subsequent share submissions (all share submissions, we don't know ahead of time which shares might be rejected) the actual hash and target values used in the submission will be displayed. The value is essentially the reciprocal of the diff so the hash value should be less than the target.

This will provide a full trace audit of the process.

The block report will display the target diff. When a share is submitted the submitted hash and the target tested against are displayed. The hash should be less than the target and the target should match the target diff in the block report.
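The hash-versus-target check described above can be sketched as a word-by-word comparison of two 256-bit values. This is only an illustration, not the miner's actual code: the name `hash_meets_target` is hypothetical, the word order follows the `Hash[7:0]` debug lines (index 7 is the most significant word), and treating an exact tie as a pass is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compare two 256-bit values stored as eight 32-bit
 * words, index 7 being the most significant word (as in the debug logs).
 * Returns true when hash <= target. */
bool hash_meets_target(const uint32_t hash[8], const uint32_t target[8])
{
    assert(hash != NULL && target != NULL);
    for (int i = 7; i >= 0; i--) {
        if (hash[i] < target[i]) return true;   /* decided by a higher word */
        if (hash[i] > target[i]) return false;
    }
    return true;                                /* all words equal: a tie */
}
```

Because the comparison works on the raw words, it involves no floating point and cannot suffer from rounding, unlike the diff values shown in the logs.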

When a reply is received rejecting the share the hash and target values will be displayed again and should match what was submitted. The share and target diffs are also displayed for added points of verification.

Everything should match up but something obviously won't: either the test passed when the share hash was clearly higher than the target, or the target changed, or the share changed.

What is important is what mismatched and by how much.

The math for converting a 32 byte hash or target value to the corresponding diff is a simple reciprocal but it gets tricky with 256 bit numbers on a 64 bit machine.

The converter uses double precision, which should give 52 precise bits before error creeps in. This error can cause false negative and false positive results when the hash and target values are very close.
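The conversion and its precision limit can be illustrated with a short sketch. The function names are hypothetical, and the scale constant assumes the conventional difficulty-1 target of 0xFFFF * 2^208; the miner's real converter may differ in detail.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Collapse eight 32-bit words (index 7 most significant) into a double.
 * Only the leading significant bits survive; low words are rounded away. */
double words_to_double(const uint32_t w[8])
{
    assert(w != NULL);
    double v = 0.0;
    for (int i = 7; i >= 0; i--)
        v = v * 4294967296.0 + (double)w[i];   /* shift by 32 bits, add word */
    return v;
}

/* Difficulty as the reciprocal of the 256-bit value, scaled by the assumed
 * difficulty-1 target 0xFFFF * 2^208. */
double words_to_diff(const uint32_t w[8])
{
    const double diff1 = 65535.0 * 0x1p208;
    double v = words_to_double(w);
    if (v == 0.0)
        return INFINITY;                        /* avoid dividing by zero */
    return diff1 / v;
}
```

With top words 0x00000063 and 0xff9c0000 and all lower words zero, this sketch returns 0.01. Two values that differ only in their lowest word convert to the same double, which is exactly the kind of collapse that makes a diff-based comparison unreliable when hash and target are nearly equal.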

This is my area of focus with aggressive submission. I want to ensure that a close share gets submitted in case it's a false negative. If it isn't, it will be rejected but that's better than being too conservative trying to avoid any low diff rejects.

I will verify the math but if the rejects are close to passing I don't think I'll try to fix it. If close fails are being submitted it might mean it's also submitting close valid shares that would have failed a more stringent test and been discarded.

All the above ignores the association with a specific stratum difficulty. This will likely be important in eventually understanding the root cause.

YetAnotherRussian commented 4 years ago

As for that generic version, it has been working for around 10 hours without a single reject.

Something else you can do is check if the hashrate reported at the pool is reduced when you are submitting low-diff shares.

I see hashrate drop (charts) and a separate table cell on the front-end web side with % of rejects.

YetAnotherRussian commented 4 years ago

Upgraded to v3.12.0

cpuminer-zen -a anime -t 8 -o stratum+tcp://miningbase.tk:3033 -u ... -p c=ANI

     **********  cpuminer-opt 3.12.0  ***********
 A CPU miner with multi algo support and optimized for CPUs
 with AVX512, SHA and VAES extensions.
 BTC donation address: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT

CPU: AMD Ryzen 9 3900X 12-Core Processor .
SW built on Feb 5 2020 with GCC 7.3.0.
CPU features: AVX2 AES SHA
SW features: AVX2 AES SHA
Algo features: AVX2 AES

Starting miner with AVX2 AES...

[2020-02-06 10:47:13] 24 CPU cores available, 8 miner threads selected.
[2020-02-06 10:47:13] Extranonce subscribe: YES
[2020-02-06 10:47:13] Stratum connect miningbase.tk:3033
[2020-02-06 10:47:13] Stratum connection established
[2020-02-06 10:47:13] New stratum diff 0.01, block 6208619, job c70c anime: miningbase.tk:3033 Diff: Net 0.82, Stratum 0.01, Target 0.01
[2020-02-06 10:47:13] 8 miner threads started, using 'anime' algorithm.
[2020-02-06 10:47:14] 1 submitted by thread 4, lane 2, job c70c
[2020-02-06 10:47:14] 1 A0 S0 Rejected 1 B0, 1.912 sec (3ms) Diff 0.0119 (1.45), Block 6208619, Job c70c
[2020-02-06 10:47:14] Reject reason: Low difficulty share Share diff: 0.0119158, Hash: 00000053ebcdbc6900000000... Target diff: 2.44133e-316, Targ: ...
[2020-02-06 10:47:16] 2 submitted by thread 0, lane 0, job c70c
[2020-02-06 10:47:16] 2 A0 S0 Rejected 2 B0, 1.757 sec (4ms) Diff 0.011 (1.35), Block 6208619, Job c70c
[2020-02-06 10:47:16] Reject reason: Low difficulty share Share diff: 0.0110245, Hash: 0000005ab4b7e7ee00000000... Target diff: 2.44133e-316, Targ: ...
[2020-02-06 10:47:17] New block 6208620, job c70d anime: miningbase.tk:3033 Diff: Net 0.82, Stratum 0.01, Target 0.01 TTF @ 2666.56 kh/s: block 22m00s, share 0m16s Net TTF @ 880.10 Mh/s: 0m04s
[2020-02-06 10:47:22] 3 submitted by thread 2, lane 1, job c70d
[2020-02-06 10:47:22] 3 A0 S0 Rejected 3 B0, 5.867 sec (3ms) Diff 0.0148 (1.8), Block 6208620, Job c70d
[2020-02-06 10:47:22] Reject reason: Low difficulty share Share diff: 0.0147805, Hash: 00000043a7e71ddb00000000... Target diff: 2.44133e-316, Targ: ...
[2020-02-06 10:47:22] New block 6208621, job c70e anime: miningbase.tk:3033 Diff: Net 0.82, Stratum 0.01, Target 0.01 TTF @ 2506.81 kh/s: block 23m24s, share 0m17s Net TTF @ 880.10 Mh/s: 0m04s

As I see:

[2020-02-06 10:57:34] 1 A0 S0 Rejected 1 B0, 0.213 sec (4ms) Diff 0.0107 (1.3), Block 6208637, Job c722
[2020-02-06 10:57:34] Reject reason: Low difficulty share Share diff: 0.0106814, Hash: 0000005d9ea30bf600000000... Target diff: 2.46076e-316, Targ: ...

UPDATE:

cpuminer-aes-sse42.exe -a anime -t 8 -o stratum+tcp://miningbase.tk:3033 -u ... -p c=ANI
...
Starting miner with SSE2 AES...
[2020-02-06 10:58:00] 24 CPU cores available, 20 miner threads selected.
[2020-02-06 10:58:00] Extranonce subscribe: YES
[2020-02-06 10:58:00] Stratum connect miningbase.tk:3033
[2020-02-06 10:58:00] Stratum connection established
[2020-02-06 10:58:00] 20 miner threads started, using 'anime' algorithm.
[2020-02-06 10:58:02] New stratum diff 0.01, block 6208638, job c724 anime: miningbase.tk:3033 Diff: Net 0.82, Stratum 0.01, Target 0.01
[2020-02-06 10:58:31] 1 submitted by thread 14, job c724
[2020-02-06 10:58:31] Hash[7:0}: 00000057 f83b29c6 5fb6dc14 c928e396 4176acfb a2f36462 3c7d79da 5f693ac9
[2020-02-06 10:58:31] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 10:58:31] 1 Accepted 1 S0 R0 B0, 31.115 sec (3ms) Diff 0.0114 (1.39), Block 6208638, Job c724
[2020-02-06 10:58:41] New block 6208639, job c725

Why don't the AVX2 builds log this? It seems the avx2 versions (with "lanes") use different code.

[2020-02-06 11:05:38] New job c736
[2020-02-06 11:05:50] 1 submitted by thread 12, lane 1, job c736
[2020-02-06 11:05:50] 1 Accepted 1 S0 R0 B0, 20.960 sec (4ms) Diff 0.0153 (1.9), Block 6208654, Job c736
[2020-02-06 11:05:54] New block 6208655, job c737 anime: miningbase.tk:3033 Diff: Net 0.804, Stratum 0.01, Target 0.01 TTF @ 3691.99 kh/s: block 15m35s, share 0m11s Net TTF @ 138.17 Mh/s: 0m25s
[2020-02-06 11:05:56] 2 submitted by thread 17, lane 0, job c737
[2020-02-06 11:05:56] 2 A1 S0 Rejected 1 B0, 6.379 sec (3ms) Diff 0.011 (1.37), Block 6208655, Job c737
[2020-02-06 11:05:56] Reject reason: Low difficulty share Share diff: 0.0110193, Hash: 0000005abfa501cd00000000... Target diff: 2.44457e-316, Targ: ...
[2020-02-06 11:05:59] 3 submitted by thread 17, lane 3, job c737
[2020-02-06 11:05:59] 3 Accepted 2 S0 R1 B0, 3.297 sec (4ms) Diff 0.0106 (1.31), Block 6208655, Job c737

I see no debug info on accepted shares, unlike with the sse42 build.

UPDATE2: avx2 build hangs sometimes right after first submitted share without console output and CPU load... Started 10 times, two times had a hang.

JayDDee commented 4 years ago

I messed up the debug logs. First, it was only implemented for single lane hashing; second, it was left enabled by default instead of being triggered by the first lowdiff share.

But that is only a distraction and has nothing to do with the problem. Don't focus on the different builds, stick with one (one that produces proper logs) and focus on the stratum diff. The stratum difficulty is critical but you never mentioned it once.

I only see stratum diff = 0.01 in your logs. Do you get rejects with different stratum?

I also need to see both the target and hash from the submit and the reject for the same share. There isn't one single instance in your post.

But I did notice something odd, no target value displayed in the reject reason log:

[2020-02-06 11:05:56] Reject reason: Low difficulty share Share diff: 0.0110193, Hash: 0000005abfa501cd00000000... Target diff: 2.44457e-316, Targ: ...

I don't know if it's significant because the submit logs are missing so I can't compare.

JayDDee commented 4 years ago

Maybe a more formal test procedure would help.

  1. For v3.12.0 use the aes-avx build to get the debug logs

  2. Do a normal run and observe when low diff shares are produced intermittently vs when they are not, and the stratum diff at the time.

  3. When shares are rejected note the submit hash and target and compare with the reject hash and target for inconsistencies.

  4. Repeat the test with different stratum difficulty to confirm the problem does or does not occur with specific stratum diff settings.

Edit: other than the log snafu the share testing for anime still uses the old method for all targets. There should be no difference between builds except for the hash rate and the debug logs.

Your note about a hang with AVX2 may be an anomaly. Keep an eye on that issue but don't let it distract from the main issue.

I repeat, the stratum diff is critical. It determines what the target is. I need the stratum diff, the hash and target at the time the share was submitted and the hash and target when it was rejected.

YetAnotherRussian commented 4 years ago

I only see stratum diff = 0.01 in your logs. Do you get rejects with different stratum?

There is no different one sadly. Diff control params are not enabled on the pool side through passing user login data.

I cannot send aes-avx build error logs as there are no rejects with it...

[2020-02-06 18:23:12] 24 CPU cores available, 12 miner threads selected.
[2020-02-06 18:23:12] Extranonce subscribe: YES
[2020-02-06 18:23:12] Stratum connect miningbase.tk:3033
[2020-02-06 18:23:12] 12 miner threads started, using 'anime' algorithm.
[2020-02-06 18:23:12] Stratum connection established
[2020-02-06 18:23:12] New stratum diff 0.01, block 6209525, job cb40 anime: miningbase.tk:3033 Diff: Net 0.769, Stratum 0.01, Target 0.01
[2020-02-06 18:23:23] 1 submitted by thread 5, lane 2, job cb40
[2020-02-06 18:23:23] 1 A0 S0 Rejected 1 B0, 11.454 sec (4ms) Diff 0.0431 (5.6), Block 6209525, Job cb40
[2020-02-06 18:23:23] Reject reason: Low difficulty share Share diff: 0.0430519, Hash: 000000173a38354600000000... Target diff: 2.44133e-316, Targ: ...
[2020-02-06 18:23:27] 2 submitted by thread 4, lane 1, job cb40
[2020-02-06 18:23:27] 2 Accepted 1 S0 R1 B0, 4.135 sec (4ms) Diff 0.0108 (1.41), Block 6209525, Job cb40
[2020-02-06 18:23:41] New job cb41
[2020-02-06 18:23:45] 3 submitted by thread 0, lane 3, job cb41
[2020-02-06 18:23:45] 3 A1 S0 Rejected 2 B0, 17.437 sec (4ms) Diff 0.011 (1.43), Block 6209525, Job cb41
[2020-02-06 18:23:45] Reject reason: Low difficulty share Share diff: 0.0109879, Hash: 0000005b0210865e00000000... Target diff: 2.44133e-316, Targ: ...
[2020-02-06 18:23:48] New block 6209526, job cb42 anime: miningbase.tk:3033 Diff: Net 0.769, Stratum 0.01, Target 0.01 TTF @ 3360.70 kh/s: block 16m23s, share 0m12s Net TTF @ 91.80 Mh/s: 0m36s
[2020-02-06 18:24:10] New block 6209527, job cb43 anime: miningbase.tk:3033 Diff: Net 0.769, Stratum 0.01, Target 0.01 TTF @ 3378.88 kh/s: block 16m18s, share 0m12s Net TTF @ 113.96 Mh/s: 0m29s
[2020-02-06 18:24:14] 4 submitted by thread 9, lane 1, job cb43
[2020-02-06 18:24:14] 4 Accepted 2 S0 R2 B0, 29.087 sec (4ms) Diff 0.0865 (11.2), Block 6209527, Job cb43
[2020-02-06 18:24:19] New block 6209528, job cb44 anime: miningbase.tk:3033 Diff: Net 0.769, Stratum 0.01, Target 0.01 TTF @ 3386.67 kh/s: block 16m15s, share 0m12s Net TTF @ 150.22 Mh/s: 0m22s
[2020-02-06 18:24:20] 5 submitted by thread 2, lane 1, job cb44
[2020-02-06 18:24:20] 5 Accepted 3 S0 R2 B0, 5.767 sec (4ms) Diff 0.0677 (8.8), Block 6209528, Job cb44

I don't know if it's significant because the submit logs are missing so I can't compare.

That's how it is logged in v3.12.0 avx2 windows build:

[image: k]

And on Ubuntu (my build of v3.12.0 with your build.sh):

[image: 5]

Please note I do not substring anything to "..." by myself or by reducing console windows size.

So I cannot send you the data which is... invisible :-P in avx2 build and does not exist in other builds.

Seems that I'll have to remove all substrings with "..." from the source code to proceed further.

JayDDee commented 4 years ago

What is the stratum diff when you use aes-avx with no rejects?

You don't necessarily have to control the stratum diff, just observe what it is when things work or don't work.

This is also important: when you successfully tested other miners, what stratum diff were they using?

YetAnotherRussian commented 4 years ago

Diff seems to be the same, around 0.01 (I think the pool sets it lower than for GPU miners, depending on the user agent string). Any cpu miner gets 0.01 stratum diff.

This log was partially cut-off:

[2020-02-06 21:00:24] Stratum connection established
[2020-02-06 21:00:24] New stratum diff 0.01, block 6209826, job ccaa anime: miningbase.tk:3033 Diff: Net 0.679, Stratum 0.01, Target 0.01
[2020-02-06 21:00:28] New job ccab
[2020-02-06 21:00:33] 1 submitted by thread 1, job ccab
[2020-02-06 21:00:33] 1 Accepted 1 S0 R0 B0, 10.264 sec (6ms)
[2020-02-06 21:00:40] Hash[7:0}: 0000000b 405d1dda 8ac190dd 81f40dff ee4fcac6 696727d0 7cade805 6e2fc6e6 Diff 0.0889 (13.1), Block 6209826, Job ccab
[2020-02-06 21:00:40] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 21:00:53] 2 submitted by thread 2, job ccab
[2020-02-06 21:00:53] 2 Accepted 2 S0 R0 B0, 19.891 sec (4ms) Diff 0.0128 (1.88), Block 6209826, Job ccab
[2020-02-06 21:01:39] 4 submitted by thread 8, job ccab
[2020-02-06 21:02:17] Hash[7:0}: 0000004e 4ad01894 76044dce fe7ac611 9554efcf a475ce63 7610f5bf 5953d6be
[2020-02-06 21:01:28] 3 submitted by thread 3, job ccab
[2020-02-06 21:02:17] Hash[7:0}: 00000050 e4cf9e5c 02b3ea59 72abcf5d a841d865 c320bed4 65ba3cc5 06b23f9d
[2020-02-06 21:02:17] Hash[7:0}: 00000054 cc4369ae a973e261 2e517c15 af5bec30 7e010347 1bac6b68 1c3beeac
[2020-02-06 21:02:17] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 21:02:17] 3 Accepted 3 S0 R0 B0, 34.689 sec (49233ms) Diff 0.0124 (1.82), Block 6209826, Job ccab
[2020-02-06 21:02:17] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 21:02:17] 5 submitted by thread 6, job ccab
[2020-02-06 21:02:17] Hash[7:0}: 00000002 645bcf8e 7d5d157d c4e82053 9f56de3b d244c4cb 5355805b c1be86b8
[2020-02-06 21:02:17] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 21:02:17] New job ccac
[2020-02-06 21:02:17] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 21:02:30] 6 submitted by thread 11, job ccae
[2020-02-06 21:02:30] 6 Accepted 4 S2 R0 B0, 12.547 sec (5ms)
[2020-02-06 21:02:30] Hash[7:0}: 00000063 6fd0672e ea78d4f7 91f85e76 6df045aa d865d4db 33626580 a208eb6a Diff 0.0101 (1.48), Block 6209828, Job ccae
[2020-02-06 21:02:30] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 21:02:35] 7 Accepted 5 S2 R0 B0, 5.828 sec (4ms)
[2020-02-06 21:02:36] Hash[7:0}: 0000000b 4d6660bd 32558052 b6f59ded 5eb34021 ff235637 5dd605af 262a8679 Diff 0.0885 (13), Block 6209828, Job ccae
[2020-02-06 21:02:36] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 21:02:45] 8 Accepted 6 S2 R0 B0, 10.050 sec (5ms)
[2020-02-06 21:02:46] Hash[7:0}: 00000016 4192a466 be13e9df 5acf6c1e fe00f2c2 a09a9180 5075345c 338ec1a3
[2020-02-06 21:02:46] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000

So it's 0.01. I do not see any stratum diff changes during several hours.

JayDDee commented 4 years ago

If I understand correctly, with aes-avx and stratum diff 0.01 you get no rejects but with avx2 and the same stratum diff you get occasional low diff.

Is that correct? This is not what I expected.

I really need to capture the hash and target at submit time for a share that will be rejected. Due to my mistake it doesn't get displayed with the avx2 build.

It will have to wait until I can release a fix or, if you're willing, you can make the code change and compile and test on Linux. The change is very simple, it involves copying a few lines of code from one function to another.

In file cpu-miner.c search for function submit_solution.

It contains the following block of code:

if ( lowdiff_debug )
{
   // Dump the hash and target, most significant 32-bit word first.
   uint32_t* h = (uint32_t*)hash;
   uint32_t* t = (uint32_t*)work->target;
   applog(LOG_INFO,"Hash[7:0}: %08x %08x %08x %08x %08x %08x %08x %08x",
          h[7],h[6],h[5],h[4],h[3],h[2],h[1],h[0]);
   applog(LOG_INFO,"Targ[7:0}: %08x %08x %08x %08x %08x %08x %08x %08x",
          t[7],t[6],t[5],t[4],t[3],t[2],t[1],t[0]);
}

Immediately following submit_solution is an almost identical function submit_lane_solution.

Copy the missing block of code from submit_solution to submit_lane_solution so they look identical except for the lane argument. It should look like this:

bool submit_lane_solution( struct work *work, const void *hash,
                           struct thr_info *thr, const int lane )
{
   if ( likely( submit_work( thr, work ) ) )
   {
      submitted_share_count++;
      work_set_target_ratio( work, hash );
      if ( !opt_quiet )
         applog( LOG_NOTICE, "%d submitted by thread %d, lane %d, job %s",
                 submitted_share_count, thr->id, lane, work->job_id );

      if ( lowdiff_debug )
      {
         // Dump the hash and target, most significant 32-bit word first.
         uint32_t* h = (uint32_t*)hash;
         uint32_t* t = (uint32_t*)work->target;
         applog(LOG_INFO,"Hash[7:0}: %08x %08x %08x %08x %08x %08x %08x %08x",
                h[7],h[6],h[5],h[4],h[3],h[2],h[1],h[0]);
         applog(LOG_INFO,"Targ[7:0}: %08x %08x %08x %08x %08x %08x %08x %08x",
                t[7],t[6],t[5],t[4],t[3],t[2],t[1],t[0]);
      }
      return true;
   }
   else
      applog( LOG_WARNING, "%d failed to submit share.", submitted_share_count );
   return false;
}

You can also just copy and paste the above to replace the existing function.

If you want to give it a try I can help walk you through it. It's faster than waiting for a new release.

YetAnotherRussian commented 4 years ago

If I understand correctly, with aes-avx and stratum diff 0.01 you get no rejects but with avx2 and the same stratum diff you get occasional low diff. Is that correct? This is not what I expected.

Exactly! These rejects sometimes are as high as 30-35% of the total share count. All avx2-enabled builds affected.


Frankly speaking, I'm not a C/C++ person (I work in a C# .NET environment in VS only, and ofc web development), but I could try to debug the submit/receive side through a web proxy (Win env).

Upd:

AVX2 version output with your fix:

[2020-02-06 11:48:54] Hash[7:0}: 00000040 69c6c355 315dfbf4 4d504bb6 992736b0 d07e2f56 7ed5c73f a7d65cf3
[2020-02-06 11:48:54] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 11:48:54] 1 Accepted 1 S0 R0 B0, 25.156 sec (3ms) Diff 0.0155 (2.03%), Block 6210038, Job cda9
[2020-02-06 11:49:17] New block 6210039, job cdaa anime: miningbase.tk:3033 Diff: Net 0.766, Stratum 0.01, Target 0.01 TTF @ 2466.84 kh/s: block 22m13s, share 0m17s Net TTF @ 137.02 Mh/s: 0m24s
[2020-02-06 11:49:30] 2 submitted by thread 2, lane 0, job cdaa
[2020-02-06 11:49:30] Hash[7:0}: 0000000c 17fff654 173375e1 b4d00d1d 8ef8c030 6dd43c73 57fa0882 4457eb74
[2020-02-06 11:49:30] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 11:49:30] 2 A1 S0 Rejected 1 B0, 36.639 sec (4ms) Diff 0.0827 (10.8%), Block 6210039, Job cdaa
[2020-02-06 11:49:30] Reject reason: Low difficulty share Share diff: 0.0826861, Hash: 0000000c17fff65400000000... Target diff: 0.0826861, Targ: 00000063ff9c000000000000...
[2020-02-06 11:49:41] New block 6210040, job cdab anime: miningbase.tk:3033 Diff: Net 0.766, Stratum 0.01, Target 0.01 TTF @ 2473.05 kh/s: block 22m09s, share 0m17s Net TTF @ 137.02 Mh/s: 0m24s
[2020-02-06 11:49:54] 3 submitted by thread 5, lane 1, job cdab
[2020-02-06 11:49:54] Hash[7:0}: 0000004f bf56d80f 4c304fd5 52d05495 954f23b3 a906b7c8 c7f30b11 1916e80e
[2020-02-06 11:49:54] Targ[7:0}: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 11:49:54] 3 Accepted 2 S0 R1 B0, 23.378 sec (3ms) Diff 0.0125 (1.64%), Block 6210040, Job cdab
[2020-02-06 11:50:00] New block 6210041, job cdac

[image: op]

JayDDee commented 4 years ago

Good data!

I don't know why it was rejected, everything looks good. Even in the reject report the hash is below target and the share diff is >= to the target diff.

Since there is a clear point of deviation between AVX and AVX2 I have another request. Swap the submit functions between them.

In algo/quark/anime.c (used by AVX) replace submit_solution with submit_lane_solution, use any number for the lane arg.

In algo/quark/anime-4way.c (used by AVX2) replace submit_lane_solution with submit_solution.

This is the only code that is different between them that also affects share submission. The hashing code is also different but it has no effect on how the hash is submitted.

If that doesn't produce a breakthrough I would like a longer test with more samples so I can look for patterns in the hash that is accepted vs rejected. It doesn't have to be very long but 4 samples isn't enough.

JayDDee commented 4 years ago

I didn't notice you posted a url to test with. I'm testing now and reproduced the problem. I think I can take over from here.

I was able to set the stratum diff by adding d=0.08 to the password but I still see low diff shares.

JayDDee commented 4 years ago

These 2 shares tell an interesting story. The target diff is 0.01 yet a share with diff 0.0352 was rejected while a share with diff 0.0117 was accepted.

Rejecting a share with 3x the target diff is not a small error, and it's inconsistent at that.

It defies logic. And then it's only the AVX2 build, AES-AVX works fine.

I'm going to swap submit functions now.

[2020-02-06 18:18:43] 12 submitted by thread 4, lane 1, job cfaa
[2020-02-06 18:18:43] Hash[7:0]: 00000055 696c2241 e49fb567 2ae48985 55d78094 7b2609ef bb1c4987 f7710f2c
[2020-02-06 18:18:43] Targ[7:0]: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 18:18:43] 12 Accepted 8 S0 R4 B0, 2.763 sec (136ms) Diff 0.011708 (2.06%), Block 6210467, Job cfaa
[2020-02-06 18:18:58] 13 submitted by thread 0, lane 2, job cfaa
[2020-02-06 18:18:58] Hash[7:0]: 0000001c 66a6d643 7ec434c4 d1c87a8c 99ba9bc3 1586aab8 eca0f848 38b52f2e
[2020-02-06 18:18:58] Targ[7:0]: 00000063 ff9c0000 00000000 00000000 00000000 00000000 00000000 00000000
[2020-02-06 18:18:58] 13 A8 S0 Rejected 5 B0, 14.972 sec (138ms) Diff 0.03521 (6.19%), Block 6210467, Job cfaa
[2020-02-06 18:18:58] Reject reason: Low difficulty share Share diff: 0.0352095, Hash: 0000001c66a6d64300000000... Target diff: 0.0352095, Targ: 00000063ff9c000000000000...

JayDDee commented 4 years ago

I'm starting to suspect it's a pool issue. Is there another pool you can try?

Except for the fact that the avx build works everything else points to a pool issue rejecting valid shares. From all the data the miner is only submitting valid shares, I saw one with share diff 50x the target. That's ridiculous.

The shares are actually valid hashes, otherwise they would be rejected as invalid. That the rejection was for low difficulty means it was good but not good enough to hit the target.

The way this works in the miner is each algo has a hard coded target factor. The pool sends the stratum diff and the miner uses it and the target factor to calculate the target diff and target hash.

The target hash is used to validate the share hash, if it passes the nonce for the share is submitted.
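The step from stratum diff to target hash can be sketched as follows. This is an illustration under stated assumptions rather than the miner's own routine: the function name is hypothetical, dividing the stratum diff by the target factor is an assumed way of combining them, and the scale constant assumes the conventional difficulty-1 target of 0xFFFF * 2^208.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: derive a 256-bit target (eight 32-bit words, index 7
 * most significant) from a pool-supplied stratum diff and a per-algo factor. */
void sketch_diff_to_target(uint32_t target[8], double stratum_diff,
                           double target_factor)
{
    assert(target != NULL && stratum_diff > 0.0 && target_factor > 0.0);
    double diff = stratum_diff / target_factor;   /* assumed combination */
    int k = 6;
    /* Move down one 32-bit word for every factor of 2^32 in the difficulty. */
    while (k > 0 && diff > 1.0) {
        diff /= 4294967296.0;
        k--;
    }
    double q = 4294901760.0 / diff;               /* 0xFFFF0000 / diff */
    if (q >= 18446744073709551616.0) {            /* would overflow 64 bits */
        memset(target, 0xff, 8 * sizeof(uint32_t));
        return;
    }
    uint64_t m = (uint64_t)q;
    memset(target, 0, 8 * sizeof(uint32_t));
    target[k]     = (uint32_t)m;                  /* low half of the quotient */
    target[k + 1] = (uint32_t)(m >> 32);          /* high half */
}
```

Called with a stratum diff of 0.01 and a factor of 1, the sketch produces the words 00000063 ff9c0000 followed by zeros, the same target that appears in the debug logs in this thread.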

The pool hashes the nonce, which is just part of the original data that the miner can play with, to determine if the hash is valid. Only if the first test passes does it check the difficulty and job id. If it's rejected for low diff or stale job it means the hash was otherwise valid.

It's statistically impossible for the miner to submit a corrupt share that passes validation at the pool.

There's no pattern to the rejected share diff. Some very high diff shares get rejected but some that barely meet the target get accepted.

I've analyzed the miner code and there's nothing in the anime code or common code that can explain this.

I've tested with both the old and new hash test and it didn't make a difference. The old one is used by almost every other miner.

If it wasn't for the fact the rejects don't occur with the AVX build I'd be absolutely convinced it was a pool problem.

I don't know what to do from here. The only thing I can think of is to try another pool to see if it happens there.

JayDDee commented 4 years ago

Another interesting note is I modified both the AVX and AVX2 versions to use the new hash test and it made no difference. AVX still works and AVX2 doesn't. Nothing I have done has changed that.

YetAnotherRussian commented 4 years ago

@JayDDee maybe give a try to force the compiler not to vectorize some key places in the avx2 code when building to avx2 arch? This is rather simple, together with switching on vectorization report in GCC (I'm not sure about GCC report, as I use only Intel and MSVC compilers in my small and easy 20-lines-of-code C/C++ dllimports to C# code). As for my main language volatile things just do magic sometimes :-)

JayDDee commented 4 years ago

There's nothing special about anime, it's very similar to quark and many other algos. It's not a problem with the hash, the data proves that. All submitted shares are valid but some are being rejected by the pool, some of them orders of magnitude better than the target.

If the miner was really submitting low diff shares the accepted hash rate at the pool would be unaffected. In effect the miner would still be submitting a normal rate of valid shares but also shares it should discard. You said the pool rate was lower, which means some of the miner's valid shares are being rejected.

A non vectorized version already exists, it's the AVX version. The compiler can't code parallel data streams in a single CPU thread, I did that and the compiler can't undo it.

As far as I'm concerned it's a pool issue until new data show otherwise.

YetAnotherRussian commented 4 years ago

I see something interesting on the main page of https://miningbase.tk, top-left corner:

cpuminer-opt has been updated, the bug with AVX2 fixed. See links section below.

Is this something you've already fixed for this algo the days before?

JayDDee commented 4 years ago

I know nothing about this. There's no mention of the version so I have no idea how long it's been there. v3.9.1 had such a fix.

BTW have you tried older versions of cpuminer-opt?

YetAnotherRussian commented 4 years ago

Haha, got it! Tested avx2-zen build history:

latest zen - invalid shares
...
3.9.3.1 - invalid shares
3.9.2.5 - invalid shares
3.9.2.4 - works
3.9.2.3 - works
...
v3.9.1 zen - works (fix was made in this version)

Something was broken in version 3.9.2.5. Not a pool issue!

That test was done with a special bat file which is able to launch 12 versions with -t 2 in parallel (CPU almost died).

JayDDee commented 4 years ago

Good work. I'll dig in to the changes in v3.9.2.5

Confirmed it broke in v3.9.2.5. Interestingly there were no changes to anime targeted code in v3.9.2.5.

JayDDee commented 4 years ago

I have a fix.

It looks like a pair of vector utilities have their polarity reversed. I swapped their usage in anime-4way and it works.

A new release should be out in a few hours.

This will need follow up because other algos use it with no problems, meaning they have probably already compensated for the bug and the fix would break them.

There was some serious misdirection with this problem. I would have expected invalid shares considering the bug. Low difficulty led me away from the hashing code and toward the validation and submission. It was only because AVX worked flawlessly that my mind was kept partially open. Your persistence paid off.

JayDDee commented 4 years ago

cpuminer-opt-3.12.1 is released. Please test and report any problems.

YetAnotherRussian commented 4 years ago

I confirm this algo is successfully fixed. Hope others were not affected (I just do not have credentials for all of those affected by the code changes in 3.12.1). Some of the algos among those do work. Thanks.

JayDDee commented 4 years ago

The follow up will be done off line to ensure the utility is logically correct on all targets and any algo using it in reverse polarity is logically corrected.

Thanks for your testing. Closing.

JayDDee commented 4 years ago

Closing.

JayDDee commented 4 years ago

I have made the correction to the utilities, only AVX2 affected, and corrected the usage in quark and hmq1725, and reverted the anime usage to its previously "correct" usage. Will be in the next release.

JayDDee commented 4 years ago

This issue doesn't seem to want to go away. It turns out one of the utilities is correct but the other always returns true. It doesn't cause rejected shares but it lowers performance.

The whole point of that code was to skip code whose results will be later discarded to save time. Testing has been difficult due to my CPU throttling causing hashrate fluctuations.

I'm testing a solution that seems to be working properly. No rejects and consistently higher hashrates than the current release.

More testing is required before release.

JayDDee commented 4 years ago

Another twist. I discovered "the right way to do it" using the intrinsic _mm256_movemask_epi8 to convert the 32 byte vector mask to a 32 bit integer mask. In AVX512 this is transparent: the cmp instruction returns an integer bit mask, but AVX2 cmp returns a vector mask that must be explicitly converted to an integer bit mask for easier testing.

It seems to be working fine and the AVX2 code now looks more like the AVX512 code.
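The vector-mask to integer-mask conversion described above can be sketched as follows. This is a minimal illustration and not the code from the release: the function names are hypothetical, the 64-bit lane width is an assumption, and the `target("avx2")` attribute is a GCC/Clang extension used here so the snippet builds without `-mavx2` (it still needs an AVX2-capable CPU to run).

```c
#include <assert.h>
#include <immintrin.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compare four 64-bit lanes and return the result as a 32-bit integer mask:
 * one bit per byte, so each matching lane contributes eight set bits. */
__attribute__((target("avx2")))
uint32_t lanes_equal_bytemask(const uint64_t a[4], const uint64_t b[4])
{
    assert(a != NULL && b != NULL);
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    __m256i eq = _mm256_cmpeq_epi64(va, vb);     /* vector mask: all-ones per equal lane */
    return (uint32_t)_mm256_movemask_epi8(eq);   /* integer mask: top bit of every byte */
}

/* With an integer mask, both polarities become plain integer tests. */
bool any_lane_set(uint32_t mask)   { return mask != 0; }
bool any_lane_clear(uint32_t mask) { return mask != 0xffffffffu; }
```

Once the mask is an ordinary integer, "does any lane need this step" and "does any lane skip it" are two distinct comparisons, which leaves much less room for a reversed-polarity helper than testing the vector mask directly.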

Still testing.

JayDDee commented 4 years ago

cpuminer-opt-3.12.3 is released with a redesign of the code responsible for this issue

YetAnotherRussian commented 4 years ago

Great! Works as expected (with a visible hashrate improvement for avx2). Cannot test AVX512 build (no such CPUs).

JayDDee commented 4 years ago

I think I can finally close this for good.