JayDDee / cpuminer-opt

Optimized multi algo CPU miner

scrypt is slow #349

Closed scorpim closed 2 years ago

scorpim commented 2 years ago

Hi, I like your program. Good info, colorful.

I wonder: since you are using Pooler's asm code, is it actually implemented or is it just in the repo? I tested this and found that it hashes at around 5.5 kH/s. With Pooler's program it hashes at 13 kH/s.

Is this a bug? I haven't compared sha256 yet.

JayDDee commented 2 years ago

More details please.

scorpim commented 2 years ago

Using Windows 10, Intel - SSE2

I used these for the test, no registration needed:

cpuminer-sse2 --hash-meter --algo=scrypt --url=stratum+tcp://litecoinp2pool.com:9327 --user=YOURWALLET --pass=ANY

minerd --algo=scrypt --url=stratum+tcp://litecoinp2pool.com:9327 --user=YOURWALLET --pass=ANY

[Screenshots: opti, pooler]

Hope this helps

JayDDee commented 2 years ago

You need to test longer and wait for the reference hashrate in the summary log to stabilize. Check the wiki for an explanation of the hash rates. Also include the CPU info; it makes a big difference. ASM doesn't respect the compiler settings, so if the CPU has AVX2, Pooler will use it even if the build is compiled for only SSE2.

scorpim commented 2 years ago

I have actually tested it for days, even on different pools, letting each run for 24 hours. It's always the same result. It's an old CPU; it doesn't have AVX. I don't compile it myself, I just download the exe files: pooler-cpuminer-2.5.1-win64.zip and cpuminer-opt-3.19.1-windows.

It was the same low hashrate with cpuminer-opt-3.18.2-windows

If I try to use any other program, like cpuminer-avx or anything other than sse2, it crashes very badly.

Pooler wrote the code for gcc. I use Visual Studio 2022, that's ml64. The operand order is reversed:
gcc: OPCODE source, destination
masm: OPCODE destination, source
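The operand-order difference described above can be illustrated with a small sketch. It is only a toy under stated assumptions: the function name `att_to_intel` is hypothetical, it handles just two-operand instructions with register (`%reg`) or immediate (`$imm`) operands, and it does not translate memory operands or size-suffixed mnemonics, so it is nowhere near a real gcc-to-MASM porting tool.

```python
import re


def strip_sigil(operand: str) -> str:
    """Drop the AT&T '%' (register) or '$' (immediate) prefix."""
    return re.sub(r"^[%$]", "", operand)


def att_to_intel(line: str) -> str:
    """Reorder a simple 'op src, dst' instruction into 'op dst, src'.

    Assumption: exactly two operands, registers or immediates only.
    """
    mnemonic, operands = line.strip().split(None, 1)
    if "(" in operands:
        raise ValueError("memory operands are outside this sketch")
    src, dst = [op.strip() for op in operands.split(",")]
    # AT&T (gcc): op source, destination -> Intel (masm): op destination, source
    return f"{mnemonic} {strip_sigil(dst)}, {strip_sigil(src)}"


print(att_to_intel("movdqa %xmm0, %xmm1"))
print(att_to_intel("pslld $7, %xmm0"))
```

Real assembly files also differ in directives, macros and addressing syntax, which is why a mechanical swap like this is not enough to port a whole file.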

If the asm code works for Pooler, why not for opti? Pooler's program before the asm upgrade had the very same low hashrate for scrypt as opti has now. That's why I wonder if it's implemented.

May I give you a hint ? Intel... Why reinvent the wheel? :) https://github.com/intel/ipp-crypto/tree/develop/sources/ippcp/asm_intel64

The instruction set for my CPU is:

GenuineIntel Pentium(R) Dual-Core CPU E5300 @ 2.60GHz
3DNOW not supported
3DNOWEXT not supported
ABM not supported
ADX not supported
AES not supported
AVX not supported
AVX2 not supported
AVX512CD not supported
AVX512ER not supported
AVX512F not supported
AVX512PF not supported
BMI1 not supported
BMI2 not supported
CLFSH supported
CMPXCHG16B supported
CX8 supported
ERMS not supported
F16C not supported
FMA not supported
FSGSBASE not supported
FXSR supported
HLE not supported
INVPCID not supported
LAHF supported
LZCNT not supported
MMX supported
MMXEXT not supported
MONITOR supported
MOVBE not supported
MSR supported
OSXSAVE supported
PCLMULQDQ not supported
POPCNT not supported
PREFETCHWT1 not supported
RDRAND not supported
RDSEED not supported
RDTSCP not supported
RTM not supported
SEP supported
SHA not supported
SSE supported
SSE2 supported
SSE3 supported
SSE4.1 not supported
SSE4.2 not supported
SSE4a not supported
SSSE3 supported
SYSCALL not supported
TBM not supported
XOP not supported
XSAVE supported

The cpp code to produce this list is from Microsoft: https://docs.microsoft.com/en-us/cpp/intrinsics/cpuid-cpuidex
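A feature dump like the one above can be checked against what a given build needs, which would explain why the non-SSE2 executables crash on this CPU. The following is a minimal sketch: the helper names and the `BUILD_REQUIREMENTS` table are assumptions made for illustration, not the actual requirements of the released binaries, and the dump is an excerpt of the list above.

```python
import re

# Excerpt of the dump above, in the same "NAME supported / NAME not supported" form.
DUMP = (
    "AES not supported AVX not supported AVX2 not supported "
    "SSE supported SSE2 supported SSE3 supported "
    "SSE4.1 not supported SSE4.2 not supported SSSE3 supported"
)

# Hypothetical requirement table, for illustration only.
BUILD_REQUIREMENTS = {
    "cpuminer-sse2": {"SSE2"},
    "cpuminer-avx": {"SSE2", "AVX"},
}


def parse_feature_dump(text: str) -> set:
    """Return the names of all features reported as 'supported'."""
    pairs = re.findall(r"(\S+) (not supported|supported)", text)
    return {name for name, status in pairs if status == "supported"}


def missing_features(build: str, cpu_features: set) -> set:
    """Features the build needs (per the assumed table) that the CPU lacks."""
    return BUILD_REQUIREMENTS[build] - cpu_features


cpu = parse_feature_dump(DUMP)
for build in BUILD_REQUIREMENTS:
    print(build, "missing:", sorted(missing_features(build, cpu)))
```

A build with a non-empty "missing" set will typically hit an illegal-instruction fault as soon as it executes code the CPU does not implement.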

JayDDee commented 2 years ago

I rewrote the scrypt code in v3.18.0; before that the Pooler ASM code was used. My testing was focused primarily on AVX2 and AVX512. It appears the Pooler code is faster on older CPUs without AVX2.

That is an acceptable compromise IMO; the Pooler code is still available in cpuminer-multi as well as Pooler's own miner.

scorpim commented 2 years ago

Yes, you're right. I just tested cpuminer-multi-rel1.3.1-x64. It runs at 12.1 khash. You are also right about cpuminer-opt-3.17.1; I just tested this too: 12.1 khash.

I looked around on this page to see what miner programs and versions are used: https://zpool.ca/algo/scrypt. One miner is using 3.17.1; now I know why he is not upgrading :)

Intel IPP has no scrypt file (I think) but has other cryptos. Who knows better how to write asm for Intel CPUs than Intel :)

I will downgrade to 3.17.1 and let it run for 24h straight to see what happens. I like your program better. It has user-friendly feedback. Too bad you moved the VS project files to "junk".

While we are here, may I ask if sha256 is the same, or was it rewritten like scrypt? (with SSE2)

JayDDee commented 2 years ago

I tried replacing the Pooler sha256d but couldn't match it with intrinsics so they're only used for sha256t which Pooler doesn't support.

I believe I've implemented every optimization from Pooler, though it's hard to tell because ASM is so hard to read, and I've also implemented some not in Pooler, but I still can't match the Pooler code. Maybe I missed something or maybe the compiler is messing things up.

I also ran into this bizarre issue #344.

scorpim commented 2 years ago

Yes, asm is hard to read, especially if there are not many comments. "Figure it out on your own." This is the biggest problem developers are facing, and it is also the reason why everybody tries to come up with their own code. I do the same thing :)

That pdf file was quite heavy to read. Geeks write like that :) I only got the picture at the end with the C code example :)

I don't know how mining programs or their code work. I can't see through the code, such as how to build the Merkle tree from transactions etc. All the information is written by geeks! Too much technical bla-bla-bla and no C code example. They are writing Python code... I don't get that... Your program has good text feedback. Something people understand. Easy...

When it comes to Pooler's asm, try to reverse it and start from scratch. Big problem, I know. I was thinking of trying Intel's compiler, to see if I can compile your project. I don't know if it understands gcc asm. I'm also getting errors with VS because the *.lib files are compiled with a different compiler.

You could try to upgrade --benchmark. Write a JSON file like "bench.btc.json" with the same info as you get from a pool, but add one more variable like "result". Run it a million times and see if it matches and how long it took. That makes it easier to debug and test from the beginning for different archs like SSE, AVX etc. You can extend it later with "bench.ltc.json" etc. This would also give your users some confidence that your program really is working. They can run the benchmark themselves too, right? cpuminer-opti --benchmark=bench.btc.json
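The benchmark idea above could be sketched roughly like this. Everything about the file layout is an assumption made for illustration (the field names `algo`, `header` and `result` are hypothetical, and no such `--benchmark=<file>` option is known to exist in the miner). The test vector is the Bitcoin genesis block header, and Python's `hashlib` stands in for the miner's own hash code.

```python
import hashlib
import json
import time

# Bitcoin genesis block header, 80 bytes, as hex.
GENESIS_HEADER_HEX = (
    "01000000"                                                          # version
    + "00" * 32                                                         # previous block hash
    + "3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a"  # merkle root
    + "29ab5f49"                                                        # time
    + "ffff001d"                                                        # bits
    + "1dac2b7c"                                                        # nonce
)

# Hypothetical contents of "bench.btc.json": known input plus expected result.
BENCH_BTC_JSON = json.dumps({
    "algo": "sha256d",
    "header": GENESIS_HEADER_HEX,
    "result": "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f",
})


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def run_benchmark(spec: dict, rounds: int = 10_000):
    """Hash the header `rounds` times; return (all_matched, hashes_per_second)."""
    header = bytes.fromhex(spec["header"])
    # Block hashes are displayed byte-reversed relative to the raw digest.
    expected = bytes.fromhex(spec["result"])[::-1]
    start = time.perf_counter()
    ok = all(sha256d(header) == expected for _ in range(rounds))
    elapsed = time.perf_counter() - start
    rate = rounds / elapsed if elapsed > 0 else float("inf")
    return ok, rate


spec = json.loads(BENCH_BTC_JSON)
ok, rate = run_benchmark(spec)
print("result matches:", ok)
print(f"{rate:.0f} hashes/s")
```

A real harness would call each architecture-specific implementation in place of `sha256d`, so that a wrong result points directly at the code path that produced it.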

I can only refer to Intel's IPP. The overview.md file describes how it works, especially the "dispatcher". You won't need to use 11 different programs. Imagine how difficult it is for you to maintain and debug the code. That's why Intel writes a different file for each arch. The dispatcher decides what to use, depending on the CPU. https://github.com/intel/ipp-crypto/tree/develop/sources/ippcp/asm_intel64 This is the file I would need: "pcpsha256m7as.asm" => Optimized for processors with Intel® SSE3. I can't find SSSE3. If Pooler's asm is difficult to read, try something else.
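The dispatcher idea mentioned above, one program that picks an implementation at run time instead of shipping a separate executable per instruction set, might look like the sketch below. This is not how Intel IPP or cpuminer-opt is actually written; the function names and the table are hypothetical, and the "optimized" variants are stand-ins that simply call the generic code.

```python
import hashlib


def sha256_generic(data: bytes) -> bytes:
    """Portable fallback that runs on any CPU."""
    return hashlib.sha256(data).digest()


def sha256_ssse3(data: bytes) -> bytes:
    """Stand-in for an SSSE3-optimized routine (assumption: same result)."""
    return sha256_generic(data)


def sha256_avx2(data: bytes) -> bytes:
    """Stand-in for an AVX2-optimized routine (assumption: same result)."""
    return sha256_generic(data)


# Ordered from most to least specialized; the first usable entry wins.
IMPLEMENTATIONS = [
    ({"AVX2"}, sha256_avx2),
    ({"SSSE3"}, sha256_ssse3),
    (set(), sha256_generic),
]


def dispatch(cpu_features: set):
    """Return the best implementation whose requirements the CPU meets."""
    for required, impl in IMPLEMENTATIONS:
        if required <= cpu_features:
            return impl
    # Unreachable while the generic entry (no requirements) is in the table.
    raise RuntimeError("no usable implementation")


old_cpu = {"SSE", "SSE2", "SSE3", "SSSE3"}
print(dispatch(old_cpu).__name__)
```

The selection is done once at startup, so the per-hash cost is the same as calling the chosen routine directly.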

I was thinking of starting my own project from scratch. In GUI mode, not console, and only for Windows. Written in asm... including all algos... CPU, GPU, everything, but... I don't get this mining stuff... Geeks are writing the info!!!!! Damn geeks :) I'd rather read the phone book! :) You guys who understand this are WOW! I'm very impressed! :) Well, my project will be for myself, just for fun, just to see if it's possible, maybe next year, and only if I have time. This is what we devs do, right? Pushing the limits... and never getting paid :(

Correction! I have been running your program cpuminer-opt-3.17.1 since the afternoon and I made ONE coin: Emerald, with scrypt. My share is 0.0015295 EMD = 0.00000004 BTC. That is $0.0022464. Wow, I'm rich! 445 afternoons and I can make a dollar! Ummm, how many afternoons do I have till my retirement?..... Thank you for the coin!

[Screenshot: emerald-1]

By the way, how do I know what coin and block I mined? This makes me think to create a file to dump this info. One more thing on my todo list for my GUI program, next year...

JayDDee commented 2 years ago

I did some testing mining scryptn2 using the AVX build on an i5-2400, which only has AVX, and an i9-9940X, which has AVX512. Scryptn2 was the focus of my optimizations because it is CPU mineable.

The results followed the same pattern as #344. v3.19.1 is faster on the i9 but v3.17.1 is faster on the i5-2400. Same build on different CPUs had different results. I'll have to do more testing to see if it's related to function definition order.

scorpim commented 2 years ago

I can't compile it to debug, but I was checking around a bit with file diff, both offline and here online. You made a lot of changes.

In v3.19.1 you added this to the makefile: algo/scrypt/scrypt-core-4way.c \

A lot of changes... I tried to trace it but it's very difficult. I don't know what definitions like 2way and 4way mean, or the difference between sha256d, sha256t etc... I have to read about these, and that can take months. I just came back from PHP. A long time before that it was c/cpp/asm. It feels as if everything is new.... I need a good debugger, back to hacking...

Strange that i5 is faster with the old version? The i9 is not using something maybe?

Tip ? Intel's VTune Profiler https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler-download.html It includes oneAPI Base Toolkit (or the other way around) https://www.intel.com/content/www/us/en/developer/tools/oneapi/toolkits.html#base-kit

I had plans to use it somewhere in the future...

Off topic: if you are running the program and unplug the network, you get a 30 sec timeout, but the CPU keeps running at 100%. That's strange. I noticed this with Pooler's and with cpuminer-multi too.

I'm playing around, mining, testing and learning this, and I mined 4 coins. One of them was this: http://explorer.69coin.org Block 14796 - 5a210ddc093a9744524c296c5a2db0e7be8294b9a0f835fcc0aa2064105810b3

Now, this is the last block that was mined by me, at Thu, 25 Nov 2021 10:10:50 GMT. The program was still getting a "job" but there is no work. I let it run for 12 hours doing nothing. No work, but the CPU is running at 100%? The good thing is that I had a very nice temperature at home :)

What's going on? Want to test it? cpuminer-sse2 --algo=scrypt --url=stratum+tcp://scrypt.eu.mine.zpool.ca:3433 --user=YOURWALLET --pass=c=BTC,zap=ISN

You get paid in BTC (in a hundred years) while mining 69Coin (ISN). Yes, a hundred years: you get 0.00000010 BTC, and the min payout is 0.015. At one coin a day, that's 150,000 days. Your great-great-grandchildren will be grateful :) Cheers..
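The payout arithmetic above can be checked with a few lines. This is only a worked example using the figures quoted in the comment (0.00000010 BTC per day, a minimum payout of 0.015 BTC); the function name and the 365.25-day year are illustrative choices.

```python
from decimal import Decimal


def days_to_payout(per_day_btc: str, min_payout_btc: str) -> Decimal:
    """Days of mining needed to reach the pool's minimum payout."""
    return Decimal(min_payout_btc) / Decimal(per_day_btc)


days = days_to_payout("0.00000010", "0.015")
years = days / Decimal("365.25")
print(int(days), "days, about", round(years), "years")
```

So the 150,000 days quoted above is right, and it corresponds to a bit over four centuries rather than one.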

scorpim commented 2 years ago

You mentioned scryptn2, so I will let it run overnight. With my SSE2 I get 4.43 h/s. Is this OK? It sounds so low...

May I ask what S0 R0 B0 means after "Accepted"?

This is how it works for me.... Is this ok?

[Screenshot: scryptn2-Subsidium]

JayDDee commented 2 years ago

I tried changing the function definition order for scrypt as I did for sha256 in #344 but it didn't change anything.

I was also able to test on another Sandy Bridge CPU, a Xeon E5-1620, and there was no measurable difference between v3.19.1 and v3.17.1.

I don't see this issue moving forward. Trying to retune the code for these CPUs may have a negative effect on CPUs where the new code is faster. The Pooler code is always available in other miners.

JayDDee commented 2 years ago

Nothing further to add.