Closed psychocrypt closed 6 years ago
The code is highly optimized and assumes that the FPU rounding mode is set to FE_DOWNWARD, but you removed that line.
Did you test the code under Linux? The current version does not build. Maybe I missed the change, but could you please point to the line where the FPU is used?
I tested it on Linux before, including Linux on ARM, but not with the latest changes. FPU is used in the call to _mm_sqrt_sd(). Try to fix the rounding mode first and if it still fails self-test, I'll look into it.
Edit: ARM branch is here, but it's old code: https://github.com/SChernykh/xmr-stak-cpu/tree/armv8-a
note: https://github.com/SChernykh/xmr-stak-cpu/pull/2/files/d345e518394591912131ea6f774a2937ff43dc43#r213098599 is not working, the self_test is still failing
Ok, I'll look into it. It should work on Linux.
I've submitted the fix. I don't know why it failed for you, but it compiled fine and passed self-test on Ubuntu 18.04 with g++ 7.3.0.
Hmm, it is still failing. Do I need to change anything in config.txt?
Please keep in mind that you do not check the return code of the self-test. This means the benchmark is performed even if the hash is wrong. Did you check the output?
This repository is for fiddling with the code and benchmarking, don't expect production quality here. Yes, it doesn't check self test result, it just prints "passed/failed" to stdout.
Regarding your issue. I tried to build it from scratch using the code as it is in the repository right now. No problems, self-test passed (see below). Maybe you still have some modifications in the code that break it?
user@user-VirtualBox:~$ git clone https://github.com/SChernykh/xmr-stak-cpu/
Cloning into 'xmr-stak-cpu'...
remote: Counting objects: 1238, done.
remote: Compressing objects: 100% (51/51), done.
remote: Total 1238 (delta 75), reused 111 (delta 73), pack-reused 1114
Receiving objects: 100% (1238/1238), 598.38 KiB | 1.09 MiB/s, done.
Resolving deltas: 100% (744/744), done.
user@user-VirtualBox:~$ cd xmr-stak-cpu/
user@user-VirtualBox:~/xmr-stak-cpu$ cmake . -DMICROHTTPD_ENABLE=OFF -DOpenSSL_ENABLE=OFF -DHWLOC_ENABLE=OFF
-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /home/user/xmr-stak-cpu
user@user-VirtualBox:~/xmr-stak-cpu$ make
Scanning dependencies of target xmr-stak-c
[ 5%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/c_blake256.c.o
[ 11%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/c_groestl.c.o
[ 16%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/c_jh.c.o
[ 22%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/c_keccak.c.o
[ 27%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/c_skein.c.o
[ 33%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/soft_aes.c.o
[ 38%] Linking C static library libxmr-stak-c.a
[ 38%] Built target xmr-stak-c
Scanning dependencies of target xmr-stak-cpu
[ 44%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/cli-miner.cpp.o
[ 50%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/console.cpp.o
[ 55%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/executor.cpp.o
[ 61%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/httpd.cpp.o
[ 66%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/jconf.cpp.o
[ 72%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/jpsock.cpp.o
[ 77%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/minethd.cpp.o
[ 83%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/socket.cpp.o
[ 88%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/webdesign.cpp.o
[ 94%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/crypto/cryptonight_common.cpp.o
[100%] Linking CXX executable bin/xmr-stak-cpu
[100%] Built target xmr-stak-cpu
user@user-VirtualBox:~/xmr-stak-cpu$ bin/xmr-stak-cpu
[2018-08-27 22:43:28] : MEMORY ALLOC FAILED: mmap failed
[2018-08-27 22:43:29] : Cryptonight hash self-test passed.
[2018-08-27 22:43:29] : Running a 60 second benchmark...
[2018-08-27 22:43:29] : Starting single thread, affinity: 0.
[2018-08-27 22:43:29] : MEMORY ALLOC FAILED: mmap failed
^C
user@user-VirtualBox:~/xmr-stak-cpu$ g++ --version
g++ (Ubuntu 7.3.0-16ubuntu3) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
My directory was clean, but to be 100% sure I started from scratch, following the same steps as in your output. It is still failing.
git clone https://github.com/SChernykh/xmr-stak-cpu/ .
Cloning into '.'...
remote: Counting objects: 1238, done.
remote: Compressing objects: 100% (51/51), done.
remote: Total 1238 (delta 75), reused 111 (delta 73), pack-reused 1114
Receiving objects: 100% (1238/1238), 598.38 KiB | 524.00 KiB/s, done.
Resolving deltas: 100% (744/744), done.
Checking connectivity... done.
[psychocrypt@x] ~/workspace/xmr-pow8Test [master] % cmake . -DMICROHTTPD_ENABLE=OFF -DOpenSSL_ENABLE=OFF -DHWLOC_ENABLE=OFF
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /home/psychocrypt/workspace/xmr-pow8Test
[psychocrypt@x] ~/workspace/xmr-pow8Test [master] ?? % make -j
Scanning dependencies of target xmr-stak-c
[ 5%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/c_groestl.c.o
[ 11%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/soft_aes.c.o
[ 16%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/c_jh.c.o
[ 22%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/c_keccak.c.o
[ 27%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/c_skein.c.o
[ 33%] Building C object CMakeFiles/xmr-stak-c.dir/crypto/c_blake256.c.o
[ 38%] Linking C static library libxmr-stak-c.a
[ 38%] Built target xmr-stak-c
Scanning dependencies of target xmr-stak-cpu
[ 44%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/webdesign.cpp.o
[ 50%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/socket.cpp.o
[ 55%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/console.cpp.o
[ 61%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/crypto/cryptonight_common.cpp.o
[ 66%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/minethd.cpp.o
[ 72%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/executor.cpp.o
[ 77%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/jpsock.cpp.o
[ 83%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/cli-miner.cpp.o
[ 88%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/httpd.cpp.o
[ 94%] Building CXX object CMakeFiles/xmr-stak-cpu.dir/jconf.cpp.o
[100%] Linking CXX executable bin/xmr-stak-cpu
[100%] Built target xmr-stak-cpu
[psychocrypt@x] ~/workspace/xmr-pow8Test [master] ?? % bin/xmr-stak-cpu
[2018-08-27 22:53:15] : MEMORY ALLOC FAILED: mmap failed
[2018-08-27 22:53:15] : Cryptonight hash self-test (variant 2) failed.
[2018-08-27 22:53:15] : Running a 60 second benchmark...
[2018-08-27 22:53:15] : Starting single thread, affinity: 0.
[2018-08-27 22:53:15] : MEMORY ALLOC FAILED: mmap failed
^C
[psychocrypt@x] ~/workspace/xmr-pow8Test [master] ?? % g++ --version
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.6) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
So it's the compiler version that makes the difference then. I guess there is undefined behavior in the code somewhere. Can you try to build the binary without optimizations and check it?
I have now also tried -O0 and it is still failing. I will now switch to other systems where I can test more compiler versions.
GCC 5.3.0 and GCC 6.2.0 are failing (Ubuntu 14.04); GCC 7.2.0 is working.
I might have a clue: https://gcc.gnu.org/wiki/FloatingPointMath Using std::fesetround requires "-frounding-math" command line switch for g++. Maybe GCC 7+ is not so sensitive to it.
Yep, reproduced it with g++ 5.4.0. I'll try to figure out where exactly it fails.
Note: -frounding-math does not solve the issue.
The bug is definitely in sqrt calculations because when I replace them with the reference integer implementation from slow_hash.c, it works fine. I'll keep looking into why it fails.
Both SSE2 and regular floating point implementations from slow_hash.c work - this is good, there is no bug in Monero's pull request.
Found the issue: old gcc can't handle _addcarry_u64 and _subborrow_u64 properly. There is no undefined behavior in the code, it's a compiler bug.
This works:
if (x2 < sqrt_input) ++r;
//_addcarry_u64(_subborrow_u64(0, x2, sqrt_input, (unsigned long long int*)&x2), r, 0, (unsigned long long int*)&r);
This doesn't work:
//if (x2 < sqrt_input) ++r;
_addcarry_u64(_subborrow_u64(0, x2, sqrt_input, (unsigned long long int*)&x2), r, 0, (unsigned long long int*)&r);
And when it works, it works fine even without -frounding-math
Thx for the fast solution. I will try it tomorrow.
Submitted possible fix: https://github.com/SChernykh/xmr-stak-cpu/commit/66c2022f3f80081072a1a033341e6ea863935b2a - works fine for me.
Did it work for you? And can you also spend some time and review the reference variant 2 implementation (Monero pull request 4218)? We lack people who know how Cryptonight algorithm works and can review the code.
Thx, the fix is working. I am currently busy with the NVIDIA and AMD implementations, but I will try to spend some time on the review.
Did you look at https://github.com/SChernykh/xmr-stak-amd/ ? It has an OpenCL implementation, basically it can be taken for AMD cards without changes. It also has a version for NVIDIA cards inside, so it can run on NVIDIA too - good starting point for CUDA implementation.
Yes, I saw the AMD implementation, but I need to implement it as a CUDA kernel to get the full performance out of new-generation NVIDIA GPUs.
After I applied #2 to this repo, I can run the code under Linux. The problem is that the self-test for variant 2 (sqrt and int math) is not passing. Is the provided hash wrong?
Which functionality is the final Monero PoW using? Is variant 2 used?