albertobsd / keyhunt

privkey hunt for crypto currencies that use secp256k1 elliptic curve
MIT License

Mac ARM compilation #244

Open Alexxino opened 1 year ago

Alexxino commented 1 year ago

Hola Alberto,

It is still impossible to compile or use this awesome tool natively on the new Mac architecture. I have read that you are working on a compatible version. How is that going?

Using make legacy, I got an error at the end of the compilation:

[...]
sha3/keccak.c:131:24: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma GCC diagnostic pop
                       ^
32 warnings generated.
g++ -march=native -mtune=native -Wall -Wextra -Ofast -ftree-vectorize -c hashing.c -o hashing.o
clang: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated [-Wdeprecated]
hashing.c:1:10: fatal error: 'openssl/evp.h' file not found
#include <openssl/evp.h>
         ^~~~~~~~~~~~~~~
1 error generated.
make: *** [legacy] Error 1

I have tried installing openssl with brew and linking the binary, as I saw suggested on the internet, but nothing works. Do you have any solution?

Thanks!

bane77111 commented 1 year ago

The only thing that worked for me was to install UTM with Kali Linux and compile with legacy.

Alexxino commented 1 year ago

Yeah, virtualization and emulation are a "solution" for running the tool, but the performance is horrible. It's like driving a Ferrari with mini wheels.

@albertobsd If you tell me what I need to do to compile the Mac version, I can do it and provide it to the community.

eshbeata commented 1 year ago

@Alexxino use docker and docker-compose. Don't forget to give it more RAM in the Docker settings.

docker-compose.yaml

services:
  web:
    container_name: btc_puzzle
    build:
      context: ./
      dockerfile: Dockerfile
    working_dir: /app
    restart: unless-stopped
    volumes:
      - ./:/app
    ports:
      - "7862:7861"

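Dockerfile

# The FROM line was missing from the original post. A Debian/Ubuntu base image
# is assumed here because the steps below use apt-get.
FROM ubuntu:22.04

# Build tools and the libraries keyhunt legacy needs (OpenSSL and GMP headers)
RUN apt-get update -y
RUN apt-get upgrade -y
RUN apt-get install build-essential -y
RUN apt-get install libssl-dev -y
RUN apt-get install libgmp-dev -y
RUN apt-get install git -y
RUN apt-get install vim -y

# Clone into /keyhunt and build the legacy target
RUN git clone https://github.com/albertobsd/keyhunt
WORKDIR /keyhunt
RUN make legacy

WORKDIR /app
# Default command: keep the container alive so you can exec into it
ENTRYPOINT ["tail", "-f", "/dev/null"]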

Then open a shell inside the container from a terminal: docker exec -it btc_puzzle bash

Then go to the keyhunt directory: cd /keyhunt

Note: it works with make legacy, not with the regular make.

alicedb2 commented 1 year ago

I'm trying to do something similar. I have a MacBook Air M2. For now I'm able to compile keyhunt legacy running Kali inside UTM (i.e. QEMU) with virtualization, so it runs natively on ARM rather than emulating x86_64. It's really great and snappy. UTM is amazing: Kali boots to the desktop in seconds.

When compiling keyhunt legacy on native ARM, GCC resolves -march=native to -march=armv8-a+crc+lse+rcpc+rdma+dotprod+aes+sha3+fp16fml+sb+flagm+pauth. With it I get a good ~18 Tkeys/s on 1 thread.

┌──(alix㉿kali)-[~/keyhunt]
└─$ ./keyhunt_legacy -m bsgs -f tests/125.txt -b 125 -q -s 10 -B random -t 1
[+] Version 0.2.230519 Satoshi Quest (legacy), developed by AlbertoBSD
[+] Quiet thread output
[+] Stats output every 10 seconds
[+] Thread : 1
[+] Mode BSGS random
[+] Opening file tests/125.txt
[+] Added 1 points from file
[+] Bit Range 125
[+] -- from : 0x10000000000000000000000000000000
[+] -- to   : 0x20000000000000000000000000000000
[+] N = 0x100000000000
[+] Bloom filter for 4194304 elements : 14.38 MB
[+] Bloom filter for 131072 elements : 0.88 MB
[+] Bloom filter for 4096 elements : 0.88 MB
[+] Allocating 0.00 MB for 4096 bP Points
[+] processing 4194304/4194304 bP points : 100%     
[+] Making checkums .. ... done
[+] Sorting 4096 elements... Done!
[+] Total 6227633859723264 keys in 340 seconds: ~18 Tkeys/s (18316570175656 keys/s)

Not bad at all for running inside a VM, but I want more. I want keyhunt to use its optimized secp256k1 rather than the libgmp-based gmp256k1 that legacy uses.

The first issue is that the optimized secp256k1 relies on Intel SSE instructions. You can see that Int.cpp uses #include <emmintrin.h>, where the SSE2 intrinsics are defined, while legacy uses libgmp with #include <gmp.h>. You can't use Intel SSE instructions on ARM, but modern ARM processors have something similar for fast SIMD operations called NEON. SSE and NEON do similar things, but they are of course not equivalent.

There are a few Intel SSE -> ARM NEON intrinsics translators. I use sse2neon (https://github.com/DLTcollab/sse2neon), which is pretty much a drop-in replacement. I copied sse2neon.h into secp256k1/ and replaced #include <emmintrin.h> with #include "sse2neon.h". With sse2neon the non-legacy code compiles further but fails with several errors like these:

[...]
secp256k1/Int.h:223:32: error: ‘__builtin_ia32_addcarryx_u64’ was not declared in this scope
  223 | #define _addcarry_u64(a,b,c,d) __builtin_ia32_addcarryx_u64(a,b,c,(long long unsigned int*)d);
      |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
[...]
secp256k1/Int.h:222:33: error: ‘__builtin_ia32_sbb_u64’ was not declared in this scope; did you mean ‘__builtin_nansf64’?
  222 | #define _subborrow_u64(a,b,c,d) __builtin_ia32_sbb_u64(a,b,c,(long long unsigned int*)d);
      |                                 ^~~~~~~~~~~~~~~~~~~~~~
[...]

Those are two other intrinsics, defined in secp256k1/Int.h:

#define _subborrow_u64(a,b,c,d) __builtin_ia32_sbb_u64(a,b,c,(long long unsigned int*)d);
#define _addcarry_u64(a,b,c,d) __builtin_ia32_addcarryx_u64(a,b,c,(long long unsigned int*)d);

From what I understand, _subborrow_u64 and _addcarry_u64 are ADX intrinsics (whatever that is), and here they are mapped to x86-specific builtins on non-Windows machines. I can't seem to find any information anywhere about an equivalent for arm64/aarch64.
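For anyone picking this up: what the two intrinsics do is a 64-bit add with carry-in/carry-out and a 64-bit subtract with borrow-in/borrow-out, and that can be written in plain C++ without any x86 builtin. Below is a minimal sketch, not keyhunt's code. The names addcarry_u64_portable, subborrow_u64_portable and add256 are hypothetical, and the argument order (carry, a, b, out) is assumed to match the macros in secp256k1/Int.h.

```cpp
#include <cstdint>

// Hypothetical portable stand-in for _addcarry_u64 (name is an assumption).
// Computes a + b + carry_in, stores the low 64 bits in *out and returns the
// carry out (0 or 1). carry_in is expected to be 0 or 1.
static inline unsigned char addcarry_u64_portable(unsigned char carry_in,
                                                  std::uint64_t a,
                                                  std::uint64_t b,
                                                  std::uint64_t *out) {
    std::uint64_t t = a + b;                 // wraps modulo 2^64
    unsigned char c1 = t < a;                // wrapped => carry from a + b
    std::uint64_t s = t + carry_in;
    unsigned char c2 = s < t;                // wrapped => carry from adding carry_in
    *out = s;
    return static_cast<unsigned char>(c1 | c2);  // both cannot be set at once
}

// Hypothetical portable stand-in for _subborrow_u64 (name is an assumption).
// Computes a - b - borrow_in, stores the low 64 bits in *out and returns the
// borrow out (0 or 1). borrow_in is expected to be 0 or 1.
static inline unsigned char subborrow_u64_portable(unsigned char borrow_in,
                                                   std::uint64_t a,
                                                   std::uint64_t b,
                                                   std::uint64_t *out) {
    std::uint64_t t = a - b;                 // wraps modulo 2^64
    unsigned char b1 = a < b;                // borrow from a - b
    unsigned char b2 = t < borrow_in;        // borrow from subtracting borrow_in
    *out = t - borrow_in;
    return static_cast<unsigned char>(b1 | b2);
}

// Illustration only: add two 256-bit integers stored as four 64-bit limbs,
// least significant limb first, chaining the carry from limb to limb.
static inline unsigned char add256(const std::uint64_t a[4],
                                   const std::uint64_t b[4],
                                   std::uint64_t r[4]) {
    unsigned char carry = 0;
    for (int i = 0; i < 4; ++i) {
        carry = addcarry_u64_portable(carry, a[i], b[i], &r[i]);
    }
    return carry;
}
```

In principle the two macros in Int.h could be pointed at helpers like these when building for aarch64, but I have not checked that against the rest of the code, and compilers may or may not turn this into the native add-with-carry instructions, so treat it as a starting point only.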

eshbeata commented 1 year ago

Using a Mac and legacy keyhunt in Docker, with memory 256 and 512 threads (-t 512 -s 10 -q -k 256), I'm getting: Total 77543611702762799104 keys in 6640 seconds: ~11 Pkeys/s (11678254774512469 keys/s)

alicedb2 commented 1 year ago

That's roughly what I'm getting with 4 threads and -k 256.

./keyhunt_legacy -m bsgs -f tests/125.txt -b 125 -q -s 10 -B random -S -t 4 -k 256
[...]
[+] Total 2633937278942052352 keys in 240 seconds: ~10 Pkeys/s (10974738662258551 keys/s)

If I give UTM all 8 cores available on the M2 (4 performance / 4 efficiency) and use -t 8, then I get roughly 17 Pkeys/s.

For CPU-heavy tasks, specifying more threads than the number of cores is counterproductive due to context switching. All Apple M1 and M2 processors have (at least) 8 total cores, and you're saturating them with -t 512 anyway. How much do you get with -t 8?
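To illustrate the point about thread counts, here is a minimal sketch of capping a requested thread count at what the machine reports. This is not something keyhunt does as far as I know, and clamp_thread_count is a hypothetical helper name. Inside a VM or container, the reported value reflects the cores given to the guest, not the host.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper (not part of keyhunt): limit a requested thread count
// to the number of hardware threads the system reports.
inline unsigned clamp_thread_count(
        unsigned requested,
        unsigned hardware = std::thread::hardware_concurrency()) {
    if (hardware == 0) {
        hardware = 1;  // hardware_concurrency() may return 0 when it cannot tell
    }
    // Never go below one thread, never above the reported hardware threads.
    return std::max(1u, std::min(requested, hardware));
}
```

With 8 cores given to the VM, a request for 512 threads would come back as 8, and a request for 4 would stay at 4.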

bane77111 commented 11 months ago

Check issue #268