aczid / crypto1_bs

Bitsliced Crypto-1 brute-forcer

Segmentation fault on cracking with Mac M1 #47

Closed michto36 closed 6 months ago

michto36 commented 1 year ago

I am on macOS Ventura 13.1 on an M1 Mac, and I get a segmentation fault when the program begins to crack:

Found tag with uid 143944e5, collecting nonces for key B of block 4 (sector 1) using known key A a0a1a2a3a4a5 for block 0 (sector 0)
There is 1795 nonces in file 0x143944e5_004B.txt, appending
Collected 1927 nonces... leftover complexity 9348885905408 (~2^43.09) - press enter to start brute-force phase
Collected 1939 nonces... leftover complexity 9348885905408 (~2^43.09) - initializing brute-force phase...
Starting 8 threads to test 9348885905408 states using 64-way bitslicing
Cracking... 0.00%zsh: segmentation fault ./libnfc_crypto1_crack a0a1a2a3a4a5 0 A 4 B

I have built it with either -mcpu=apple-m1 or -mcpu=apple-a14 in the Makefile, but the result is the same: a segfault.

If you need more information, just tell me what to do and I will post results here.

aczid commented 1 year ago

Sorry to hear that. I don't have hardware like that to test on so I'm afraid there's not much I can do for you. If you can run the program in a debugger to see which instruction causes the segfault maybe I'd be able to provide a hint at what's happening. Have you given this other project a try? https://github.com/nfc-tools/mfoc-hardnested/

michto36 commented 1 year ago

No worries, it's okay! The macOS environment is pretty new to me, so it's not easy.

I ran libnfc_crypto1_crack in LLDB (a GDB-like debugger) and this is the output:

There is 1940 nonces in file 0x143944e5_004B.txt, appending
Collected 2072 nonces... leftover complexity 9348885905408 (~2^43.09) - press enter to start brute-force phase
Collected 2084 nonces... leftover complexity 9348885905408 (~2^43.09) - initializing brute-force phase...
Starting 8 threads to test 9348885905408 states using 64-way bitslicing
Cracking... 0.00%Process 1362 stopped

  • thread 5, stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
    frame #0: 0x0000000100007260 libnfc_crypto1_crack`crack_states_bitsliced + 1444
libnfc_crypto1_crack`crack_states_bitsliced:
-> 0x100007260 <+1444>: ldp x9, x8, [x8, #0x8]
   0x100007264 <+1448>: mov x23, #-0x1
   0x100007268 <+1452>: cmp x9, x8
   0x10000726c <+1456>: b.lo 0x10000728c ; <+1488>
Target 0: (libnfc_crypto1_crack) stopped.
(lldb)

I hope this helps someone! Personally, I'm not good enough to fix it myself.

As for the link you gave me, they are facing the same segfault issue. (Edit: it's not exactly the same issue. After compiling it on my Mac, with several modifications to some files to fix malloc.h compatibility issues on macOS, it works fine, even during the cracking phase! But I would really prefer to use only libnfc_crypto1_crack; unfortunately I am not good enough to fix the code myself.)

I have already installed your crypto1_bs on another Linux machine and it works, and the same goes for mfoc-hardnested; both work like a charm. I think the problem is related to the macOS environment and its memory configuration.

(I will keep investigating.)

michto36 commented 1 year ago

After recompiling the source with AddressSanitizer, I get this output:

==1755==ERROR: AddressSanitizer: dynamic-stack-buffer-overflow on address 0x00016faeab90 at pc 0x0001004335c4 bp 0x00016fae5d70 sp 0x00016fae5d68
WRITE of size 8 at 0x00016faeab90 thread T3
    #0 0x1004335c0 in crack_states_bitsliced crypto1_bs_crack.c:64
    #1 0x10042c19c in crack_states_thread libnfc_crypto1_crack.c:497
    #2 0x18135d068 in _pthread_start+0x90 (libsystem_pthread.dylib:arm64e+0x7068)
    #3 0x181357e28 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1e28)

Address 0x00016faeab90 is located in stack of thread T3
SUMMARY: AddressSanitizer: dynamic-stack-buffer-overflow crypto1_bs_crack.c:64 in crack_states_bitsliced
Shadow bytes around the buggy address:
  0x00702df7d520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00702df7d530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00702df7d540: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00702df7d55
Thread T3 created by T0 here:
    #0 0x100e9cc5c in wrap_pthread_create+0x54 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x38c5c)
    #1 0x10042d7bc in main libnfc_crypto1_crack.c:728
    #2 0x181033e4c  (<unknown module>)

==1755==ABORTING
zsh: abort ./libnfc_crypto1_crack a0a1a2a3a4a5 0 A 4 B

hope it can help you

aczid commented 1 year ago

Thanks for doing a bit more investigating, and very nice that you got the MFOC code to work on that machine. Are you going to send them a patch to fix those malloc issues you described?

I think I can see what's going wrong with this code. As you can see in the ASAN output, the error occurs somewhere around line 64 in crack_states_bitsliced: https://github.com/aczid/crypto1_bs/blob/master/crypto1_bs_crack.c#L64 The lstate_p pointer should be aligned correctly, but it looks like the current code just assumes the pointer returned by malloc will always be aligned: https://github.com/aczid/crypto1_bs/blob/master/crypto1_bs_crack.c#L53

It looks like the code by piwi is a bit smarter than that and takes care to ensure the memory is indeed aligned: https://github.com/nfc-tools/mfoc-hardnested/blob/master/src/hardnested/hardnested_bf_core_AVX512.c#L90 (I'm not sure which version of the implementation is actually used on the M1 so please check when you try to fix this. It could also be this one: https://github.com/nfc-tools/mfoc-hardnested/blob/master/src/hardnested/hardnested_bf_core_NOSIMD.c#L84)

Aligned vector loads and stores can fault on memory that is not correctly aligned, which could lead to a segfault like the one you've run into. I hope you can experiment with the code a bit more, and it would be great if you could suggest a patch that also works on non-ARM Apple machines.
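
A minimal sketch of the aligned-allocation idea discussed above, assuming a POSIX system (so it covers Linux and macOS, not the Windows code path). `alloc_aligned` and `is_aligned` are hypothetical helper names used only for illustration; they are not functions from crypto1_bs or mfoc-hardnested. `posix_memalign` is the standard POSIX call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose posix_memalign under strict -std= modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `size` bytes aligned to `alignment`.
 * `alignment` must be a power of two and a multiple of sizeof(void *).
 * Returns NULL on failure; release the result with free(). */
static void *alloc_aligned(size_t alignment, size_t size) {
    void *p = NULL;
    if (posix_memalign(&p, alignment, size) != 0) {
        return NULL;
    }
    return p;
}

/* Hypothetical helper: returns 1 if `p` is a multiple of `alignment`,
 * where `alignment` is a power of two. */
static int is_aligned(const void *p, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Calling `alloc_aligned(64, 1024)` and checking the result with `is_aligned(p, 64)` is one way to confirm whether a buffer meets the alignment a vector type expects, instead of relying on whatever plain malloc happens to return.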

michto36 commented 1 year ago

Thanks for your time. Unfortunately I'm not a C programmer, so I have no experience with memory management. I have tried several fixes, but it still terminates with the buffer overflow error. I can't really figure out where it goes wrong or why.

I hope someone will be able to try it and eventually fix it!

I remain available to run tests and give feedback if needed.

GSWXXN commented 1 year ago

Hello, I just wrote a cross-platform GUI for PN532, called NFCToolsGUI. If you happen to have the same device, you can try it out. If not, you can follow my compilation process to try compiling the HardNested part. I only used crypto1_bs to collect nonces, and then used the crypto1_bs for decryption because I thought it would be faster. This works well on my ARM MacBook :-)

danroc commented 6 months ago

I had the same problem and it seems that the problem comes from task[4]-task[3] not being a multiple of MAX_BITSLICES (file crypto1_bs_crack.c).

The following patch fixed it for me, I can open a PR if you are ok with it:

diff --git a/crypto1_bs_crack.c b/crypto1_bs_crack.c
index 176b50f..af99cf3 100644
--- a/crypto1_bs_crack.c
+++ b/crypto1_bs_crack.c
@@ -28,18 +28,20 @@ THE SOFTWARE.
 #endif
 #include "crypto1_bs_crack.h"

+#define DIV_ROUND_UP(x, y) (((x) + (y) - 1) / (y))
+
 inline uint64_t crack_states_bitsliced(uint32_t **task){
     // the idea to roll back the half-states before combining them was suggested/explained to me by bla
     // first we pre-bitslice all the even state bits and roll them back, then bitslice the odd bits and combine the two in the inner loop
     uint64_t key = -1;
 #ifdef EXACT_COUNT
     size_t bucket_states_tested = 0;
-    size_t bucket_size[(task[4]-task[3])/MAX_BITSLICES];
+    size_t bucket_size[DIV_ROUND_UP(task[4]-task[3], MAX_BITSLICES)];
 #else
     const size_t bucket_states_tested = (task[4]-task[3])*(task[2]-task[1]);
 #endif
     // bitslice all the even states
-    bitslice_t * restrict bitsliced_even_states[(task[4]-task[3])/MAX_BITSLICES];
+    bitslice_t * restrict bitsliced_even_states[DIV_ROUND_UP(task[4]-task[3], MAX_BITSLICES)];
     size_t bitsliced_blocks = 0;
     for(uint32_t const * restrict p_even = task[3]; p_even < task[4]; p_even+=MAX_BITSLICES){
 #ifdef __WIN32
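
For illustration, a small sketch of why the rounding matters. It assumes a step of 64, matching the "64-way bitslicing" line in the log, and uses hypothetical helper names (`slots_floor`, `slots_ceil`, `loop_iterations`) that are not part of crypto1_bs. The loop has the same shape as the `p_even` loop in the diff: it starts one block for every step, including a final partial one.

```c
#include <assert.h>
#include <stddef.h>

#define DIV_ROUND_UP(x, y) (((x) + (y) - 1) / (y))

/* Slots reserved by the original array sizing: truncating division. */
static size_t slots_floor(size_t n_states, size_t step) {
    assert(step != 0);
    return n_states / step;
}

/* Slots reserved after the patch: division rounded up. */
static size_t slots_ceil(size_t n_states, size_t step) {
    assert(step != 0);
    return DIV_ROUND_UP(n_states, step);
}

/* Blocks actually started by a loop shaped like
 * for (p = begin; p < end; p += step), where n_states = end - begin. */
static size_t loop_iterations(size_t n_states, size_t step) {
    assert(step != 0);
    size_t blocks = 0;
    for (size_t i = 0; i < n_states; i += step) {
        blocks++;
    }
    return blocks;
}
```

With 100 states and a step of 64, `slots_floor` reserves 1 slot while the loop starts 2 blocks, so the second pointer would be stored past the end of the array. That is consistent with the 8-byte stack write reported by AddressSanitizer. When the count is an exact multiple of the step, both divisions agree and nothing overflows.
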

aczid commented 6 months ago

Nice find! I had not considered that. Did that get it working on your mac? Yes, please make a PR for this.

danroc commented 6 months ago

Yes, I encountered the same error as @michto36, and I managed to resolve it using the patch provided above. I tested it on an M2 Pro.

I have submitted the pull request https://github.com/aczid/crypto1_bs/pull/49 with this fix.

iceman1001 commented 6 months ago

Hm... on some M1 / Apple machines the pm3 hardnested has issues too. We use the following division: [image]

So the fix here was to make sure it's a multiple of max_bitslices. Hm...

aczid commented 6 months ago

I think the +1 might save you here. But I still don't really understand what is going wrong. Please keep us posted.