key truncated - Githubissues

tbox1911 commented 8 years ago

Hi,

I was able to compile all three tools, but when I use "./solve_bs" the key is truncated and the first 2 bytes are zeroed.

./solve_bs 0xcafec0de.txt 0xcafec0de
Initializing BS crypto-1
Using 64-bit bitslices
Bitslicing rollback byte: 1f...
Bitslicing nonces...
Starting 6 threads to test 1418412964 states
Found key: **0000**61003333
Tested 3523606116 states

When I use ./solve_piwi_bs or ./solve_piwi, the key is recovered correctly.

Do you have any idea why?

Best regards

tbox1911 commented 8 years ago

Answering my own question:

In file "solve_bs.c", line 29,

change the line: printf("Found key: %012lx\n", key); to: printf("Found key: %012llx\n", key);

Hope it helps

iceman1001 commented 8 years ago

Yeah, I saw I missed that one in my previous commit. Fixed and added your -mmmx parameter in the latest PR.

aczid commented 8 years ago

Thanks for the report, @tbox1911 and thanks for addressing this, @iceman1001. Doesn't -mpopcnt imply SSE, which implies MMX? What required it?

iceman1001 commented 8 years ago

Isn't -mpopcnt only enabling the popcount functionality, while -mmmx, -msse, -msse2 and -msse3 enable those instruction sets as well? On my Ubuntu, compiling with -msse2 -msse -mmmx gives really fast solving for the provided file.

aczid commented 8 years ago

I don't see how it would matter over using -mpopcnt, but if it works it works. :) Maybe it should be -march=native?

iceman1001 commented 8 years ago

Try: gcc -Og -mpopcnt -mmmx -msse -msse2 -std=c99 solve_piwi_bs.c crypto1_bs.c crypto1_bs_crack.c -Icraptev1-v1.0 craptev1-v1.0/craptev1.c crapto1-v3.3/crapto1.c -o solve_piwi_bs -lpthread

vs gcc -Og -mpopcnt -std=c99 solve_piwi_bs.c crypto1_bs.c crypto1_bs_crack.c -Icraptev1-v1.0 craptev1-v1.0/craptev1.c crapto1-v3.3/crapto1.c -o solve_piwi_bs -lpthread

aczid commented 8 years ago

I measure no difference between these two on my i3 laptop. Also, -Og generates noticeably slower code than -Ofast / -O3.

aczid commented 8 years ago

Does -march=native give you the same performance as all those flags?

iceman1001 commented 8 years ago

Test:

params | res

Sample of compiler warning:
crypto1_bs.c: In function ‘crypto1_bs_bit’:
crypto1_bs.c:46:1: warning: MMX vector return without MMX enabled changes the ABI [-Wpsabi]
inline const bitslice_value_t crypto1_bs_bit(const bitslice_value_t input, const bool is_encrypted){

iceman1001 commented 8 years ago

Might it be that your dev-env sets some flags per default?

iceman1001 commented 8 years ago

You are right, -Og gives slower execution.

iceman1001 commented 8 years ago

http://pastebin.com/TQ0g2DVq

I compared your three solvers with different compilation settings. By the looks of it, the bitsliced solver based on blapost's implementation seems to be a bit faster. I didn't measure blapost's solver.

aczid commented 8 years ago

Ok, so we go with -O3 -mmmx -march=native ? Your measurement for -O3 -mpopcnt is way faster, that doesn't look right. To get a fair comparison you should let the tool run the entire keyspace (you can force that by setting a wrong UID).

iceman1001 commented 8 years ago

Aha, but I think we should still go for "-O3 -mpopcnt -march=native" instead.

aczid commented 8 years ago

All in all it doesn't seem to matter that much. What would be interesting is seeing the performance on a machine with AVX2.

iceman1001 commented 8 years ago

http://pastebin.com/ru3nY3TD Interesting to see the percentages of the threads, but your original settings plus -march=native are still faster in a test with the key search space forced to 100% by giving a wrong UID.

iceman1001 commented 8 years ago

@aczid I merged your patch into my fork. It needed some extra work for MinGW, but it looks like it compiles and works on both MinGW and Ubuntu 14.04 now.
There is still a minor bug with the reporting of time in the printouts.

aczid commented 8 years ago

Closing this issue, can open a new one about compiler flags at a later time if needed.

aczid / crypto1_bs

key truncated #4
