re-use this imp? - Githubissues

iceman1001 commented 8 years ago

Great work with this implementation.
I'm looking at it an wonder if this bitsliced imp of crypto1 can be used as the default imp instead of blapost's version?

osysltd commented 8 years ago

Totally agree, incredible work! And fully support Iceman's question :)

aczid commented 8 years ago

Hi @iceman1001 & @osysltd,

Thanks for your interest. That's fine by me, provided my part of the code - except for the .patch file - remains MIT licensed (which is compatible with the GPL license). However I think there is little benefit to using my implementation over bla's version unless you also plan to merge @pwpiwi's implementation of the new EV1 attack. As far as I know this is the only area where my implementation adds any significant benefit. Please correct me if I'm wrong about that, in which case I'm sure you have a good reason for wanting to include this and I'd like to know what it is. In the mean time I've removed the redundant copyright notices and cleaned up the patch file, please include the latest sources if you do.

iceman1001 commented 8 years ago

I've since long merged @pwpiwi's hardnested branch and I'm fine with getting a separate solver to run over the collected nonces. No direct need to merge yr BF but since you already provided with a .patch file.. I might go down that road. No, I was more interested in if your crypto1_bs is much faster then it could be used as for the nested attack or maybe for all mifare enc/dec parts.

iceman1001 commented 8 years ago

as a side note, I'm getting a bunch of warnings compiling on a ubuntu1404 system.

aczid commented 8 years ago

Oh I'm sorry, I couldn't find it in the @proxmark repository and thought that was the main distribution channel for bleeding edge. Feel free to use this how ever you like, and send those (fixed) warnings / errors as PRs or issues. (I have no issues building on Ubuntu 14.04.)

aczid commented 8 years ago

The crypto1_bs_crack.c/.h files are only used by the solve* binaries. The .patch file includes all that code, but it has been changed to match @pwpiwi's data structures and is GPL. The only files to add to a pm3 branch are the crypto1_bs.c/.h files.

aczid commented 8 years ago

@iceman1001: Bitslicing inherently favors parallel encryption performance over one-off encryptions, which is why I doubt that my implementation would be of any benefit over bla's implementation with its lookup tables etc, highly favorable to the one-off case. Really the only place my code matters is when running many encryption streams at once - as when brute-forcing.

iceman1001 commented 8 years ago

Well, when @pwpiwi decides to make a PR, I'm pretty sure it will be merged quite fast. The PM3 master is a stable branch. Meant to be reliable. If you want bleeding edge, look at @pwpiwi, @marshmellow42, @holiman and mine.

I do really like the bitslicing idea, there are other crypto-impl used in the pm3 code that would benefit from such ideas (like the iclass crypto, the hitag crypto)

When the warnings are fixed, I'll give it a try making a PR.
So addning crypto1_bs.c/.h might give some speed ups. cool. I just wrote it down on my TODO list.

I read your ATiny/PRESENT presentation, you do certainly know a lot about your crypto algos. finally, thank you for really great work.

aczid commented 8 years ago

Haha, thanks. Maybe I will release my Hitag implementations in the future. ;) I guess I should take a look at iClass then? :)

iceman1001 commented 8 years ago

looking bitslicing, it kind of reminds me of SAT solvers and how to formalize the sbox etc into CNF. Do have you, by some chance, gone down that road aswell?

Give me a shoutout when you decide to release the Hitag stuff ;)

aczid commented 8 years ago

@iceman1001 I do agree that the implementation of a bitsliced cracker and a specific instance of a generated SAT solver for that task are similar, but I learned that while SAT solvers will blindly try every possible combination, an implementation using some 'insider' logic that is probably hard to express in a normal SAT solver will have much better performance. The same gains can be made for other ciphers for which SAT solvers are the norm...

iceman1001 commented 8 years ago

The modern cryptominisat solver tries to make decisions, the reduce the whole search space by learning from previous choices. To put it in my words, it over my head, in how it works. Same goes for the STP (simple theorem prover).

I was just looking into this code yesterday, when I got news of your BF solver. Thats one other place where I think your imp also might work. https://github.com/iceman1001/mf_nonce_brute I'll have to look into how your crypto1_bs functions shall be called instead of the traditional crypto1 imp calls.

aczid commented 8 years ago

Yeah, that code looks like it could benefit.

iceman1001 commented 8 years ago

Very slow indeed.

iceman1001 commented 8 years ago

..and inside "hf mf nested" ... as found here > https://github.com/iceman1001/proxmark3/blob/master/client/mifarehost.c#L113 is where I also think your imp will increase the speed. What do you think?

aczid commented 8 years ago

I looked at it a bit more closely, and it looks to me like the lfsr_recovery functions use a divide-and-conquer strategy that is hard to translate to the bitsliced model. Basically, after only a few branches of the binary tree this algorithm traverses, you will be looking for only a few states in the bitsliced block. At that point the overhead becomes crippling, and I haven't yet managed to implement a model that overcomes this. My thinking is that something like a filter chain / pipeline model (where there are one or more threads at work for each 'depth' of the tree, passing it on to the next layer) might work. Overhead to synchronize between layers might still defeat the purpose, though.

iceman1001 commented 8 years ago

ok, then I don't have to think about it anymore. Is it the same for "/mf_nonce_brute.c" ? That one seems to be using rollback_byte mostly.

Got the patch into hardnested, doesn't like mingw very much. Issues with memalign.
The _aligned_malloc or _mingw_aligned_malloc doesn't seem to do the trick. Using the normal malloc makes your imp slow. Doubtfull its working.

aczid commented 8 years ago

Alignment of the data is a requirement to have the SSE operations work correctly. I'm not sure how to cope with this in mingw. I've tried putting the blocks on the stack, but it results in misaligned data.

aczid commented 8 years ago

I assume you also found this http://stackoverflow.com/questions/10862121/undefined-reference-to-posix-memalign-using-mingw32

aczid commented 8 years ago

mf_nonce_brute.c would still need a way to do the lfsr recovery in parallel right? I'm afraid there is little to gain otherwise.

iceman1001 commented 8 years ago

Yeah, that stackoverflow fails to mention that the _aligned_malloc / __mingw_aligned_malloc needs a libmingwex.a which comes from a package that I need to install. Almost there.

You are correct, it would be needed to do it in parallell. Ok, I'll drop those thoughts.

iceman1001 commented 8 years ago

Last comment about the mingw changes. The trick in aligned malloc was that the inparameter ar swapped compared with linux memalign call . That was the cause of my problems.

aczid / crypto1_bs

re-use this imp? #1