openwall / john

John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs
https://www.openwall.com/john/

dynamic formats with high/variable number of iterations #827

Open frank-dittrich opened 9 years ago

frank-dittrich commented 9 years ago

I'm sure this has been discussed in the past, but I can't find that discussion anywhere.

If a format has a very high but fixed iteration count, you could still create a dynamic format, even if it is somewhat cumbersome. But with a variable iteration count, you are lost. How hard would it be to implement some kind of variable iteration count? Let's say a new flag needs to be set, so that $$I1000 gets interpreted as an iteration count of 1000. Maybe you need some DO ... ENDDO logic, or even handling of more than one iteration count. In that case, I'd say: use the first $I for the first (or outer) loop, etc.
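To make the $$I idea concrete, here is a minimal sketch of how such a marker could be pulled out of a ciphertext string. The function name `parse_iteration_count` and the exact marker syntax are assumptions for illustration only; nothing like this exists in dyna today, and a real parser would also have to make sure the marker cannot be confused with salt contents.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of John the Ripper): look for a
 * "$$I<digits>" marker in a ciphertext string and return the iteration
 * count, or 0 when the marker is absent or malformed.
 */
unsigned long parse_iteration_count(const char *ciphertext)
{
    const char *marker;
    char *end;
    unsigned long count;

    assert(ciphertext != NULL);

    marker = strstr(ciphertext, "$$I");
    if (marker == NULL)
        return 0;
    marker += 3;                        /* skip "$$I" */
    if (*marker < '0' || *marker > '9')
        return 0;                       /* require at least one digit */

    errno = 0;
    count = strtoul(marker, &end, 10);
    if (errno == ERANGE)
        return 0;                       /* value does not fit */
    /* The count must be followed by the next '$' field or end of string. */
    if (*end != '\0' && *end != '$')
        return 0;
    return count;
}
```

With this sketch, a string ending in `$$I1000` yields 1000, and a string without the marker yields 0, which a caller could treat as "use the format's fixed default".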

Do I remember wrong? Where has this been discussed in the past? Should this be discussed on john-users for additional input?

jfoug commented 9 years ago

I have thought about this issue a LOT over the past few years. There are a couple of things needed to use this:

  1. control logic. Right now, dyna is simply an array of function pointers that all look the same: void func(void)
  2. for control logic, you will also need boolean logic.
  3. you will also need arithmetic logic (to make it usable)
  4. you will need variables. We have the extra field variables, which so far have been unused. These would be adequate. We can also use the $Cx fields for some things (like fixed arithmetic), so they would also need to be viewed as const variables. $C0 might be set to 1, so that $F0-=$C0 would simply be $F0--
  5. you have to have some method of seeding these variables on a per-hash basis (for variable items). Again, if we used the $Fx/$Cx vars, we have a way to do this.

I would also like to go beyond this, and allow these to also be used to convert to base-10, base-16, BASE-16, string, etc. I think that is already in code. The difference here is that these would now be treated as variables, so each time you dumped the string, it could contain a different value.

Yes, I do see this as a nice addition. I am not quite sure how high the ROI is for it, however. What formats are we talking about?

I see input encoding and buffer handling as the areas where we have more formats that dyna cannot touch without custom code. I think adding things like hook functions for input and buffer layout would be much more useful, but they are very hard on their own. They would make dyna able to handle these more complex hashes, BUT at the expense of someone having to develop hook functions specific to each hash's requirements.

jfoug commented 9 years ago

Ok, here is one way to look at this. I am throwing this out there to ask questions, but also to help spur my own thoughts.

// NOTE not tested ;)

//$ ./pass_gen.pl  'dynamic=41'
//dynamic_41 -->$h-sha1($s.$p)->$h=sha1($p.$h)^1023    (sap-H sha1)
static DYNAMIC_primitive_funcp _Funcs_41[] =
{
    //MGF_INPUT_20_BYTE
    //MGF_SALTED
    //MGF_FLAT_BUFFERS
    //MGF_FLD1
    DynamicFunc__clean_input,
    DynamicFunc__append_salt2,
    DynamicFunc__append_keys,
    DynamicFunc__clean_input2,
    DynamicFunc__append_keys2,
    DynamicFunc__SHA1_crypt_input2_append_input1,
    DynamicFunc__getvar1_fld1,
    DynamicFunc__getvar2_CONST1,
    DynamicFunc__startloop_var1_decrement_to_var2,
    DynamicFunc__SHA1_crypt_input1_overwrite_input1_offset_keylen,
    DynamicFunc__endloop,
    DynamicFunc__SHA1_crypt_input1_to_output1_FINAL,
    NULL
};
static struct fmt_tests _Preloads_41[] =
{
    // {"{x-issha, 1024}hmiyJ2a/Z+HRpjQ37Osz+rYax9UxMjM0NTY3ODkwYWI=","OpenWall"}
    {"$dynamic_41$8668b22766bf67e1d1a63437eceb33fab61ac7d5$$F11024$HEX$313233343536373839306162","OpenWall"},
    {NULL}
};
static DYNAMIC_Constants _Const_41[] =
{
    {1, "2"},
    {0, NULL}
};

The things that are not done:

DynamicFunc__SHA1_crypt_input1_overwrite_input1_offset_keylen: This would simply be a new dyna method. Nothing special about this one. It will simply perform the crypt, output the results at an offset equal to the length of the key, and not update any length values.

DynamicFunc__getvar1_fld0: This would be new. I would create 4 variables. They can be set from (or stored into?) the fld values. So there would be 40 getters and 40 putters: DynamicFunc__getvar1_fld0, DynamicFunc__getvar1_fld1, DynamicFunc__getvar2_fld0, ... and the same for the setters: DynamicFunc__putvar1_fld0, DynamicFunc__putvar1_fld1, up to DynamicFunc__putvar4_fld9. This would be done with #defines to build the functions with 'minimal' typing of code. The introduction of the putters may be more difficult. This would likely cause a lot of extra work to get it right with OMP. The getters would all be fine. They simply read from static data (tied to the salt) into the variables. Each thread can do that without affecting other threads.
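The "#defines to build the functions" part could look roughly like the sketch below. Everything here is an assumption for illustration: `salt_fields`, `set_salt_field`, and the plain static `var1`..`var4` are hypothetical stand-ins, and the fields are treated as already-parsed integers. A real OMP build would need per-thread copies of the variables, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_FIELDS 10   /* fld0 .. fld9 */

/* Hypothetical per-salt field values, already parsed to integers. */
static uint64_t salt_fields[NUM_FIELDS];

/* Hypothetical general-purpose variables (would be per-thread under OMP). */
static uint64_t var1, var2, var3, var4;

/* Hypothetical seeding step, done once per salt before the script runs. */
void set_salt_field(int idx, uint64_t value)
{
    assert(idx >= 0 && idx < NUM_FIELDS);
    salt_fields[idx] = value;
}

/* Generates: void DynamicFunc__getvar<V>_fld<F>(void) */
#define DEFINE_GETVAR(V, F)                     \
    void DynamicFunc__getvar##V##_fld##F(void)  \
    {                                           \
        var##V = salt_fields[F];                \
    }

/* All ten field getters for one variable. */
#define DEFINE_GETVARS_FOR(V)                   \
    DEFINE_GETVAR(V, 0) DEFINE_GETVAR(V, 1)     \
    DEFINE_GETVAR(V, 2) DEFINE_GETVAR(V, 3)     \
    DEFINE_GETVAR(V, 4) DEFINE_GETVAR(V, 5)     \
    DEFINE_GETVAR(V, 6) DEFINE_GETVAR(V, 7)     \
    DEFINE_GETVAR(V, 8) DEFINE_GETVAR(V, 9)

/* 4 variables x 10 fields = 40 getters, all with the void func(void) shape. */
DEFINE_GETVARS_FOR(1)
DEFINE_GETVARS_FOR(2)
DEFINE_GETVARS_FOR(3)
DEFINE_GETVARS_FOR(4)
```

Because every generated getter keeps the `void func(void)` signature, it can sit in the existing function-pointer array unchanged. The putters would follow the same macro pattern with the assignment reversed, but they write to shared salt data, which is where the OMP concerns come from.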

Vars could also be read from CONST's as in DynamicFunc__getvar2_CONST1

DynamicFunc__decr_var1: Provides some minimal arithmetic (inc, dec at least). Not shown in this example. I had them prior to adding the var2 and const, and the loop being '_to_var2'.

DynamicFunc__startloop_var1_decrement_to_var2: Builds looping constructs from the variables. What will happen here is that the code simply finds the endloop and executes the function calls in between, updating varX and comparing it to varY, ending when they are the same.

DynamicFunc__endloop: Marker for the end of the 'do-while' loop. NOTE this is a post-inc/dec loop, such as:

while (varX != varY) {
    run-instructions;     /* the calls between startloop and endloop */
    varX operation= 1;    /* inc or dec by 1 (or some constant) */
}
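
One way the startloop/endloop markers could be handled by the code that walks the function array is sketched below. This is an assumption-laden illustration, not dyna's actual dispatcher: `run_script`, `primitive_fn`, `fake_crypt`, and the global `var1`/`var2` are hypothetical names, nesting is not supported, and the loop only terminates if var1 starts at or above var2.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*primitive_fn)(void);

static long var1, var2;     /* hypothetical loop variables */
static long body_calls;     /* counts primitive calls, for demonstration */

/* Markers: the interpreter recognises them by address and never calls them. */
void startloop_var1_decrement_to_var2(void) {}
void endloop(void) {}

/* Stand-in for a real primitive such as a SHA1 crypt step. */
void fake_crypt(void) { body_calls++; }

/* Walk a NULL-terminated script, expanding one level of looping. */
void run_script(const primitive_fn *script)
{
    size_t i = 0;

    while (script[i] != NULL) {
        if (script[i] == startloop_var1_decrement_to_var2) {
            size_t first = i + 1, end = first, j;

            /* Find the end marker (no nesting in this sketch). */
            while (script[end] != NULL && script[end] != endloop)
                end++;
            assert(script[end] == endloop);     /* unterminated loop */

            /* Post-decrement loop: run the body, then step var1. */
            while (var1 != var2) {
                for (j = first; j < end; j++)
                    script[j]();
                var1--;
            }
            i = end + 1;
        } else {
            assert(script[i] != endloop);       /* endloop without start */
            script[i]();
            i++;
        }
    }
}
```

Running a script of one call, a loop around one call, and one final call with var1 = 1024 and var2 = 2 executes the body 1022 times inside the loop, for 1024 calls in total, which matches the shape of the dynamic_41 example above.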
magnumripper commented 9 years ago

@jfoug should we close this issue or keep it for upcoming compiler enhancements?

jfoug commented 9 years ago

Sorry, I was thinking of a different issue. Yes, let's keep this open. I am thinking of ways this can be done.

I already plan on adding 4 'general-purpose' variables to dyna. I plan on using these for things like this (counting), and also for storing temp computations, so that more complex expressions can be handled. A general-purpose var might be used as a string, a hash residue, a uint64_t, etc. Not all at one time, but it should be able to handle any of those 'types' of data. Once it is set, however, that is what it is to be used for, until the format is 'closed'. But this is still turning around in my head a bit on how best to do this, and how best to utilize it.
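A general-purpose variable that holds one 'type' at a time and stays bound to it until the format is closed could be modelled as a tagged union. The sketch below is only one possible reading of the idea; `struct gpvar`, the `gpvar_*` functions, and the 64-byte buffer sizes are all hypothetical choices, not existing dyna code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum gpvar_type { GPVAR_UNSET = 0, GPVAR_U64, GPVAR_STRING, GPVAR_HASH };

struct gpvar {
    enum gpvar_type type;
    union {
        uint64_t u64;
        char str[64];
        unsigned char hash[64];     /* room for up to a 512-bit digest */
    } v;
    size_t len;                     /* bytes used in the active member */
};

/* Returns 0 on success, -1 if the variable is bound to another type. */
int gpvar_set_u64(struct gpvar *var, uint64_t value)
{
    assert(var != NULL);
    if (var->type != GPVAR_UNSET && var->type != GPVAR_U64)
        return -1;
    var->type = GPVAR_U64;
    var->v.u64 = value;
    var->len = sizeof value;
    return 0;
}

/* Returns 0 on success, -1 on a type clash or if the string is too long. */
int gpvar_set_string(struct gpvar *var, const char *s)
{
    size_t n;

    assert(var != NULL && s != NULL);
    n = strlen(s);
    if (var->type != GPVAR_UNSET && var->type != GPVAR_STRING)
        return -1;
    if (n >= sizeof var->v.str)
        return -1;
    var->type = GPVAR_STRING;
    memcpy(var->v.str, s, n + 1);   /* include the terminating NUL */
    var->len = n;
    return 0;
}

/* Called when the format is 'closed': the variable may then be rebound. */
void gpvar_reset(struct gpvar *var)
{
    assert(var != NULL);
    memset(var, 0, sizeof *var);    /* GPVAR_UNSET is 0 */
}
```

The type check in each setter is what enforces "once it is set, that is what it is to be used for": a variable used as a counter rejects a string until `gpvar_reset` is called.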