RFC, 'usage 101' documentation

jfoug commented 5 years ago

We seem to get a lot of 'how do I brute force XYZ' ? Often we direct the user to use inc mode (which is great). But for really doing a brute test, the user really wants 'mask' mode.

Now, we could do this by still keeping the terminology of 'mask', but by proper documentation (usage, ./john output, etc), listing that mask really 'means' brute force.

Now, to 'us' john developers, the 'meaning' of mask mode is very clear. BUT to johnny-come-lately end user, mask mode is certainly not clear. The terminology 'brute force' seems to be well known, so end users are looking for how it is done in john. All that this 'issue' (just an RFC) really is, is just asking if developers have any interest at all in this, or see this as beneficial at all for bumping up the signal to noise ratio on places like john-users mailing lists.

The 2 areas where john really 'does' have a brute force are inc and mask. Within inc, it is highly (hopefully) weighted, so that the best of the best low-hanging-fruit passwords are tested first, netting a large percentage of cracks. The mask mode is more unordered brute force, BUT allows additional 'gained intelligence' to be used also (such as pre-filling known parts, running against a wordlist, similar to rules, etc). They each are certainly a method of brute force under the hood, but wrapped with business logic that hopefully makes them a much better brute forcer.

So really all we might need to do, is to build a 'Brute Force' section or something in one of the usage files, demonstrate how to brute force all lower case, all lower/digits, all lower/digits/wordcap, and a custom. Then even within that section, point them to the 'full blown' documentation on mask.
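As a rough illustration of the cases listed above, the sketch below builds the corresponding mask command lines in Python. It assumes the `?l`, `?d` and `?u` placeholders and the `-1=` custom-placeholder option behave as doc/MASK describes them; the helper names, the fixed length of 6 and the `hashes.txt` file name are made up for the example. 'wordcap' is left out because its exact meaning is not pinned down in this thread.

```python
from typing import List, Optional

# Hypothetical helpers: build john command lines for simple fixed-length masks.
# The placeholders and the -1= option follow doc/MASK as understood here;
# verify them against your own john build before relying on this.

def fixed_mask(placeholder: str, length: int) -> str:
    """Repeat one mask placeholder, e.g. ('?l', 3) -> '?l?l?l'."""
    return placeholder * length

def john_args(mask: str, custom1: Optional[str] = None) -> List[str]:
    """Return an argument list for john; 'hashes.txt' is a placeholder name."""
    args = ["./john", f"--mask={mask}"]
    if custom1 is not None:
        # -1= defines what the custom placeholder ?1 expands to
        args.append(f"-1={custom1}")
    args.append("hashes.txt")
    return args

all_lower = john_args(fixed_mask("?l", 6))                      # lower case only
lower_digits = john_args(fixed_mask("?1", 6), custom1="?l?d")   # lower + digits
custom = john_args(fixed_mask("?1", 6), custom1="?u?d")         # arbitrary custom set

for cmd in (all_lower, lower_digits, custom):
    print(" ".join(cmd))
```

The commands are only printed, not executed, so the sketch can be read or run without john installed.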

I would think this would help alleviate some of the repetitive mailing list replies, if nothing more than by telling a user to 'see the "Brute Force" section of the OPTIONS document' or something like that.

If others deem this to be a non-issue, I will not be Bu77-hurt at all if this issue simply gets rejected and closed. I know there is a feeling that pointing a new user 'in the right direction' is educational. But to me, it seems like too often a lot of time is spent handling this.

At a minimum, I can see benefit in these types of locations (if we keep the mask mode name, but simply add 'Brute Force' in a documentation manner):

./john
- usage screen. It should be a VERY short and sweet tiny change.
FAQ
- we should possibly update FAQ or generate a FAQ-jumbo.
README
- possibly, but since there is little help here, other than just ./john or ./john -w=xxx usage, possibly not.
OPTIONS
- almost certainly. But here, we would keep the --mask=???? syntax, and just list that this 'is' or can be used as the 'brute force' mode.
- The same type of comment 'could' also be placed in -inc, but there to explain that it's brute force where the permutation is not in 'human readable' order.
EXAMPLES
- almost certainly should be placed here in some manner, or at least wherever mask is listed, brute force should be talked about.

solardiz commented 5 years ago

I strongly oppose us adopting the words "brute force", except possibly in contexts clarifying that these words are confusing and what different things the person might have meant instead. Let's not rename mask mode.

solardiz commented 5 years ago

I think Jim's description actually demonstrates how misused and misunderstood "brute force" is. Jim says that it means specific things. I disagree. It doesn't mean anything that specific to me. It can be, and has been, said that running a wordlist against a hash is also brute-forcing it, because it's testing of candidate passwords rather than somehow magically reversing the hash through a cryptographic weakness. It can also be said that JtR doesn't have a pure brute-force mode, because none of its built-in modes search a full keyspace exhaustively in the most trivial order possible - they're all more advanced than that.

Because different people put different meanings into these words, I deliberately avoided using them in JtR so far. I don't want us to give up and start catering to a specific instance of the misunderstanding.

jfoug commented 5 years ago

Ok, I also somewhat oppose this (even though I posted it, lol).

We 'could' do this in an educational documentation way, pointing the user in the manner: "Ok, you know about brute force searching, here is how to do it much better" .... Then proceed talking about inc mode, what it is, why it is 'good'. Talking about mask mode, what it is, when it is great, etc. Talk about simple wordlist mode. Talk about wordlist+rules, etc.

Yes, I agree that the 'term' brute-force is really meaningless. Even simply trying u:admin pw:admin on a web site is a simple form of 'brute forcing'. It may brute force a real user id, OR even a login.

But I see many (especially @solardiz ) spending quite a bit of time babysitting users, typing pretty much the same things over and over again, trying to school them on proper, better usage of the tool, vs the 'brute force' mindset.

That was why I was thinking that IF we wanted to tackle this situation at all, we would simply make a small form of educational documentation that leads the user from the non-specific 'brute force' method to the quality methods which are part of john.

Again, if we really do not want to pursue this, it's not a big deal to me. @solardiz , yes I know you have purposefully kept the program and documentation 'brute-force' free. But does that mean that we do not use this to our advantage, in the manner of an 'Oh, you've heard about "BF", well let me tell you how to properly get the job done' type of document?

jfoug commented 5 years ago

Btw, @solardiz thanks a lot for chiming in. You seem to handle these requests more often than others, so you certainly have time investment to think about.

magnumripper commented 5 years ago

I've got the impression many users are reading how-to's and stuff that are like 10 years old. In particular, it seems users still abuse incremental mode for doing things mask mode does better and WAY easier.

This means:

- Users do read super old obsolete stuff they found with google (without noticing its age).
- Users do not care to read a single letter of what we put in the doc directory.

Which means if we want to do something about this, we should add run-time warnings when "detecting" obsolete/ineffective use cases. I'm not very fond of hashcat's many run-time standard warnings (nor the ones we do have) but OTOH as long as all such newbie stuff can be disabled with a single john.conf option, I'd be good (eg. having DisableHelpfulNotes = Y or something in my john-local.conf).

Above all I think this issue comes many many pages down our priority list. We've got bugs, portability issues and other known problems we should fix before "wasting" time on this.

magnumripper commented 5 years ago

> In particular, it seems users still abuse incremental mode for doing things mask mode does better and WAY easier.

...and next thing I read was https://www.openwall.com/lists/john-users/2019/02/04/1

jfoug commented 5 years ago

> Users do read super old obsolete stuff they found with google (without noticing its age).

Agreed, but you also have to actually 'look' at what is IN the doc directory (or wiki or whatever is being used).

- We have a lot of Readme.XXXX files which are wonderful. Up to date. Easy to follow along.
- But we also have docs which are pretty much core only (this is jumbo). IMHO, core 'specific' things should be named as such, like README-core if the README file was only core (NOTE: I did not look at it, so am not sure). If there are ones which are pure core, then they should have counterparts which are pure jumbo (in this example README-jumbo).
- We have a lot of doc files which are not up to date. They are missing items, bells/whistles, etc of newer features.

Well, stepping back a bit, I also see:

> we should add run-time warnings when "detecting" obsolete/ineffective use cases.

Complex, but damn! that might be a really GOOD thing. We would almost need to build a 'set' of DisableHelpfulNotes_Type_XXX, _XXY, ... then enable/disable each. Then also have a global, don't bother me with helpful hints flag.

There are several apps which have some of these types of warnings. Then you click a button (we could use a stdin prompt) and set some flag and never warn the user again. So when the next version comes out and there are 3 'new' flags due to better detection logic of 'questionable-poor' usage, the user would get a warning (possibly only 1 time) when they did that behavior (again, assuming that DisableALLHelpfulWarnings = Y was not set).
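The flag scheme described above could look roughly like the following Python sketch. None of these option names exist in john.conf today: `DisableAllHelpfulNotes`, the `DisableHelpfulNotes_<ID>` pattern and the hint IDs are hypothetical, and the in-memory set merely stands in for whatever persistent "already warned" storage a real implementation would need.

```python
# Hypothetical sketch of per-hint and global "don't bother me" switches.
# Option names and hint IDs are invented for illustration only.

def should_show_hint(config: dict, hint_id: str, already_shown: set) -> bool:
    """Decide whether a helpful note should be printed.

    config        -- mapping of option name to "Y"/"N" (stand-in for john.conf)
    hint_id       -- identifier of the hint type, e.g. "IncForMask"
    already_shown -- set of hint IDs that were shown before; updated in place
    """
    # Global switch silences everything.
    if config.get("DisableAllHelpfulNotes", "N") == "Y":
        return False
    # Per-type switch silences just this hint.
    if config.get(f"DisableHelpfulNotes_{hint_id}", "N") == "Y":
        return False
    # Show each hint at most once.
    if hint_id in already_shown:
        return False
    already_shown.add(hint_id)
    return True

config = {"DisableHelpfulNotes_IncForMask": "Y"}
shown = set()
print(should_show_hint(config, "IncForMask", shown))    # silenced by its own flag
print(should_show_hint(config, "HugeKeyspace", shown))  # first occurrence
print(should_show_hint(config, "HugeKeyspace", shown))  # repeat occurrence
```

Keeping the decision in one function means a newly added hint type needs no extra plumbing: it is on by default until the user sets its flag or the global one.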

> ...and next thing I read was https://www.openwall.com/lists/john-users/2019/02/04/1

That was actually the message which triggered this RFC. There have probably been 1 to 4 a month forever. Each n00b query is slightly different from the others, but pretty much ALL are 'hey, I don't know crud about john, but if I could ONLY get it to properly brute force for me, then I could crack everything out there'. Well, they are not quite 'that' blindly PEBKAC, but not all that far from it. That is why I was thinking of adding 'click bait', calling it 'how2 brute force', then within there, pointing them in the REAL right direction. No, it is not going to tell them to simply run -mask=?b -min-length=8 --max-length=16, but will instead be quick 'information' on how to efficiently search using john. That 'real' brute force command may be used as an example of how NOT to use the tool, along with some timing expectations (i.e. millions of years, even for raw MD5 on the world's fastest GPU). But again, using something similar to click-bait for the currently ignorant user.
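The timing expectation can be made concrete with a little arithmetic. The sketch below assumes that `?b` covers 255 byte values (0x01-0xff), as doc/MASK describes it, and uses a purely hypothetical rate of 1e11 candidates per second; the real figure depends on the hash type and hardware.

```python
# Rough estimate of how long an exhaustive ?b search over lengths 8..16 takes.
# CHARSET_SIZE and ASSUMED_RATE are assumptions, not measured values.

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def keyspace(charset_size: int, min_len: int, max_len: int) -> int:
    """Number of candidates of every length from min_len to max_len."""
    return sum(charset_size ** n for n in range(min_len, max_len + 1))

def years_to_exhaust(candidates: int, rate_per_second: float) -> float:
    """Years needed to try every candidate at a constant rate."""
    return candidates / rate_per_second / SECONDS_PER_YEAR

CHARSET_SIZE = 255   # assumed size of ?b
ASSUMED_RATE = 1e11  # hypothetical candidates per second

total = keyspace(CHARSET_SIZE, 8, 16)
print(f"candidates: {total:.3e}")
print(f"years at assumed rate: {years_to_exhaust(total, ASSUMED_RATE):.3e}")
```

A similar check inside john could feed a run-time note when the estimate is clearly out of reach, though how john would obtain a reliable rate up front is left open here.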

I do (VERY MUCH), like the helpful hints idea, BUT that is a large task, and is likely fraught with problems for quite a while.

And to both of you (hopefully @frank-dittrich and/or @kholia will also chime in), thanks for playing along. Issues like this, even though boring as he11, do (hopefully) help keep people looking at code, and NOT having to spend as much time with n00b questions. Even if the frequency of the questions does not significantly change, would it not be easier to simply say: 'We have a document which shows you how to effectively find passwords using john. That file is found ..... and named ..... Please give that file a read; then, if you still need more help getting specific targeted attack patterns generated, please follow up on this thread.' Using something very short and simple like that (which is almost an auto-reply), then directing them to a 'current' document which does give a 'john-usage-101' type overview, even if it has a stupid name like "BRUTEFORCING" or some other click-bait like that, and then later answering MUCH less ignorant questions because the user has now had their eyes opened a bit, to me seems like a much better thing than spending lots of time initially trying to 'teach' them HOW / WHY on things like inc/mask.

I really do think a large percentage of the users would like to figure out how best to use the tools. Sure, there is always the person who simply 'forgot' the password on a compressed disk container. They do not care about john, or password cracking. They simply googled 'password break MY DISK TYPE' and one of the first things they found was john, so now they simply want it to 'work' for them. But even for them, we can tell them 'you really need to read this file, and then if you have questions, please ask', and let them spend 10 minutes reading. At least then they may ask a 'smarter' question, such as: "my encr-disk password was '!@#415WestDumontDr' with some symbols and numbers appended to that (I think 4 or 5 of them)". But prior to having their eyes opened about just 'what' is needed/useful in cracking with john, they would have likely just asked things like 'I lost my password. How can I get john to give it to me?' If they ask a question such as the 415WestDumontDr one, we can easily help them with the 'syntax' of getting a quick-running -mask run, and/or syntax on how to convert their encr-disk blob into a john input hash. THOSE are the real questions they have. They simply do not know enough to be able to ask them.

jfoug commented 5 years ago

I changed the topic title, away from bruteforce. The email @solardiz just replied to, which @magnumripper linked in this thread was where the title 'came' from. But really this is more of a how do we help total n00b users become n00b++ where at least the questions they ask have some meat to them.

I do think the topic title still covers the 'internal' questionable usage messages. If we can easily determine that the single mode script is going to take 1000's or millions of years, then we may want to warn the user of that fact, AND point them to read the 'usage 101' or FAQ, or whatever we call it (Yes I know there is a FAQ, but it needs work).

magnumripper commented 5 years ago

BTW this is a huge problem that has bugged me for a while:

$ ls ../doc
AddressSanitizer-HOWTO.txt      LICENSE.mpi             README.Tezos
Auditing-Kerio-Connect.md       MARKOV                  README.apex
Auditing-Openfire.md            MASK                    README.bash-completion
AxCrypt-Auditing-HOWTO.md       MODES                   README.bitcoin
BUGS                            NETNTLM_README          README.coding-style
CHANGES                         OFFICE                  README.cprepair
CHANGES-jumbo                   OPTIONS                 README.format-epi
CHANGES-jumbo.git               PRINCE                  README.gpg
CONFIG                          README                  README.ios7
CONTACT                         README-CUDA             README.keychain
COPYING                         README-DISTROS          README.keyring
CRAM-MD5.txt                    README-MIC              README.keystore
CREDITS                         README-OPENCL           README.kwallet
CREDITS-jumbo                   README-PDF              README.librexgen
DYNAMIC                         README-PST              README.mozilla
DYNAMIC_COMPILER_FORMATS.md     README-TACACS+          README.mpi
DYNAMIC_EXPRESSIONS             README-ZIP              README.pwsafe
DYNAMIC_SCRIPTING               README-ZTEX             README.ssh
DiskCryptor-HOWTO.md            README-krb5-18-23       RULES
ENCODINGS                       README.7z2john.md       RULES-hashcat
EXAMPLES                        README.Apple_DMG        Regen-Lost-Salts.txt
EXTERNAL                        README.BitLocker        SIPcrack-LICENSE
FAQ                             README.Ethereum         SUBSETS
HACKING.md                      README.FileVault2       SecureMode-tutorial.md
HDAA_README                     README.FreeBSD          dynamic_history.txt
INSTALL                         README.IBM_AS400        john-1.7.9-jumbo-7-licensing-stats.txt
INSTALL-FEDORA                  README.LUKS             john-1.7.9-jumbo-7-licensing.txt
INSTALL-UBUNTU                  README.LotusNotes       libFuzzer-HOWTO.txt
Kerberos-Auditing-HOWTO.md      README.MinGW            pass_gen.Manifest
LICENSE                         README.RACF             pcap2john.readme

That's just WAY too many files. You won't even spot that FAQ one. We should keep doc as tidy as possible and either move all format-specific files to a doc/formats subdirectory, or merge them into one single file (most of them are really really short!).

Another problem: although I despise unneeded file name extensions, having .txt on text files is a good idea, especially for GUI users (i.e. almost everyone on earth except perhaps us three 🤣 ).

BTW yet another possibility could be to have doc look like this:

$ ls -F doc
FAQ.txt     files/      index.htm

That would require some effort, but not THAT much. With most docs in html format, we'd move a few of the GNU-style INSTALL, LICENSE and whatever ones to the base directory (still in plain text ASCII).

What we really need is more people. We could use someone who doesn't know squat about coding but could take on this task. We could use someone who loves testing stuff and annoying us with bug reports (well, we do have @frank-dittrich for that already 😄 ). And above all we need more coders. Sometimes I wonder if it's my fault there are so few in our community 🙊

kholia commented 5 years ago

I have been thinking about consolidating our documentation for a while now.

I will read up on how other projects (Linux kernel?) do it.

I don't have any experience with this stuff so far.

magnumripper commented 5 years ago

BTW an alternative to using html is formatting all files as Markdown .md (a few already are like this). A great benefit with that is that they are still very readable without a proper markdown viewer.

https://github.com/magnumripper/JohnTheRipper/blob/bleeding-jumbo/doc/Auditing-Kerio-Connect.md

vs.

https://raw.githubusercontent.com/magnumripper/JohnTheRipper/bleeding-jumbo/doc/Auditing-Kerio-Connect.md

A downside is they can't link to other files canonically (or can they?)

kholia commented 5 years ago

Markdown files can link to each other pretty easily.

openwall / john

RFC, 'usage 101' documentation #3634