Open magnumripper opened 4 years ago
@magnumripper I don't fully understand FMT_HUGE_INPUT. I notice we have it set where it's probably not needed - e.g., on DiskCryptor formats, where the "non-hashes" are of a fixed size of a little over 4 KB. Not being familiar with this flag, I didn't remove it, but perhaps you should.
We're using FMT_HUGE_INPUT for any format potentially having ciphertexts longer than LINE_BUFFER_SIZE, and when we added it we reduced the latter macro to the original 0x400.
This issue is about formats like zip, rar and 7z though. They can have truly huge ciphertexts, several gigabytes. We're never going to use the full db->(...)->source anyway (the pot entry will be truncated and a hash of the full ciphertext will then be appended, for a total way below LINE_BUFFER_SIZE) so it's very wasteful to store it in the db.
Actually, I can't recall all the details - maybe we already do things right.
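To make the truncation described above concrete, here is a minimal Python sketch of the idea, not the actual John the Ripper code. Only the 0x400 value of LINE_BUFFER_SIZE comes from this thread; the number of characters kept, the marker string, and the choice of SHA-256 as the digest are all assumptions made for illustration.

```python
import hashlib

LINE_BUFFER_SIZE = 0x400   # value quoted in this thread
KEEP_CHARS = 200           # hypothetical: how much of the original stays readable
MARKER = "$TRUNCATED$"     # hypothetical tag, not necessarily what john.pot contains


def pot_source(ciphertext: str) -> str:
    """Return short ciphertexts unchanged; shorten long ones to prefix + digest."""
    if len(ciphertext) < LINE_BUFFER_SIZE:
        return ciphertext
    # The digest covers the FULL ciphertext, so two inputs that share a
    # prefix still produce different pot entries. SHA-256 is an assumption.
    digest = hashlib.sha256(ciphertext.encode("ascii")).hexdigest()
    return ciphertext[:KEEP_CHARS] + MARKER + digest


short = "$example$" + "ab" * 50
huge = "$example$" + "cd" * 1_000_000
print(len(pot_source(short)), len(pot_source(huge)))
```

Whatever the real constants are, the point is the same: the stored entry has a small, fixed upper bound regardless of how large the archive-derived ciphertext is, which is why keeping the full source around buys nothing.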
Maybe we shouldn't make LINE_BUFFER_SIZE as low as 0x400, but can allow e.g. 10x more than that, so that formats like DiskCryptor wouldn't need FMT_HUGE_INPUT and would store the full "non-hashes" (in this case, a little over 4 KB, and there would be very few of those)? Would this perhaps be more convenient?
(In fact, DiskCryptor in particular currently decrypts only the first 96 bytes, so 192 hex characters. But the Python script extracts 2 KiB just in case, and hashcat now requires exactly that size, so we'd better not change this.)
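The sizes mentioned above follow from hex encoding using two characters per byte. A small Python sketch of the arithmetic, assuming only the figures quoted in this thread (96 bytes decrypted, 2 KiB extracted, LINE_BUFFER_SIZE of 0x400, and the suggested 10x bump); the helper name is made up for illustration:

```python
LINE_BUFFER_SIZE = 0x400  # current value quoted in this thread


def hex_chars(n_bytes: int) -> int:
    """Number of hex characters needed to encode n_bytes bytes."""
    return 2 * n_bytes


decrypted = hex_chars(96)         # the part DiskCryptor cracking actually uses
extracted = hex_chars(2 * 1024)   # what the extraction script emits

print(decrypted, extracted)
# Does the extracted data alone fit the current limit, or the suggested 10x one?
print(extracted < LINE_BUFFER_SIZE, extracted < 10 * LINE_BUFFER_SIZE)
```

The extracted data alone is 4096 hex characters, so with the format's prefix the line comes to "a little over 4 KB": too long for the current limit, but well within a limit ten times larger.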
We could set it to anything we want, but why bump it? The only magic in the format is the very FMT_HUGE_INPUT flag, the rest is core stuff. I hate it when I tail john.pot and get a wall of hex scrolling by...
I was thinking that for only a few non-hashes of a few KB each, it's convenient to be able to match them against john.pot lines manually if someone wants to - but from what you say, this isn't a universally shared preference.
I agree with your view as well, but I find the current limit of 1024 a pretty balanced compromise: longer hashes can usually still be matched visually/manually by looking at the first couple of hundred characters or so that are still there before the entry is truncated and a hash is appended.
As soon as we bump it to 4K we'll end up with someone wanting this for some other non-hash needing 8K - and soon we're back to the crazy size we had before FMT_HUGE_INPUT.
FMT_HUGE_INPUT should be used with non-default fmt_source() so we don't end up with the full ciphertext kept in memory for no use. I can't remember if we took care of that some other way, so need to review.