mufeedvh / pdfrip

A multi-threaded PDF password cracking utility equipped with commonly encountered password format builders and dictionary attacks.
MIT License
589 stars, 67 forks

Wordlist line is not valid UTF-8 handling. #8

Closed zasekle closed 11 months ago

zasekle commented 12 months ago

Previously, the application would fail if a wordlist contained anything that was not valid UTF-8. It now processes every line that is valid UTF-8 and skips the rest. When it completes, it prints a warning telling the user how many lines were skipped.
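For illustration, the skip-and-count behaviour described here could look roughly like the sketch below. It is not the code from this PR: the function name `load_utf8_passwords` is hypothetical, and it assumes the wordlist is newline-delimited with no special handling for `\r\n` line endings.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: reads a newline-delimited wordlist, keeps the lines
/// that are valid UTF-8, and counts the lines that are not.
fn load_utf8_passwords<R: BufRead>(reader: R) -> io::Result<(Vec<String>, usize)> {
    let mut passwords = Vec::new();
    let mut skipped = 0usize;

    // `split` yields each line as raw bytes, without the trailing b'\n'.
    for line in reader.split(b'\n') {
        let bytes = line?;
        match String::from_utf8(bytes) {
            Ok(password) => passwords.push(password),
            // Not valid UTF-8: skip the line but remember that we did.
            Err(_) => skipped += 1,
        }
    }

    Ok((passwords, skipped))
}
```

The caller would print the warning once at the end if the returned count is non-zero.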

Pommaq commented 11 months ago

Some users might want to use invalid UTF-8 (read: "random" bytes) as passwords, so I think it would be better to avoid reading Strings from files altogether. That would require some rework of how we print correct passwords in engine, since we currently attempt to present them as strings. Perhaps we could, taking inspiration from this code, attempt to decode the password once it succeeds and fall back to displaying it in hexadecimal? This would have the added benefit of handling any future generators that output invalid UTF-8 as well.
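A minimal sketch of the decode-then-fall-back idea, assuming the engine holds the successful password as raw bytes. The function name `display_password` and the `0x` prefix on the hex form are illustrative choices, not part of pdfrip.

```rust
/// Hypothetical helper: renders a password for display. Valid UTF-8 is shown
/// as text; anything else is shown as lowercase hexadecimal.
fn display_password(password: &[u8]) -> String {
    match std::str::from_utf8(password) {
        Ok(text) => text.to_owned(),
        Err(_) => {
            // Two hex digits per byte, so the output is unambiguous.
            let hex: String = password.iter().map(|b| format!("{:02x}", b)).collect();
            format!("0x{}", hex)
        }
    }
}
```

Decoding only at display time means the cracking path never needs to care whether a candidate is text or arbitrary bytes.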

zasekle commented 11 months ago

That sounds like a better solution than simply skipping the lines that are not UTF-8 encoded. I changed the wordlist reader from the `read_line` function, which reads into a String, to the `read_until` function, which reads into a `Vec<u8>`. If the cracked password is not valid UTF-8, it is now printed to the user as hex.
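As a rough sketch of the byte-oriented reading described here (not the actual diff), `BufRead::read_until` can collect each wordlist line as a `Vec<u8>` without ever validating UTF-8. The function name `read_password_bytes` is hypothetical, and the sketch assumes `\n` is the only line terminator.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: reads every line of a wordlist as raw bytes.
fn read_password_bytes<R: BufRead>(mut reader: R) -> io::Result<Vec<Vec<u8>>> {
    let mut passwords = Vec::new();

    loop {
        let mut line = Vec::new();
        // `read_until` appends bytes up to and including the delimiter,
        // and returns 0 once the end of the input is reached.
        if reader.read_until(b'\n', &mut line)? == 0 {
            break;
        }
        // Drop the trailing newline; the last line may not have one.
        if line.last() == Some(&b'\n') {
            line.pop();
        }
        passwords.push(line);
    }

    Ok(passwords)
}
```

Because no decoding happens while reading, lines containing arbitrary bytes are kept as candidates instead of being skipped or causing an error.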

mufeedvh commented 11 months ago

LGTM, merging this right away! :raised_hands: :heart: