LeweLotki / rusty-ripper

0 stars 0 forks source link

review requested by Michał P. #1

Closed micouy closed 2 days ago

micouy commented 4 days ago

you could add #[derive(Debug)] above every struct definition to make it printable for free

https://doc.rust-lang.org/stable/std/fmt/trait.Debug.html#examples


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/hasher.rs#L14 use enum


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/hasher.rs#L52

you could return slices to avoid cloning:

fn hash_tokens_in_parallel<'a, H>(&self, tokens: &'a Vec<String>) -> (Vec<&'a str>, Vec<String>)

if you want to convert Vec<&str> to Vec<String> I think you can use v.to_owned()

also this method could be a standalone function, it's not even public


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/hasher.rs#L27

in your design after I construct a Hasher, I need to remember to call load_hashes. I don't see a good reason for it - a Hasher without loaded hashes is not usable anyways. why not load hashes in the constructor?

if you feel like a constructor should return instantenously, maybe remove default hashes and tokens from the constructor and create a function load_hashes(dictionary, hash_function) -> Hasher?


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/hasher.rs#L66

you could use a custom Display implementation instead


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/hasher.rs#L84

I think this should be a method on the HashFunction enum


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/dictionary.rs#L21

again - why not load content in the constructor? why introduce the possible invalid state of an empty dictionary?

also, ideally you should use P: AsRef<Path> instead of String. see https://doc.rust-lang.org/stable/std/fs/struct.File.html#method.open


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/dictionary.rs#L29

https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/dictionary.rs#L33

this should all happen inside the constructor or a function which returns Result<Dictionary, YourCustomErrorType>. see Error


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/dictionary.rs#L37

again, maybe you could use Display?


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/dictionary.rs#L45

the ContentManager trait looks weird to me. I don't get the use case - why not just have a constructor and a Display implementation for each object?


analogous comments to passwords.rs


typo in "retriver", should be "retriever"


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/retriver.rs#L15

unnecessary clone: it seems like you wanted to take &Hasher and &Passwords as arguments and clone their fields. however, you're taking ownership over them, so you can't reuse the hasher and the passwords structs in the calling code:


let hasher = Hasher::new(...);
let passwords = Passwords::new(...);

let retriever = Retriever::new(hasher, passwords);

println!("{:?}", hasher); // Error - `hasher` has been moved into `Retriever::new`.
println!("{:?}", passwords); // Error - `passwords` has been moved into `Retriever::new`.

if you really want to take ownership over the arguments, you don't have to clone those fields - you can just assign:

Self {
  tokens: hasher.tokens,
  hashes: hasher.hashes,
  logins: passwords.logins,
  passwords: passwords.passwords,
}

...or - and I think it would be the best solution, unless it somehow limits how Retriever is used in your code - you could take references and store references:

pub fn new<'a>(hasher: &'a Hasher, passwords: &'a Passwords) -> Retriever<'a> {
    Self {
        tokens: &hasher.tokens,
        hashes: &hasher.hashes,
        logins: &passwords.logins,
        passwords: &passwords.passwords,
    }
}

https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/modes/retriver.rs#L24

I think having a run method is an anti pattern - why do you need the "object"? why not just have a function retrieve_logins(hasher: &Hasher, passwords: &Passwords) -> Vec<(String, String)>?


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/cli/mod.rs#L1

why not replace src/cli/mod.py and src/cli/parser.py with src/cli.py? I don't see a reason for a module with a single submodule


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/cli/parser.rs#L11-L20

did you check if clap has support for parsing PathBufs? you could replace the Strings here with them


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/cli/parser.rs#L32-L34

these lines could be just:

let dictionary = Dictionary::new(&dictionary_path);
// or
let dictionary = load_dictionary(&dictionary_path);
// or
let dictionary = Dictionary::from_file(&dictionary_path); // to indicate that it's not the default constructor and might take time

// then, if you implement `Display` for `Dictionary`

println!("{}", dictionary);

same for other structs.

it's really an antipattern to require both load and display for unrelated structs. the purpose of traits is to be able to treat instances of different structs the same way in some context - i.e. if different structs can be passed as an argument to a function.

here load means a different thing for each struct - Hasher::load performs computations on a (potentially) large vector while Dictionary::load reads a file and validates its contents.


https://github.com/LeweLotki/rusty-ripper/blob/f171c16be66901cec1cfeef83d9d520aee72f3ca/src/cli/parser.rs#L30

instead of parsing the combinations of flags, you could use subcommands like ripper dictionary <path>, ripper hasher <hash function> etc.


a general remark: I think you should return Result<T, E> in more places and handle the errors in cli.py

best, Mikołaj

LeweLotki commented 2 days ago

Thanks a lot for this review, I have applied most of your insights.