Chris-Dobbins / listdupes

A duplicate file checking utility for the command-line, written in Python with no external dependencies.
BSD 2-Clause "Simplified" License

Notes on Modularizing #1

Closed: Chris-Dobbins closed this 2 years ago

Chris-Dobbins commented 2 years ago

Idea for further modularizing the app:

  1. Default to writing to a CSV with a fixed name.
  2. Optionally, write the CSV stream to stdout instead, letting the user name the file via shell redirection. This would effectively be filter functionality, enabled via an arg.

Note that this would allow make_file_path_unique to be moved out of main if desired.
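A minimal sketch of the two output modes, not the project's actual code. It assumes the results are a dictionary mapping a file path to a list of its duplicates' paths; the function names `write_dupes_csv` and `output_results`, the column headers, and the default file name are all hypothetical.

```python
import csv
import sys


def write_dupes_csv(dupes, stream):
    """Write one CSV row per file: the file's path, then its duplicates."""
    writer = csv.writer(stream)
    writer.writerow(["File", "Duplicates"])  # Hypothetical column headers.
    for path, duplicate_paths in dupes.items():
        writer.writerow([path, *duplicate_paths])


def output_results(dupes, to_stdout=False, file_name="listdupes_output.csv"):
    """Stream the CSV to stdout, or write it to a fixed-name file."""
    if to_stdout:
        write_dupes_csv(dupes, sys.stdout)
    else:
        # newline="" is what the csv module's docs recommend for files.
        with open(file_name, "w", newline="", encoding="utf-8") as csv_file:
            write_dupes_csv(dupes, csv_file)
```

Because both modes share one writer that only needs a text stream, any logic for making the file name unique can live next to the `open` call rather than inside main.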

Remember that there isn't really a meaningfully pure version of the main function. All it would do is:

  1. Take checksum_paths's return value as an arg.
  2. Sort it.
  3. Call find_dupes on it.
  4. Return find_dupes's value.

So main probably shouldn't be pure, but it would be more usefully modular if it only ever needed to return one kind of output, a dictionary.
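To show how thin that pure version would be, here is a sketch of the four steps listed above. The shape of checksum_paths's return value (a list of `(checksum, path)` tuples) and the body of `find_dupes` are assumptions made for illustration; the real functions may differ. `pure_main` is a hypothetical name.

```python
from itertools import groupby
from operator import itemgetter


def find_dupes(sorted_checksums):
    """Stand-in for the project's find_dupes (assumed behaviour).

    Expects (checksum, path) tuples sorted by checksum, and maps the first
    path in each group to the other paths that share its checksum.
    """
    dupes = {}
    for _, group in groupby(sorted_checksums, key=itemgetter(0)):
        paths = [path for _, path in group]
        if len(paths) > 1:
            dupes[paths[0]] = paths[1:]
    return dupes


def pure_main(checksums_and_paths):
    """Take the checksums as an arg, sort them, and return find_dupes's value."""
    return find_dupes(sorted(checksums_and_paths))
```

With nothing left but a sort and a single call, the pure function adds very little over calling `find_dupes` directly.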

Ideally, its return tuple shouldn't contain anything that wouldn't be useful to another, more general kind of app. So, does that mean it shouldn't pass through a filter arg?

What if, instead of passing the whole args namespace object to the main function, we made the main function take just one or two args? Like:

  1. args.starting_folder
  2. args.progress

So that we could:

  1. Call get_listdupes_args
  2. Pass args.starting_folder and args.progress to main.
  3. Test args.filter to see if main's output should be written to disk or streamed to stdout.
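The three steps above could be wired together roughly like this. It is a sketch under assumptions: the body of `get_listdupes_args`, the flag spellings, the placeholder `main`, the wrapper name `run`, and the default file name are all hypothetical. Only the attribute names `starting_folder`, `progress`, and `filter` come from the notes.

```python
import argparse
import csv
import sys


def get_listdupes_args(argv=None):
    """Parse the command line (a stand-in; the flag names are assumptions)."""
    parser = argparse.ArgumentParser(prog="listdupes")
    parser.add_argument("starting_folder")
    parser.add_argument("-p", "--progress", action="store_true")
    parser.add_argument("-f", "--filter", action="store_true")
    return parser.parse_args(argv)


def main(starting_folder, show_progress=False):
    """Placeholder for the real search; always returns a dictionary."""
    if show_progress:
        print(f"Searching {starting_folder}...", file=sys.stderr)
    return {}  # The real function would return the duplicates it found.


def run(argv=None, file_name="listdupes_output.csv"):
    """Parse the args, call main, then choose where the output goes."""
    args = get_listdupes_args(argv)
    dupes = main(args.starting_folder, args.progress)
    rows = [[path, *others] for path, others in dupes.items()]
    if args.filter:
        csv.writer(sys.stdout).writerows(rows)
    else:
        with open(file_name, "w", newline="", encoding="utf-8") as csv_file:
            csv.writer(csv_file).writerows(rows)
    return dupes
```

Here main never sees the args namespace, so the choice of output destination stays entirely in the command-line wrapper.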

The advantages of this:

  1. The module would expose main as a single convenience function that developers could call with a path and that would return a dictionary.
  2. When used as a terminal command, the app could behave in a more filter-like capacity by outputting a stream of CSV. It could even potentially take a stream of paths as input.
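If the app did accept a stream of paths as input, the reading side could be as small as the sketch below. The function name `read_paths` and the one-path-per-line format are assumptions; the notes don't specify either.

```python
def read_paths(stream):
    """Yield one path for each non-empty line of a text stream."""
    for line in stream:
        path = line.rstrip("\n")
        if path:
            yield path
```

Calling `read_paths(sys.stdin)` would then let another command pipe its paths into the app.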

Chris-Dobbins commented 2 years ago

Version 6 incorporates these ideas in an improved form.