sreedevk / deduplicator

Filter, Sort & Delete Duplicate Files Recursively
MIT License
281 stars 15 forks

Switch to globwalk #37

Closed: beeb closed this 1 year ago

beeb commented 1 year ago

Not sure if you'll agree to this change, but I was not able to scan my full home directory with deduplicator, probably because it was trying to keep too many file descriptors open. globwalk seems to handle this much better (its default limits the number of open descriptors to a sane value). With this change I was able to scan my full home dir.

I have not tested it extensively but it seems to work for me, even with the --types argument.
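
Note for readers (illustration only, not code from this PR): the descriptor problem described above can be pictured with a small std-only sketch. It walks a tree iteratively and finishes reading each directory before opening the next one, so at most one directory handle is open at a time. This is not how globwalk is implemented, and the thread does not say where deduplicator's descriptors actually accumulated; the function name `list_files` is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns every regular file under `root`.
/// Directories are queued instead of being entered recursively, so the
/// walk never holds more than one open directory handle.
fn list_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut pending = vec![root.to_path_buf()]; // directories still to read
    let mut files = Vec::new();

    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // `DirEntry::file_type` does not follow symlinks.
            let file_type = entry.file_type()?;
            if file_type.is_dir() {
                pending.push(entry.path()); // visit later, after this handle is closed
            } else if file_type.is_file() {
                files.push(entry.path());
            }
        }
        // The `ReadDir` iterator and its entries are dropped here,
        // releasing the directory handle before the next one is opened.
    }
    Ok(files)
}
```

Calling `list_files(Path::new("."))` returns the files below the current directory in no particular order. A recursive walk, by contrast, keeps one handle open per directory level it is currently inside.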

sreedevk commented 1 year ago

Hey @beeb! I've been trying to resolve this issue for a bit, and I'm glad you thought of the same thing too. I was trying to build a custom multi-threaded directory walker (I couldn't get jwalk to work), but it didn't really work with the progress indicator. I think this is definitely a step in the right direction. I will review and merge this.

beeb commented 1 year ago

As a next step, I think it would be good to switch to the Builder and set the BASE_PATH like in the example.

sreedevk commented 1 year ago

@beeb I agree. We can also add --max-depth and --follow-links options to deduplicator and map them directly to the builder.
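
Note for readers (illustration only, not code from this PR): a rough std-only sketch of what the two proposed flags would control. It does not use globwalk; in the real change the values would be handed to the builder rather than interpreted by hand. The `WalkOptions` struct, the `list_files_with` function, and the depth convention (direct children of the root are at depth 1) are all assumptions made for this sketch.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical options mirroring the proposed `--max-depth` and
/// `--follow-links` flags.
#[derive(Debug, Clone, Copy, Default)]
struct WalkOptions {
    /// `None` means unlimited; `Some(1)` only looks at the root's direct children.
    max_depth: Option<usize>,
    /// When true, symlinks are resolved before deciding file vs. directory.
    follow_links: bool,
}

/// Hypothetical helper: lists regular files under `root`, honouring `opts`.
/// There is no symlink-loop detection, so `follow_links` is best combined
/// with a `max_depth` in this sketch.
fn list_files_with(root: &Path, opts: WalkOptions) -> io::Result<Vec<PathBuf>> {
    // Each queued directory carries the depth of the entries it contains.
    let mut pending = vec![(root.to_path_buf(), 1usize)];
    let mut files = Vec::new();

    while let Some((dir, depth)) = pending.pop() {
        if opts.max_depth.map_or(false, |max| depth > max) {
            continue; // entries in this directory would be too deep
        }
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            // `metadata` resolves symlinks, `symlink_metadata` does not.
            let meta = if opts.follow_links {
                fs::metadata(&path)
            } else {
                fs::symlink_metadata(&path)
            };
            let meta = match meta {
                Ok(m) => m,
                Err(_) => continue, // e.g. a dangling symlink; skip it
            };
            if meta.is_dir() {
                pending.push((path, depth + 1));
            } else if meta.is_file() {
                files.push(path);
            }
        }
    }
    Ok(files)
}
```

With `WalkOptions::default()` the walk is unlimited and treats symlinks as neither files nor directories, so they are skipped. Parsed CLI values would fill the two fields.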