twpayne / find-duplicates

Find duplicate files quickly.
MIT License

Improve directory walk performance #1

Open twpayne opened 11 months ago

twpayne commented 11 months ago

The find command seems to perform much better than Go's filepath.WalkDir.

stapelberg indicated that bradfitz (no mentions to avoid spamming) investigated this as part of goimports and was able to significantly improve performance, maybe by using the right syscalls.

stefanobaghino commented 11 months ago

This looks like the relevant conversation. From what I understand, the problematic function was filepath.Walk. filepath.WalkDir is a big improvement (although, according to the conversation, it's still not as fast as find) because it "uses the new os.DirEntry [...] type to avoid a stat for every file".