ariccio / altWinDirStat

An unofficial modification of WinDirStat

How does this thing go so fast? #19

Open lerlacher opened 6 years ago

lerlacher commented 6 years ago

Hi,

I am currently investigating a file sync issue where we had to exclude some folders from DFSR (Windows Fileshare Sync) because DFSR just cannot cope with the number of files.

altWinDirStat takes no more than a few seconds to process the whole folder (whereas WinDirStat runs for several hours and then aborts, with a VC++ Runtime error dialog box indicating that it was aborted).

What tricks does altWinDirStat use to index files this fast, and what is the simplest way I can leverage this to create folder structure diffs between two servers, ideally from a script (Python or PowerShell)?

I've clicked through to some of your docs but they don't seem to answer that basic question!

tabletguy commented 6 years ago

A long time ago, the developer of Everything (voidtools.com) explained how he made its searches so fast using the built-in Windows directory services (I think; I'm not sure whether it was that or another service). Anyway, he offers an SDK with a bundled DLL to do the same things. See https://voidtools.com. I don't know whether it's the same technique altWinDirStat uses, but that's the first place I would look. Or google the technology.

assarbad commented 6 years ago

Turns out @ariccio found a number of performance bottlenecks in WinDirStat. One very impressive result was the slowness introduced by CMap as opposed to the corresponding STL class.

"Everything" (the software) actually uses MFT parsing. I've been contemplating this method as well and will probably add it eventually to WinDirStat, but only as an alternative method. The reason being that it's limited to NTFS and can be a bit brittle with the release of new NTFS versions (it relies on internals) and it requires certain privileges that normal users don't have. So making this the only or default method would outright preclude unprivileged users from using WinDirStat in any meaningful way.

A handful of these improvements have already been implemented in WDS, but some of the bigger ones have not yet.

Look at some of the insights Alexander provided in issue #4 in response to my questions.

lerlacher commented 6 years ago

I've tried out Everything but it seems like it also falls over in my specific use case!

This is where being able to export the results of altWinDirStat would be really great for me... It seems like Everything could do that, but like I said, it also falls over.
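For the folder structure diff asked about earlier in this thread, a minimal Python sketch is possible without either tool. It assumes a plain `os.scandir` traversal, which is not how altWinDirStat or Everything work internally and will not match their speed. The function names `snapshot` and `diff_snapshots` are hypothetical, and only file sizes are compared, not contents or timestamps.

```python
import os


def snapshot(root):
    """Map each file's path (relative to root) to its size in bytes."""
    sizes = {}
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            entries = os.scandir(current)
        except OSError:
            continue  # unreadable directory: skip it rather than abort
        with entries:
            for entry in entries:
                try:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        rel = os.path.relpath(entry.path, root)
                        sizes[rel] = entry.stat(follow_symlinks=False).st_size
                except OSError:
                    continue  # entry vanished or is inaccessible
    return sizes


def diff_snapshots(left, right):
    """Return (only_in_left, only_in_right, size_differs) as sorted lists."""
    only_left = sorted(left.keys() - right.keys())
    only_right = sorted(right.keys() - left.keys())
    size_differs = sorted(
        path for path in left.keys() & right.keys() if left[path] != right[path]
    )
    return only_left, only_right, size_differs
```

Each snapshot could be taken on its own server and saved (for example with `json.dump`), so that only the two small listings need to be brought together for `diff_snapshots`.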

divinity76 commented 5 years ago

@lerlacher what do you mean by "falls over", and what kind of data do you want exactly? Something like

{
    "C:\\foo": {"usagePercent": 10, "bytes": 123},
    "C:\\bar": {"usagePercent": 30, "bytes": 1234}
}

?
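To make the proposed shape concrete, here is a rough Python sketch that produces a mapping like the one above for the immediate subfolders of a root. It is only an illustration built on `os.scandir`, not an export feature of altWinDirStat. The names `tree_bytes` and `usage_report` are hypothetical, and `usagePercent` is assumed to mean each subfolder's share of the combined subfolder total.

```python
import json
import os


def tree_bytes(path):
    """Sum the sizes of all regular files below path (symlinks not followed)."""
    total = 0
    stack = [path]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        if entry.is_dir(follow_symlinks=False):
                            stack.append(entry.path)
                        elif entry.is_file(follow_symlinks=False):
                            total += entry.stat(follow_symlinks=False).st_size
                    except OSError:
                        continue  # entry vanished or is inaccessible
        except OSError:
            continue  # unreadable directory: skip it
    return total


def usage_report(root):
    """Map each immediate subfolder of root to its byte total and share."""
    sizes = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes[entry.path] = tree_bytes(entry.path)
    grand_total = sum(sizes.values())
    return {
        path: {
            "usagePercent": round(100 * size / grand_total, 1) if grand_total else 0.0,
            "bytes": size,
        }
        for path, size in sizes.items()
    }


if __name__ == "__main__":
    # Report on the current directory; json.dumps escapes backslashes in paths.
    print(json.dumps(usage_report("."), indent=4))
```

Files sitting directly in the root are ignored here; whether they should count toward the percentages is a design choice the thread does not settle.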