shundhammer / qdirstat

QDirStat - Qt-based directory statistics (KDirStat without any KDE - from the original KDirStat author)
GNU General Public License v2.0

Show real-time changes, like SpaceSniffer? #183

Closed auwsom closed 3 years ago

auwsom commented 3 years ago

Hi, there is a great feature with SpaceSniffer (Windows) that shows the real-time changes on the file system after displaying the directories in a very similar way to QDirStat, by showing little flashes when a directory is written to. Is there any way to port that functionality? I know there is inotifywait, but it would be great to have a system-wide view of the filesystem (maybe minus a couple of cache directories).

It is on GitHub: https://github.com/marzwu/SpaceSniffer/ Wikipedia has a screenshot: https://en.wikipedia.org/wiki/SpaceSniffer

shundhammer commented 3 years ago

First and foremost: What is the use case for such a feature? Please elaborate.

shundhammer commented 3 years ago

Man page for inotify: https://linux.die.net/man/7/inotify

This is Linux specific; as far as I can tell, it is not available (at least there is no 1:1 counterpart) for BSD, let alone other Unix derivatives. That's a bad start.

This takes a watch descriptor; probably there is a limited number of them, and for sure this will consume system resources, and it will also affect performance.

The root filesystem of my Xubuntu 18.04 LTS has 29070 directories:

[Screenshot: Xubuntu-18-04-LTS-root]

I don't think it is feasible or even desirable to watch some 29k directories.
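The resource concern above can be checked on a given machine. inotify watches are not recursive, so a tree-wide monitor needs roughly one watch per directory, and Linux caps the number of watches per user in `/proc/sys/fs/inotify/max_user_watches`. The following is a minimal sketch, not anything QDirStat does; the function names are made up for illustration, and the limit file only exists on Linux.

```python
import os
from pathlib import Path

# Linux-specific: per-user limit on inotify watches (see inotify(7)).
MAX_WATCHES_FILE = Path("/proc/sys/fs/inotify/max_user_watches")


def read_max_user_watches(limit_file=MAX_WATCHES_FILE):
    """Return the per-user inotify watch limit, or None if it cannot be read."""
    try:
        return int(Path(limit_file).read_text().strip())
    except (OSError, ValueError):
        return None


def count_directories(root):
    """Count root plus every readable directory below it (symlinks not followed)."""
    total = 0
    for _dirpath, _dirnames, _filenames in os.walk(root, followlinks=False):
        total += 1  # os.walk yields one tuple per directory it could read
    return total


def watch_budget(root, limit_file=MAX_WATCHES_FILE):
    """Return (watches needed, configured limit, whether the tree fits)."""
    needed = count_directories(root)
    limit = read_max_user_watches(limit_file)
    fits = None if limit is None else needed <= limit
    return needed, limit, fits
```

Calling `watch_budget(os.path.expanduser("~"))` gives a rough idea of how many watches a home directory alone would need. It says nothing about the kernel memory or event traffic those watches would cost.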

I know that SpaceSniffer always only presents the very trivial cases in their promo screenshots, even on Windows where there are only 10% of the number of files and directories that a typical Linux has; their way of annotating every single treemap tile only works in very trivial cases. It would break down completely even in my home directory. Even any of the web browser cache directories there would already completely overwhelm it. ;-) And I am pretty sure that it's no different for any directory watching showcase.

auwsom commented 3 years ago

@shundhammer Thanks for your reply!

The use case is to simply be able to see the changes taking place on someone's computer. I think that is a very basic and reasonable desire.

As for feasibility:

Thanks again for your response.

shundhammer commented 3 years ago

No, this is not a use case. Sorry, "it would be cool" is explicitly what a use case is not. ;-)

What (real-world!) problem would that solve for a user? What can a user do with this information that isn't possible without it?

shundhammer commented 3 years ago

As for the implementation: No, it's not easy. It would be a nightmare. The worst case for a program that holds a large in-memory database is when parts of that database keep changing all the time, when events are constantly coming in that cause part of the tree to be re-read, i.e. the old branch deleted and everything read fresh from disk.

This causes a ton of synchronization problems; it would also constantly throw the user off-track because things like the current scroll position in the tree view, the current treemap layout, and which branches are collapsed or expanded in the tree view would constantly be changing. That would also be a user interface nightmare.

All that is bad enough when you do manual changes; i.e. when you delete (or move to the trash) one or more files from within QDirStat. But in that case, a user will have more appreciation because it was him who initiated that operation. But if it happens from the outside, e.g. because some process keeps writing to a directory, it would constantly do weird things at the worst possible moment.

Don't forget that SpaceSniffer is a very simplistic application that doesn't let you do too many things; the user interaction is very limited.

shundhammer commented 3 years ago

Also, for compromises that a user requesting a specific feature would be willing to accept, I can tell you from many, many experiences that for every user who is that understanding, there will come 100 blockheads who keep making demands, who will complain that it's not 100% perfect.

Compromises are never honored. Never ever.

Been there. Done that. Got the T-shirt; as a matter of fact, a whole drawer full of them. ;-)

shundhammer commented 3 years ago

As for /proc and /sys, QDirStat doesn't read any of those unless you explicitly request it. By default, it doesn't read any mounted filesystem, only the one that you started from. In most cases, it's not very useful to descend into mounted filesystems: When your root filesystem is full, you want to see what fills it and not be confused by other stuff on other filesystems.

There are operations like Continue Reading at Mount Point, of course; both after reading a directory tree and initially in the Open Directory dialog.
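The "stay on one filesystem" behavior described above can be illustrated with a small sketch. This is not QDirStat's implementation (QDirStat is C++/Qt); it only shows the general technique of comparing each directory's device number with that of the starting directory. The function names are hypothetical.

```python
import os


def _same_device(path, dev):
    """True if path lives on device number dev; unreadable entries count as False."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        return False


def walk_one_filesystem(root):
    """Yield root and each directory below it that is on the same device as root."""
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, _filenames in os.walk(root, topdown=True):
        # Prune in place so os.walk never descends into other filesystems.
        dirnames[:] = [
            name for name in dirnames
            if _same_device(os.path.join(dirpath, name), root_dev)
        ]
        yield dirpath
```

Started at `/`, a walk like this would typically skip `/proc` and `/sys`, because they are separate mounted filesystems with their own device numbers. A bind mount of the same device would not be detected this way.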

auwsom commented 3 years ago

Sorry my use case was vague, but yes I want a 'hammer'. Why? ..for all kinds of things that a narrow use case could never convey. I gave my current specific example about following installations. I could list changes with inotifywait, but wouldn't have any visualization of how the files relate to each other spatially, besides a scrolling list that is incomprehensible to a human. A visualization like SpaceSniffer makes that data understandable in a way that no stdout list could.

Being able to see the file system is not just "because it would be cool", it's a legitimate desired usage for many high level examinations of a system.

I realize you are asking for a 'use case' good enough in your mind to warrant the creation of the feature. It seems you are reluctant to upset common users, understandably, but simply adding the feature with a button to enable it would solve that. I wasn't suggesting the feature should be default behavior. Opt-in, not opt-out.

On a side note, what software do you use to design such a Linux application like QDirStat? Some kind of a Qt designer?

I didn't say implementation is easy. I proposed a simple solution to your feasibility concern about certain directories by excluding them.

Re: the /proc and /sys.. that's great. Then it's already set up to exclude certain locations :+1:

shundhammer commented 3 years ago

> SpaceSniffer does work, so it is possible.

They track something in the order of 29k directories? I am certain that they don't.

> I'm currently reorganizing my system (hence the need to see where things are installing; on Linux this could be a large number of locations, since few devs follow standards, which are loose to begin with). Is there a more detailed way to track where various tools are installing to, using strace or other tools? Maybe.. but is just being able to look directly at most of the entire storage system quicker and more straightforward? I would argue definitely yes.

So you believe a GUI that flashes somewhere in its visualization when some process is writing somewhere would be helpful for that? I don't think so.

If you are interested where the files of a software package are installed to: That's already there in QDirStat with the packages view:

https://github.com/shundhammer/qdirstat/blob/master/doc/Pkg-View.md

Notice that this is strictly for the packaged files of a software package, not for files that it creates dynamically when you run the program.
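For comparison, the same question ("where did this package put its files?") can be asked of the package manager directly. The sketch below assumes a dpkg-based system and uses `dpkg -L`, which lists the paths a package installed. It is an illustration only and makes no claim about how the packages view is implemented; the function names are hypothetical.

```python
import subprocess
from collections import Counter


def parse_listing(text):
    """Keep only the lines of `dpkg -L` output that are absolute paths."""
    return [line.strip() for line in text.splitlines() if line.strip().startswith("/")]


def list_package_files(package):
    """Return the paths printed by `dpkg -L <package>` (dpkg-based systems only)."""
    result = subprocess.run(
        ["dpkg", "-L", package], capture_output=True, text=True, check=True
    )
    return parse_listing(result.stdout)


def top_level_summary(paths):
    """Count listed entries (files and directories alike) per top-level directory."""
    counts = Counter()
    for path in paths:
        if path in ("/", "/."):
            continue  # dpkg lists the root itself as "/."
        first = path.strip("/").split("/")[0]
        counts["/" + first] += 1
    return counts
```

`top_level_summary(list_package_files("bash"))` would show how a package's entries are spread across `/usr`, `/etc` and so on. As with the packages view, this covers only packaged files, not anything the program creates at run time.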

shundhammer commented 3 years ago

I suggest that first of all you use the program the normal way to learn its capabilities.

For use cases such as finding out where a program writes its config files etc., start QDirStat with your home directory, then use "Discover" -> "Newest Files".

Of course, if many things are going on on your machine, you will find that list polluted by your web browser writing tons of data to your home directory for caches etc.; so you might want to close all browsers and possibly even clean their caches just before doing that. When things quiet down, you have a much better chance to identify the locations of files that you are interested in.

But that's exactly the same problem that an animated live view of ongoing disk writes would show you; such a live view would actually be even worse because it's just a momentary thing with no way to keep track of anything. It's just a fleeting moment, and gone the next moment.
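The idea behind a "newest files" listing can be sketched in a few lines: walk the tree, record each file's modification time, and keep the most recent entries. The version below is a rough approximation under stated assumptions, not QDirStat's code; the function name, the `limit` parameter, and the default of skipping directories named `.cache` are choices made for this example.

```python
import heapq
import os


def newest_files(root, limit=20, skip_dir_names=(".cache",)):
    """Return up to `limit` (mtime, path) pairs under root, newest first."""
    candidates = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune noisy directories (e.g. caches) before descending into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dir_names]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                candidates.append((os.lstat(path).st_mtime, path))
            except OSError:
                continue  # file vanished or is unreadable; ignore it
    return heapq.nlargest(limit, candidates)
```

Unlike a live animation, the result is a list that stays put: it can be sorted, filtered, or compared with a second run taken after installing or starting the program of interest.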

shundhammer commented 3 years ago

> On a side note, what software do you use to design such a Linux application like QDirStat? Some kind of a Qt designer?

Qt Designer (the Qt Designer that comes with Qt) for the dialogs, for the main window's outer frame, and for snippets inside it separately, so those parts are modularized: Many of them are in fact separate widget classes so they are also reusable.

Qt Designer is smarter than most other user interface builders: It creates an XML file from which the user interface compiler (the uic command) generates a C++ header file that you simply #include. That is code that you never have to touch (unlike the code that Visual Studio used to create when I last used it many years ago); yet it provides all checks for variable names and types, and you can use those variables directly. It's a great software design.

As for the rest... just look at the code. That might take a while; it multiplied like bunny rabbits over the years. ;-)

shundhammer commented 3 years ago

> Sorry my use case was vague, but yes I want a 'hammer'.

You know: For people who only have a hammer every problem looks like a nail... ;-)

> Why? ..for all kinds of things that a narrow use case could never convey. I gave my current specific example about following installations.

That's more like it; see above: Package view.

> I could list changes with inotifywait, but wouldn't have any visualization of how the files relate to each other spatially, besides a scrolling list that is incomprehensible to a human. A visualization like SpaceSniffer makes that data understandable in a way that no stdout list could.

I fear you are greatly mistaken how useful such a visualization would be. You'd simply be drowned in information overload. All you would see is that a lot of stuff is going on.

Data visualization is not an easy thing. It totally depends on the use case; if the use case is unclear, the visualization can't be helpful.

> I realize you are asking for a 'use case' good enough in your mind to warrant the creation of the feature. It seems you are reluctant to upset common users, understandably, but simply adding the feature with a button to enable it would solve that. I wasn't suggesting the feature should be default behavior. Opt-in, not opt-out.

Forget that concept. There is no such thing as "simple" in software design, much less in GUI design. A borked GUI design can easily break a piece of software. If a feature doesn't integrate nicely and naturally, it has no justification to exist in the program.

In GUI design, elegance is not reached when there is nothing more that you can think of to add; to the contrary, it is reached when there is nothing that can be taken away without hurting users.

QDirStat is constantly hovering on the verge of feature overload. Over the last couple of years I found that almost nobody ever seems to use most of the newer features; at least I heard nobody talk or write about any of them, not even the click-bait "reviewers" who write pages upon pages of content-less articles, mostly telling people the obvious, and things that are trivial to explain: How to install it (a part they copy and paste from their other seven dozen almost identical articles) and how to start it. None of them ever got around to writing about the more advanced features, not even the mildly advanced ones.

That of course raises the question of whether they are useful at all, or whether they should be dropped.

auwsom commented 3 years ago

Well, I have to say a big THANK YOU! (despite the cynicism about the value of seeing live changes) I didn't know QDirStat had this capability for following package installations. That is a huge help on Linux systems, so again thank you for pointing that out! (Did KDirStat have this?)

Also, thank you for pointing out the 'time chunk' feature. That is also super helpful, not only for package installations, but for following other processes (and yes, can we drop the browser and other cache directories caveats? We've agreed several times that it is a problem, and we've stated solutions). Again, thank you. That is similar to some tools like Process Explorer and System Explorer (I think) that I used to use on Windows, both GUI-based with filters to exclude noise. Those tools are very noisy to start with because they show every process or system call. But that is the point: starting with seeing everything and narrowing down with various filters of location, time, parent process, you name it.. (I'll come back to the visualization point in a second).

Thanks also for the Qt Designer info. That gives me a great place to start if I need to write some tools. :+1:

As for the storage changes visualization, I'll try to be brief, but this is one way for humans to interact with digital systems in a way that's understandable. One major difficulty is that 1s and 0s don't have a physical manifestation that is comparable to the world in which humans have evolved. 'Watching' a system through a visualization grounds a user in something we have evolved to handle. It's the same reason we write higher and higher levels of programming languages, from machine code up until it comes close to speaking English. And I'd argue what is good for the gander.. for example, when binary data is stored less fragmented (and more visualizable) it is faster to access with disk drives. This should seem obvious, because this is the intent or value of what QDirStat is already doing: 'seeing' your data. Adding real-time to that is a natural progression, even if you think the noise or implementation is somehow insurmountable and that the value is not worth the additional work involved. At least others may find these posts and continue forward. I will hand it to you that QDirStat is already an amazing tool. So, once again, thank you for that!

It would be great to have a QDirStat for processes. Do you know of a visualization for process trees? I view the data storage as the 'muscles and fat' of a system and the processes as the nerves. It would be great to be able to see them both visually.. to see which processes are actively writing to which locations.

I think QDirStat is one of the most important tools for Unix now, besides some of the package listing commands like dpkg -l and dpkg -L name, because of the integrated libraries and sparse storage locations inherent to it. Your pointing out the existing features above solved my initial problems, but thinking deeper about them, I still believe visualizing digital systems in a way similar to how we interact with physical systems will help us coevolve with computing systems.

auwsom commented 3 years ago

I would disagree about feature creep. I appreciate an intuitive UI, but that is not exclusive of having many features (not ignoring that it is a challenge).

In fact, I would argue that my proposed design of mimicking physicality can make features more intuitive and accessible. If I wanted a way to do a 'DNA extraction' to see the code behind a system, and was presented with a visualization of processes and storage interacting, it would be intuitive to filter and segregate visual representations based on activity, until I could correlate them with a desired target I was perturbing (package installation for example), or until I could see the top process of a 'nerve' tree and follow it down to where it was writing. That example would be for organizing my storage, but other 'use cases' come naturally and are also 'naturally' implemented, not from the bottom up, but from the top down as emergent properties of being able to see the entirety of a complex system. For example, one could color code errors and error traces from multiple 'broken' applications to see whether they may be connected to the same problem. Also refactoring redundant OS code comes to mind. The list goes on..

auwsom commented 3 years ago

This package view is SO great! I feel like it just solved one of my main issues with Unix over Windows. Basically each one of these package locations acts as an exe file or whathaveyou.

shundhammer commented 3 years ago

Okay, glad to hear that somebody appreciates the package view. I thought I was the only one. ;-)

And that's the major factor behind Open Source software, and it is what in so many cases makes it superior to closed source commercial software: It's written by people who really want to use it themselves. Nerds write Open Source software for themselves; others get to benefit from it by pure luck, as a windfall. And the nerds fiddle with it and keep forever fine-tuning it.

So, if the package view raised your interest, you may also be interested in:

Experiment with them. Get creative.

shundhammer commented 3 years ago

If you are interested in visualizations, don't miss this: https://d3js.org/

They have some very cool ones, and you can use them all from their website. They brought visualizations to a whole new art form.

Make sure to check out their examples about world bank funding for projects all over the world where you can see money streams across the globe over time; wealth and population of nations over time; and all kinds of other cool things. Click through their gallery and prepare to be amazed. ;-)

auwsom commented 3 years ago

Yes, I did see those as well, once you keyed me into them. I wouldn't have had a motivation to look under the file menu for something new after having used WinDirStat and KDirStat for so long. You might want to make those features icons on the main toolbar, or if you're going for a clean look, possibly make a new menu category for them called 'tools' or something.

I couldn't get the file size tool to work. I waited until I had checked that the package is the latest before letting you know. I'm on Kubuntu 20.04. There's no option for it, and the key shortcut in the GH repo discussing it doesn't trigger it.

One more comment about the GH repo. If this is the main site, you may want to 'feature' these features a little more prominently than a couple of hypertext links a couple of scrolls down in the info. Maybe a Features/Tools section in the table of contents. I had to use a page search to find it.

Thanks again!