Open neilmayhew opened 2 years ago
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.
Tagging subscribers to this area: @dotnet/area-system-io See info in area-owners.md if you want to be subscribed.
Author: | neilmayhew |
---|---|
Assignees: | - |
Labels: | `area-System.IO`, `tenet-performance`, `untriaged` |
Milestone: | - |
@neilmayhew I agree that the current implementation is not perfect.
@tmds do you have any recommendations?
We can look at improving the implementation. Maybe some simpler changes can be made related to:
To make things worse, many ASP.net applications running in a container are incorrectly configured and use more FileSystemWatchers than they need to.
@neilmayhew can you give some more information about this?
@tmds Sorry for the slow reply.
Both of the blogs I linked to in the description mention ASP.net, and refer to a solution that looks like this:
configuration = new ConfigurationBuilder()
.SetBasePath(Directory.GetCurrentDirectory())
.AddJsonFile("appsettings.json", optional: true, reloadOnChange: false);
However, one of the comments on the first blog says that wasn't enough and links to a Stack Overflow question about WebApplicationFactory
that has a recursive solution. That in turn links to another question, which doesn't seem to be about ASP.net specifically. So I don't think this is exactly a problem with ASP.net, more about the default behavior of ConfigurationBuilder
, but it seems that the problem crops up more often with, or is made worse by, ASP.net.
A web search for the error message in question turns up a zillion hits, so a lot of people are encountering this, but previously no-one seems to have figured out the root of the problem and everyone is just finding ways to work around it.
It should be possible to watch thousands of files and directories on Linux without much overhead, but that's not possible with the way it's implemented currently.
The solution I'm thinking of is to use class variables for a common inotify instance and background thread, instead of instance variables and an instance and thread per FileSystemWatcher
object. The common thread would send events just like the single thread does, but it would need to keep a dictionary mapping from inotify watch descriptor to FileSystemWatcher
so it can obtain the .Net event it needs to send when it reads an inotify event from the inotify instance. This would be only slightly more complex than the current implementation, and the complexity would be concerned only with mutexing and keeping the dictionary updated when FileSystemWatcher
instances are created and destroyed.
BTW, I don't think it would be appropriate to switch the implementation to use fanotify
instead of inotify
. Although fanotify
is newer, it's designed for a different use case and isn't a replacement for inotify
. I'll update the description.
I see that the implementation already uses a mutexed dictionary for mapping watch descriptors to absolute paths (_wdToPathMap
) so a lot of the infrastructure I discussed is already in place and could be repurposed quite easily.
referenced pull request is not a proper fix, but a workaround... :(
My view on this is there is a fundamental limitation in the API of FileSystemWatcher that causes this issue: it can only watch a single path. And because of that, user's are drawn to create multiple FileSystemWatchers, to watch multiple directories. And since a single FileSystemWatcher holds a single IOCP port, or inotify queue, this is artifically inflating the number of such queues that the system needs to allocate.
In IKVM we had been using FileSystemWatcher as the backing support for our java.nio.file.WatchService support. But, we ran into this issue. WatchService in Java starts out pretty much the same: user's create an instance, and that instance owns a queue or port. However, user's then register and unregister multiple potentially unrelated paths to listen to with the service. So, in that ecosystem, user's create WatchService instance less frequently. Usually one per problem domain. But then register large directory hierarchies, etc, with that single one. But they're in control of how many ports they allocate, and how many directories they watch with each.
So, as an API suggestion that might alleviate this: support multiple Paths. Yes, FileSystemWatcher.Path is a single property. And the class takes a single value in it's ctor. But maybe there's some clean way to extend this a bit.
@wasabii Your understanding matches what I was saying in the description. However, I'm suggesting a solution that doesn't require any changes to the API. (I'm pretty sure changing the API isn't an option.)
In the solution I'm suggesting, there would be a single, global inotify
queue for all FileSystemWatcher
s (ie a class variable) instead of having a separate inotify
queue per FileSystemWatcher
(ie an instance variable) as there is at present.
As I've outlined, I think this would be relatively simple to implement. I wish I had the time to do it myself but the job I'm in now doesn't use .Net and so I can't justify it.
Description
There are many reports on the web of running into this message:
The Linux implementation of
FileSystemWatcher
creates a new thread with its own inotify instance for every directory that's watched. This is very expensive in kernel resources, and often hits system limits as can be seen from the message. It's particularly problematic when running in a container because the per-UID quota isn't namespaced and is therefore shared among all containers on the same host. To make things worse, many ASP.net applications running in a container are incorrectly configured and use moreFileSystemWatcher
s than they need to. Such a container consumes much more than its fair share of the available inotify instances, which then causes other containers in the pod (eg with k8s) to fail unexpectedly.Instead, the implementation should have a single thread with a single inotify instance that's shared by all the
FileSystemWatcher
s in the process. Instead of adding new instances (which are expensive) the implementation would add new watches (which are cheap). It would then be possible to accommodate even the most egregious of ASP.net apps with ease..Configuration
Any Linux system.
Regression?
No. This has been going on for a long time and there are a lot of puzzled blog posts about it.
Data
Blog posts:
Related issues:
27272
Analysis
The inotify system was never intended to be used like this.
However, rather than changing the implementation to use a single shared thread, it might be a better use of resources to write a new implementation usingfanotify
as suggested in:#30495