Title: inotify watcher is O(N) of watches for each file change
Description:
When using the filesystem watcher for EDS configuration, I noticed significant performance overhead for configuration changes. With a large number of clusters (I've tested with ~1000) updating somewhat frequently (I've tested updating them all every 10s), Envoy ends up spending a significant portion of its time simply responding to inotify events. I think I've narrowed it down to a few factors at play:
1. The inotify watcher watches the directory instead of the file itself. What should be O(1) for a single file change ends up being O(N), since any change in the directory triggers every watcher registered on it.
2. Because the directory is watched, even non-watched files trigger inotify events. This was particularly bad in my case because we were writing temporary files into the same directory.
3. The inotify watcher watches IN_ALL_EVENTS, which includes irrelevant events like IN_ACCESS.
I've managed to work around this by having each EDS configuration in its own directory, not writing temp files in those directories, and minimizing accesses and other file operations on these files.
I have not tried this using the kqueue watcher.
Repro steps:
Create a large number of clusters with an eds_config file for each:
# clusters.yaml jinja template
resources:
{%- for service in services %}
- "@type": "type.googleapis.com/envoy.api.v2.Cluster"
  name: {{ service }}.cluster
  type: EDS
  eds_cluster_config:
    eds_config:
      path: /tmp/xds/endpoints/{{ service }}.yaml
{%- endfor %}
You then just need a basic endpoint file for each cluster (you don't even need endpoints)
# {{ service }}.yaml
resources:
- "@type": type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
  cluster_name: {{ service }}
  endpoints:
  - lb_endpoints:
I was recreating all of these files every 10 seconds or so. Clearly this is the wrong thing to do, but for the first version, it was sufficient.
I've uploaded a pprof callgraph from the cpuprofiler output here. It's in SVG format, so you'll need to open the image directly (e.g. in Chrome, right-click and choose "Open image in new tab").