Title: inotify watcher is O(N) of watches for each file change
Description:
When using the filesystem watcher for EDS configuration, I noticed significant performance overhead for configuration changes. With a large number of clusters (I've tested with ~1000) updating somewhat frequently (I've tested updating them all every 10s), Envoy ends up spending a significant portion of its time simply responding to inotify events. I think I've narrowed it down to a few factors at play:
1. The inotify watcher watches the directory instead of the file itself. What should be O(1) for a single file change ends up being O(N), since any change in the directory triggers every watcher registered on it.
2. Because the directory is watched, even non-watched files trigger inotify events. This was particularly bad in my case because we were writing temporary files into the same directory.
3. The inotify watcher watches IN_ALL_EVENTS, which includes irrelevant events like IN_ACCESS.
I've managed to work around this by having each EDS configuration in its own directory, not writing temp files in those directories, and minimizing accesses and other file operations on these files.
I have not tried this using the kqueue watcher.
Repro steps:
Create a large number of clusters with an eds_config file for each:
# clusters.yaml jinja template
resources:
{%- for service in services %}
- "@type": "type.googleapis.com/envoy.api.v2.Cluster"
  name: {{ service }}.cluster
  type: EDS
  eds_cluster_config:
    eds_config:
      path: /tmp/xds/endpoints/{{ service }}.yaml
{%- endfor %}
You then just need a basic endpoint file for each cluster (you don't even need endpoints)
# {{ service }}.yaml
resources:
- "@type": type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
  cluster_name: {{ service }}
  endpoints:
  - lb_endpoints:
I was recreating all of these files every 10 seconds or so. Clearly this is the wrong thing to do, but for the first version, it was sufficient.
I've uploaded a pprof callgraph from the cpuprofiler output here. It's in SVG format, so you'll need to open the image directly (e.g. in Chrome, right-click and choose "Open image in new tab").