Closed cottsay closed 4 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 83.54%. Comparing base (6cf24ea) to head (69e20f9).
:umbrella: View full report in Codecov by Sentry.
I think we can do better than this, actually.
I didn't know this, but the importlib.metadata API for distributions and entry_points does absolutely no caching at all, to the point where even accessing properties on distribution objects typically results in reading the metadata from disk every time. I did some fooling around and got my previous 0.4s down to under 0.3s by specifically caching the underlying metadata to avoid the disk reads. Some strategic structuring of that underlying data to avoid iterating over it might yield even more savings.
I didn't realize that the startup performance had regressed so badly. SSDs and OS caching hide how much IO is happening here. I can imagine that cold invocations on spinning disks are brutal...
Alright, I dropped the lru_cache stuff in favor of an explicit cache. This change brought baseline loading from 0.8s to 0.3s on my machine. Pyflame looks a lot better now.
Whenever we enumerate Python entry points to load colcon extension points, we're re-parsing metadata for every Python package found on the system. Worse yet, accessing attributes on importlib.metadata.Distribution typically results in re-reading the metadata each time, so we're hitting the disk pretty hard.
We don't generally expect the entry points available to change, so we should cache that information once and parse each package's metadata a single time.
Closes #600