colcon / colcon-core

Command line tool to build sets of software packages
http://colcon.readthedocs.io
Apache License 2.0

Add an explicit cache on Python entry points #614

Closed cottsay closed 4 months ago

cottsay commented 5 months ago

Whenever we enumerate Python entry points to load colcon extension points, we're re-parsing metadata for every Python package found on the system. Worse yet, accessing attributes on `importlib.metadata.Distribution` typically results in re-reading the metadata each time, so we're hitting the disk pretty hard.

We don't generally expect the entry points available to change, so we should cache that information once and parse each package's metadata a single time.
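
The shape of the fix is roughly the following (a minimal sketch; the function and variable names are illustrative, not the actual colcon-core API):

```python
# Minimal sketch of the caching idea; names here are illustrative,
# not the actual colcon-core API.
import importlib.metadata

_entry_point_cache = None  # populated once per process, on first use


def load_entry_points_once():
    """Enumerate entry points a single time and reuse the result."""
    global _entry_point_cache
    if _entry_point_cache is None:
        # This walk parses every installed package's metadata from
        # disk, so we only want to pay that cost once.
        _entry_point_cache = [
            entry_point
            for dist in importlib.metadata.distributions()
            for entry_point in dist.entry_points
        ]
    return _entry_point_cache
```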

Closes #600

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.54%. Comparing base (6cf24ea) to head (69e20f9).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #614      +/-   ##
==========================================
+ Coverage   83.34%   83.54%   +0.20%
==========================================
  Files          66       66
  Lines        3794     3816      +22
  Branches      739      745       +6
==========================================
+ Hits         3162     3188      +26
+ Misses        557      554       -3
+ Partials       75       74       -1
```

☂️ View full report in Codecov by Sentry.

cottsay commented 5 months ago

I think we can do better than this, actually.

I didn't know this, but the `importlib.metadata` API for distributions and `entry_points` does absolutely no caching at all, to the point where even accessing properties on distribution objects typically results in reading the metadata from disk every time. I did some experimenting and got my previous 0.4s down to under 0.3s by specifically caching the underlying metadata to avoid the disk reads. Some strategic structuring of that underlying data to avoid iterating over it might yield even more savings.
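
As an illustration of that idea, one option is to copy the handful of metadata fields that matter into plain Python objects during a single enumeration pass, instead of holding `Distribution` objects whose attribute accesses go back to disk (a sketch only; which fields to keep is just an example here):

```python
# Sketch of caching the underlying metadata; not the actual colcon code.
import importlib.metadata

_metadata_cache = None  # name -> plain dict of the fields we care about


def _cached_metadata():
    global _metadata_cache
    if _metadata_cache is None:
        _metadata_cache = {}
        for dist in importlib.metadata.distributions():
            # Read the metadata file once and keep only plain values;
            # later lookups never touch the Distribution object again.
            metadata = dist.metadata
            _metadata_cache[metadata['Name']] = {
                'version': metadata['Version'],
                'entry_points': list(dist.entry_points),
            }
    return _metadata_cache
```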

I didn't realize that the startup performance had regressed so badly. SSDs and OS caching hide how much I/O is actually happening here. I can imagine that cold invocations on spinning disks are brutal...

cottsay commented 5 months ago

Alright, I dropped the `lru_cache` stuff in favor of an explicit cache. This change brought baseline loading from 0.8s down to 0.3s on my machine. The Pyflame profile looks a lot better now.
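
For contrast, a rough sketch of the two approaches (illustrative only, not the actual colcon-core code): `functools.lru_cache` keys the result on the function's arguments and hides the cache inside the decorator, while an explicit cache is an ordinary module-level object the code can inspect, pre-populate, or clear deliberately.

```python
import functools
import importlib.metadata


# lru_cache variant: the cache lives inside the decorator machinery.
@functools.lru_cache(maxsize=None)
def entry_points_lru():
    return tuple(
        ep
        for dist in importlib.metadata.distributions()
        for ep in dist.entry_points
    )


# Explicit variant: the cache is plain state the module owns, so it
# is trivial to reset in tests or to pre-populate before first use.
_cache = None


def entry_points_explicit():
    global _cache
    if _cache is None:
        _cache = tuple(
            ep
            for dist in importlib.metadata.distributions()
            for ep in dist.entry_points
        )
    return _cache
```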