Continuing my profiling campaign, I checked `SmvApp.get_graph_json` for hotspots. It turns out that for large projects we spend over 80% of the time in `pkgutil.walk_packages`, which we call once per module. For a 1000-module project, `get_graph_json` takes about 90 seconds. Simply caching the result of `walk_packages` drops this to about 10 seconds. Caching should be safe because the result shouldn't change within a transaction. It may not even be necessary to cache it, though: we call it for each module to check whether the file the module should be found in actually exists, but it's not clear why we can't just try to import the module and treat an import error as meaning it doesn't exist.
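To make the caching idea concrete, here is a minimal sketch of walking the packages once and answering every per-module lookup from the cached set. `PackageIndex` and its methods are hypothetical names for illustration, not existing SMV code, and the point at which `invalidate` would be called (the start of a new transaction) is an assumption.

```python
import pkgutil


class PackageIndex(object):
    """Hypothetical helper: caches the names found by pkgutil.walk_packages."""

    def __init__(self, paths):
        # Directories to scan; assumed to be the project's module search paths.
        self._paths = list(paths)
        self._names = None

    def names(self):
        # Walk the packages only on first use; later calls reuse the result.
        if self._names is None:
            self._names = frozenset(
                name for _, name, _ in pkgutil.walk_packages(self._paths)
            )
        return self._names

    def contains(self, fqn):
        # Set lookup instead of a fresh directory walk per module.
        return fqn in self.names()

    def invalidate(self):
        # Assumed to be called at a transaction boundary so new files are seen.
        self._names = None
```

With something like this, the walk happens once per transaction instead of once per module, which is where the drop from about 90 seconds to about 10 seconds would come from.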
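The import-and-catch alternative could look roughly like the sketch below. `try_import` is a hypothetical helper name, and this is not how SMV currently resolves modules; it only illustrates skipping the existence check entirely.

```python
import importlib


def try_import(fqn):
    """Return the module named fqn, or None if importing it fails."""
    try:
        return importlib.import_module(fqn)
    except ImportError:
        # Treated as "module does not exist" for the purposes of this sketch.
        return None
```

One caveat with this approach: a module that exists but itself fails on a nested import also raises `ImportError`, so a real implementation would probably want to tell those two cases apart rather than silently reporting the module as missing.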