speed up slow build times for ./configure based projects

jonas-kaufmann commented 1 year ago

I have been investigating https://github.com/se-sic/VaRA/issues/1003. Concretely, I profiled the import time of the script benchbuild/res/wrapping/run_compiler.py.inc. I found that, at least for the VaRA-Tool-Suite, a substantial amount of time is being spent on importing plugins in the form of other projects and experiments which are not actually being used. So the main idea is to prevent that by overriding the benchbuild configuration through environment variables. Together with some other changes on VaRA's side, I managed to achieve about a 2x performance improvement in my tests.

The other thing that might give a speed improvement in the future is to use importlib.metadata to query the installed benchbuild version instead of pkg_resources. pkg_resources takes almost 100 ms on my machine to import, while importlib.metadata is negligible. https://github.com/PolyJIT/benchbuild/commit/1ae6fdbbfd660fc4fd17765987aebf9da90dc834 won't do anything at the moment though because there are at least two third-party packages that also use pkg_resources to set __version__.

In my opinion, the change to benchbuild/res/wrapping/run_compiler.py.inc should be completely transparent to the user and the necessary imports are still made when unpickling the project/compiler. @vulder @boehmseb @simbuerg What do you think about this? Am I missing some detail where this might not work in certain cases?

jonas-kaufmann commented 1 year ago

This is the import time profile after all optimizations. As you can see, most of the time on the benchbuild side is spent on importing packages related to the database. I am not too familiar with all the code but to me it seems that this can't really be reduced since the database is a core feature of benchbuild.

simbuerg commented 1 year ago

Thanks for working on this!

simbuerg commented 1 year ago

I never thought about it for long. But I think the sqlalchemy support can be removed "fairly" easily. The only real deep connection to the rest of benchbuild should be in benchbuild.utils.run.RunInfo and benchbuild.utils.actions. For a bit of perspective: sqlalchemy.migrate will go as soon as I finally get to unbreaking the github CI, because it blocks me from upgading to a newer sqlalchemy version.

The rest should work as you implemented it. Earlier versions of benchbuild relied on the standard pickle module. This relied on being able to have all necessary imports present in the unpickling python interpreter.

jonas-kaufmann commented 1 year ago

Oh, I just saw https://github.com/PolyJIT/benchbuild/pull/549. I tested again with sqlalchemy.migrate removed. This yields another nice speed-up and also means that https://github.com/PolyJIT/benchbuild/commit/1ae6fdbbfd660fc4fd17765987aebf9da90dc834 actually has the intended effect.

jonas-kaufmann commented 1 year ago

I never thought about it for long. But I think the sqlalchemy support can be removed "fairly" easily. The only real deep connection to the rest of benchbuild should be in benchbuild.utils.run.RunInfo and benchbuild.utils.actions. For a bit of perspective: sqlalchemy.migrate will go as soon as I finally get to unbreaking the github CI, because it blocks me from upgading to a newer sqlalchemy version.

I had a look again and it's definitely pretty easy to disable sqlalchemy support as well, thanks for your input. My idea is to add a switch for disabling the database in the benchbuild config. I am going to clean up my code so far and then push that into this PR as well.

jonas-kaufmann commented 1 year ago

Force pushing really wasn't the call, anyway, now the damage is done.

Adding the configuration option and the respective if guards for db/transaction code was pretty straightforward. Below is also the import time profile after all these optimizations. For the VaRA-Tool-Suite, we are now at a ~4-5x faster invocation time for the wrapped compiler.

simbuerg commented 1 year ago

Ignore CI errors with doc steps. This is the ancient package 'pheasant' being out-of-date for python 3.10.

I will replace it with sphinx in a future PR.

jonas-kaufmann commented 1 year ago

Ok. There was one last thing that was looking weird in the profile and that was the amount of time spent in benchbuild.settings to parse the config. The reason for that is that yaml.load() is slow because yaml seems to be a complex language to parse. I was able to reduce the time necessary though by switching to the C-based yaml loader and by removing unecessary loads in init_from_env().

This is the current state now. I don't see anything else weirdly taking a lot of time in the profile anymore. If you don't have any more ideas either, we can merge this from my side.

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 20.87% and project coverage change: -2.70 :warning:

Comparison is base (480223c) 52.71% compared to head (5f21147) 50.01%.

:exclamation: Current head 5f21147 differs from pull request most recent head eabd76a. Consider uploading reports for the commit eabd76a to get more accurate results

Additional details and impacted files

```diff @@ Coverage Diff @@ ## master #555 +/- ## ========================================== - Coverage 52.71% 50.01% -2.70% ========================================== Files 120 120 Lines 8265 8305 +40 Branches 1038 1063 +25 ========================================== - Hits 4357 4154 -203 - Misses 3716 3960 +244 + Partials 192 191 -1 ``` | [Impacted Files](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT) | Coverage Δ | | |---|---|---| | [benchbuild/project.py](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT#diff-YmVuY2hidWlsZC9wcm9qZWN0LnB5) | `56.20% <0.00%> (-0.55%)` | :arrow_down: | | [benchbuild/settings.py](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT#diff-YmVuY2hidWlsZC9zZXR0aW5ncy5weQ==) | `100.00% <ø> (ø)` | | | [benchbuild/utils/actions.py](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT#diff-YmVuY2hidWlsZC91dGlscy9hY3Rpb25zLnB5) | `51.06% <0.00%> (-4.98%)` | :arrow_down: | | [benchbuild/utils/db.py](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT#diff-YmVuY2hidWlsZC91dGlscy9kYi5weQ==) | `15.95% <0.00%> (-44.69%)` | :arrow_down: | | [benchbuild/utils/revision\_ranges.py](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT#diff-YmVuY2hidWlsZC91dGlscy9yZXZpc2lvbl9yYW5nZXMucHk=) | `37.59% <0.00%> (-1.34%)` | :arrow_down: | | [benchbuild/utils/run.py](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT#diff-YmVuY2hidWlsZC91dGlscy9ydW4ucHk=) | `29.31% <0.00%> (-1.29%)` | :arrow_down: | | [benchbuild/extensions/compiler.py](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT#diff-YmVuY2hidWlsZC9leHRlbnNpb25zL2NvbXBpbGVyLnB5) | `40.00% <25.00%> (+0.52%)` | :arrow_up: | | [benchbuild/extensions/run.py](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT#diff-YmVuY2hidWlsZC9leHRlbnNpb25zL3J1bi5weQ==) | `50.00% <33.33%> (+0.98%)` | :arrow_up: | | [benchbuild/extensions/time.py](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT#diff-YmVuY2hidWlsZC9leHRlbnNpb25zL3RpbWUucHk=) | `35.29% <33.33%> (-0.19%)` | :arrow_down: | | [benchbuild/utils/wrapping.py](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT#diff-YmVuY2hidWlsZC91dGlscy93cmFwcGluZy5weQ==) | `23.85% <33.33%> (-1.84%)` | :arrow_down: | | ... and [3 more](https://codecov.io/gh/PolyJIT/benchbuild/pull/555?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT) | | ... and [2 files with indirect coverage changes](https://codecov.io/gh/PolyJIT/benchbuild/pull/555/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT) Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=PolyJIT)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

PolyJIT / benchbuild

speed up slow build times for ./configure based projects #555

Codecov Report