rlizzo closed 6 years ago
Rick, you are my favorite profiler person ever. Thanks for investigating this. I'm tied up tonight but will get back to you asap tomorrow.
Thanks Mike! looking forward to hearing your thoughts!
That definitely seems like some caching is in order.
The primary thing to be careful about is that you cannot cache based purely on the prefix, because with multiple outputs you can have the build or host prefix be the same path, but have different contents within a single run through conda-build. However, because this is sysroot, and not arbitrary prefixes, caching should be safe - it'll be searching the same files over and over.
As a first try, I'd memoize
def _find_needed_dso_in_system(m, needed_dso, errors, sysroots, msg_prelude,
                               info_prelude, warn_prelude):
Unfortunately, because not all arguments there are hashable (especially not m), you won't be able to simply use the decorator. You'll need to refactor it so that the function doing the actual work is memoizable (i.e. all arguments are hashable).
Thanks again for looking into this. No rush - if you can't get to it, please let me know and I'll try to find time soon.
Thanks for the insight! I won't have time to dig into this till early next week, but I will definitely provide an update soon!
I'd caution against looking into this. I wrote the code but my version is now significantly diverged (many bug fixes and a whitelist feature) from the master branch and released code (due to time pressure on other tasks).
I will fix it via caching next week if you can wait, otherwise, please submit any PR to my conda-build master branch instead.
Thanks for the heads up @mingwandroid!
It's not a work-stopping concern for me right now, so I'll wait until your updates get merged into the main conda-build repo and will reprofile and update here if there are any resulting side effects. Feel free to let me know if there's anything I can do to ease a bit of the time crunch for you!
Just wanted to close out this issue since it was fixed in recent versions of conda-build. I haven't done a full profile since it doesn't seem necessary, but in a quick test the post-build time has dropped from over 10 minutes to about 20 seconds. Thanks for all the hard work @msarahan and @mingwandroid!
Thanks, but we didn't do anything to speed it up... well, nothing deliberate anyway. The code changed a bit, but not in a way that should make it much faster.
Oh well. Thanks for letting us know.
@mingwandroid Not sure if it's exactly this issue, but I'm still seeing a lot of time spent in the stage where/when those warnings pop up (not if they don't).
The warnings need to be fixed (I recommend passing --error-overlinking to conda-build).
I'm surprised you see a difference between warning and no-warning. I suspect this is between two different packages so not something that can be easily investigated.
I build many big packages and the speed of the current implementation is "OK" I feel, scaling with the package size and number of C/C++ dependencies. Sure it'd be nice if it were a little faster but I won't be spending time on that soon, unless you can provide useful, clearly bad figures to back up "a lot of time".
I suspect, for example, switching from ld to gold would save a lot more time.
@mingwandroid It's this one: arb-bio
After creating the various outputs, the pyldd started taking a lot of time. By "a lot" I mean maybe 10 minutes total, so mostly just annoying.
Thanks for mentioning that it can be turned off! Should have checked that myself. I'm happy with that option as a fix.
(I'm just not really sure how to fix those itchy warnings...)
They really must be fixed.
@msarahan, with the introduction of the check-overlinking features in conda-build 3.3.0, the post-build process on macOS took a severe performance hit. For the library I maintain, it adds 10+ minutes to the build process. I have profiled conda-build as it builds a small part of the package, and have found that two functions account for 99% of the slowdown.
I am more than happy to make the appropriate changes to try to speed up the code, but before starting work I wanted to reach out and get your input on the best approach and any potential challenges.
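For anyone wanting to reproduce this kind of measurement, here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. It is not the setup used for the numbers in this issue: `slow_lookup` and `post_build` are hypothetical stand-ins for the real post-build code, and `cProfile` reports per-function totals rather than the per-line figures quoted below.

```python
import cProfile
import io
import pstats


def slow_lookup(n):
    """Hypothetical stand-in for an expensive library lookup."""
    return sum(i * i for i in range(n))


def post_build():
    """Hypothetical stand-in for the post-build step being profiled."""
    return [slow_lookup(20_000) for _ in range(50)]


def profile_report(func, top=10):
    """Run func under cProfile and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_report(post_build)
print(report)
```

The functions at the top of the cumulative-time listing are the ones worth looking at first for caching.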
Function 1
In the function _find_needed_dso_in_system, line 479 in post.py takes up 99% of the execution time.
Function 2
In the function _find_needed_dso_in_prefix, line 436 in post.py takes up 96.7% of the execution time.
Full Profile