theochem / cgrid

C++ version of horton (2.x) grid functionality
GNU General Public License v3.0

Speed up travis by caching the conda install #8

Closed tovrstra closed 7 years ago

tovrstra commented 7 years ago

The following should be safe/needed for caching:

Not good for caching:

There may be others.

This may not be a good idea because of things not to cache. See also broadinstitute/viral-ngs#290

This is how to enable it: https://docs.travis-ci.com/user/caching/#Arbitrary-directories

Other things to consider for caching in Travis:

matt-chan commented 7 years ago

The pip cache is only for the downloads I think. I'm not sure it'll cache the install itself (it would be better if it didn't, since it's likely to interact with other packages).

I'm currently just caching all of the miniconda directory. I'll remove the offending directories in the before-cache section.

tovrstra commented 7 years ago

Pip cache is only for downloads: https://pip.pypa.io/en/latest/reference/pip_install/#caching

matt-chan commented 7 years ago

Okay, caching is implemented! It sped things up from 4min->3min in the python-cython-ci-example. Not great, but it's something I guess.

tovrstra commented 7 years ago

Good. I'll take a quick look.

tovrstra commented 7 years ago

I think the speedup is purely random. In the older runs, downloading and installing miniconda takes just a few seconds. Compare the following two jobs and look at the variation in timing of the steps that remained the same:

- https://travis-ci.org/theochem/python-cython-ci-example/jobs/270785980
- https://travis-ci.org/theochem/python-cython-ci-example/jobs/269529284

With such variability, there is no point in profiling (and caching), I'm afraid.

matt-chan commented 7 years ago

Hmm, is that the case? I thought there was a 50-second reduction in the time to install the conda packages in the install.5 and install.6 sections?

I'm not sure how the total comes to 3 minutes on the cached job, though. A quick addition of the step timings on https://travis-ci.org/theochem/python-cython-ci-example/jobs/270785980 doesn't even come to 1 minute. Are there parts which are not being profiled properly?

tovrstra commented 7 years ago

I'd guess CPU time versus wall time. Not sure.
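To illustrate the distinction being guessed at here, the following minimal sketch measures both clocks around the same call. Whether this explains the Travis numbers is not confirmed in the thread; the `measure` helper is a hypothetical name introduced only for this example.

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) spent while running func()."""
    wall_start = time.perf_counter()   # wall-clock time
    cpu_start = time.process_time()    # CPU time of this process only
    func()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start


# Sleeping (like waiting on a download) costs wall time but almost no CPU time.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall={wall:.2f}s cpu={cpu:.2f}s")
```

If the per-step timings were CPU times, steps dominated by network or disk waits would add up to far less than the job's total wall time.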

tovrstra commented 7 years ago

Can we just postpone this for a while? This is not such an issue and it quickly clobbers the CI scripts.

matt-chan commented 7 years ago

Sure. We can work on splitting the builds based on tags first.

tovrstra commented 7 years ago

@matt-chan It seems that caching is fully working! So we can close this issue I guess. There is just one small thing: the .travis.yml script is not self-contained at the moment due to one of the caching steps. (This is a source of mistakes when copying over .travis.yml files to other projects.)

matt-chan commented 7 years ago

Okay, #13 should fix this?

I'm going to leave the issue open a bit more because I want to make sure our cache isn't being invalidated by spurious changes.

tovrstra commented 7 years ago

The current .travis.yml installs pip packages into the conda env, which then gets cached. This is not ideal because they will not get updated once cached. We could do pip install --user --upgrade ... to avoid this issue. Pip caching can be made efficient as shown here: https://github.com/nickstenning/travis-pip-cache

That still has the disadvantage that wheels may accumulate over time. We have a wheel cleaning script in the HORTON repo to get rid of them: https://github.com/theochem/horton/blob/master/tools/qa/remove_old_wheels.py
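The idea of pruning accumulated wheels can be sketched as follows. This is not the HORTON script linked above, only a minimal illustration under stated assumptions: the wheels sit in one flat directory, and "old" means "not the most recently modified wheel of that distribution". The function name `remove_old_wheels` is hypothetical here.

```python
from collections import defaultdict
from pathlib import Path


def remove_old_wheels(wheel_dir):
    """Keep only the most recently modified wheel per distribution.

    Assumes a flat directory of *.whl files. Returns the removed paths.
    """
    by_dist = defaultdict(list)
    for path in Path(wheel_dir).glob("*.whl"):
        # A wheel file name starts with the distribution name, then "-".
        by_dist[path.name.split("-")[0].lower()].append(path)

    removed = []
    for paths in by_dist.values():
        paths.sort(key=lambda p: p.stat().st_mtime)
        for old in paths[:-1]:  # everything except the newest file
            old.unlink()
            removed.append(old)
    return removed
```

Such a step would run before Travis stores the cache, so that the cached directory does not grow with every new release of a dependency.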

matt-chan commented 7 years ago

It doesn't need to be installed with --user. I think running it with --upgrade should be enough. The miniconda cache will still be updated if there's a new package.

tovrstra commented 7 years ago

This is fixed.

dhimmel commented 6 years ago

I'm interested in speeding up our Travis builds that use conda environments. We specify our environment using an environment.yml, which is currently reinstalled on every build. The part of the install process that takes the longest is the "Solving package specifications" step triggered by conda env create.

I just wanted to confirm that your caching method discussed above alleviates this "Solving package specifications" step (it seems like it)? Also if so, is https://github.com/theochem/qcgrids/commit/157318c293ac05312ddc4864ad669710903c7712 the crucial commit to enable conda caching on Travis? Sorry if this is a bit off topic... this is the most relevant issue I could find!

matt-chan commented 6 years ago

Hi Daniel,

I'd actually take the .travis.yml on master, since the relevant changes were split over several commits.

The lines of interest are: 46-54 and 61-85.

The caveat is that you must manually ensure your conda environment is consistent. If you remove anything from your meta(environment).yml, you must invalidate the cache on Travis, or you must uninstall it in the Travis yml. Adding packages and updating them is unaffected.
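One way to spot the problematic case (a package dropped from the dependency list) is to compare the old and new lists by package name. The sketch below is only an illustration: it assumes the dependency specs were already loaded as plain strings such as `numpy>=1.13`, it does not handle channel prefixes, and both function names are hypothetical.

```python
import re


def package_name(spec):
    """Return the bare package name from a spec such as 'numpy>=1.13'."""
    return re.split(r"[\s=<>!~]", spec.strip(), maxsplit=1)[0].lower()


def removed_packages(old_specs, new_specs):
    """Names that appear in old_specs but no longer in new_specs."""
    new_names = {package_name(s) for s in new_specs}
    return sorted({package_name(s) for s in old_specs} - new_names)


# Version changes and additions are ignored; only removals are reported.
print(removed_packages(["numpy=1.13", "cython", "nose >=1.3"],
                       ["numpy>=1.14", "Cython", "pytest"]))
```

A non-empty result would be the signal to invalidate the Travis cache or to uninstall those packages explicitly.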

The before_cache lines will make sure you aren't rebuilding your cache (takes about a minute or two) unnecessarily (those directories change when you do a conda build).

Take care! Matt

dhimmel commented 6 years ago

Thanks @matt-chan!

> The lines of interest are: 46-54 and 61-85.

For posterity / convenience, I've pasted those snippets below:

https://github.com/theochem/qcgrids/blob/ac5d1e263b8f3fe0e659f908ab8bc849970ec24f/.travis.yml#L46-L54

https://github.com/theochem/qcgrids/blob/ac5d1e263b8f3fe0e659f908ab8bc849970ec24f/.travis.yml#L61-L85

> If you remove anything from your meta(environment).yml, you must invalidate the cache on Travis, or you must uninstall it in the Travis yml.

I wonder if there's a way to see if the changeset that's being built modified environment.yml, and if so wipe the cache. Referencing https://github.com/nest/nest-simulator/pull/75.

tovrstra commented 6 years ago

On Fri, Oct 13, 2017 at 4:56 PM Daniel Himmelstein wrote:

> I wonder if there's a way to see if the changeset that's being built modified environment.yml, and if so wipe the cache.

You can easily detect changes to a file in a PR build on Travis. Travis builds a merge of the PR into the target branch. The exit code of the following command can be used to detect a change in any given file, here environment.yml (--exit-code is needed because git diff otherwise exits with 0 whether or not there are differences):

git diff --exit-code --stat $TRAVIS_BRANCH -- environment.yml

I'm not sure what would be the best way to wipe the cache inside a .travis.yml file. You can obviously do a manual rm.
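As an alternative to the git diff check, a content hash stored inside the cached directory can decide when to wipe it. This is a different technique from the one described above, sketched under assumptions: the cached directory can be deleted and recreated safely, and the names `refresh_cache` and `.env_sha256` are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def refresh_cache(env_file, cache_dir, stamp_name=".env_sha256"):
    """Wipe cache_dir if env_file changed since the stamp was written.

    Returns True if the cache was wiped (or created), False if it was kept.
    """
    digest = hashlib.sha256(Path(env_file).read_bytes()).hexdigest()
    cache = Path(cache_dir)
    stamp = cache / stamp_name

    if stamp.is_file() and stamp.read_text().strip() == digest:
        return False  # environment file unchanged, keep the cache

    # Changed or never stamped: start again from an empty directory.
    shutil.rmtree(cache, ignore_errors=True)
    cache.mkdir(parents=True, exist_ok=True)
    stamp.write_text(digest)
    return True
```

Because the stamp file lives in the cached directory, it travels with the Travis cache, so the comparison also works on branch builds where no PR target branch is available.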