Closed PeterBowman closed 6 years ago
Another example of caching (CMake + YARP): roboticslab-uc3m/project-generator.
BTW those larger blocks of YAML code (example) could be better placed as standalone shell scripts in scripts/travis/.
Build times have been cut in half thanks to https://github.com/roboticslab-uc3m/kinematics-dynamics/compare/9bc689c...e21895c (cache YARP + KDL deps). Two issues arise:
LD_LIBRARY_PATH and YARP_DATA_DIRS must be correctly set so that YARP's module loader finds all necessary devices at runtime.
The dlopen function does not cope well with this setup. I had to compile YARP with ACE in order to solve that, see https://github.com/roboticslab-uc3m/installation-guides/issues/13#issuecomment-374569031.
Less sudos led me to think about container-based build on Travis. Sadly, libace-dev has not been whitelisted (yet?), see https://github.com/roboticslab-uc3m/installation-guides/issues/13#issuecomment-374571984.
I'd like to solve https://github.com/roboticslab-uc3m/installation-guides/issues/13 first, but not marking as blocked until further investigation.
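The first issue (exporting both variables once the cached install tree is restored) could look roughly like the sketch below. The cache prefix and the helper name prepend_path are hypothetical, and the share/yarp location is an assumption that should be checked against the actual install tree.

```shell
#!/usr/bin/env bash
# Hypothetical install prefix of the cached YARP build (not taken from this thread).
YARP_CACHE_PREFIX="${YARP_CACHE_PREFIX:-$HOME/cache/yarp}"

# Prepend a directory to a colon-separated path variable, skipping duplicates.
prepend_path() {
  local var_name="$1" dir="$2"
  local current="${!var_name:-}"   # current value of the variable named by $1
  case ":$current:" in
    *":$dir:"*) ;;                 # already listed, nothing to do
    *) export "$var_name=$dir${current:+:$current}" ;;
  esac
}

prepend_path LD_LIBRARY_PATH "$YARP_CACHE_PREFIX/lib"
# Assumption: YARP data files live under <prefix>/share/yarp.
prepend_path YARP_DATA_DIRS "$YARP_CACHE_PREFIX/share/yarp"
```

Skipping duplicates keeps the variables from growing if the script is sourced more than once in the same job.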
As spoken with @jgvictores, let's add another shell script for use by Travis that would extract/deduce the time of creation of the latest cache, then invalidate and regenerate it if older than, say, one week. A new cron job launched once per week would perform this sequence again so that we don't need to wait for the next push build (which may or may not happen in a long time). Also, let's clone and store latest YARP master in this cache instead of a specific release.
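A minimal sketch of the age check, assuming the creation time is recorded in a stamp file inside the cache directory (the .created file name and both function names are hypothetical):

```shell
#!/usr/bin/env bash
# Returns 0 (stale) when the stamp file is missing or older than max_age_days.
cache_is_stale() {
  local stamp_file="$1" max_age_days="${2:-7}"
  [ -f "$stamp_file" ] || return 0
  local now created age_days
  now="$(date +%s)"
  created="$(cat "$stamp_file")"
  age_days=$(( (now - created) / 86400 ))
  [ "$age_days" -ge "$max_age_days" ]
}

# Wipe the cache directory and record the creation time of the new one.
reset_cache() {
  local cache_dir="$1"
  rm -rf "${cache_dir:?}"          # refuse to run with an empty path
  mkdir -p "$cache_dir"
  date +%s > "$cache_dir/.created"
}

# Intended usage (commented out so that sourcing this file has no side effects):
# if cache_is_stale "$HOME/cache/deps/.created" 7; then
#   reset_cache "$HOME/cache/deps"
#   # ...rebuild dependencies into the cache here...
# fi
```

Storing the timestamp explicitly avoids relying on file modification times, which may change when the cache archive is unpacked.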
Regarding dependencies, caching and containerization: https://github.com/robotology/yarp/issues/1625.
I had to compile YARP with ACE in order to solve that
Less sudos led me to think about container-based build on Travis. Sadly, libace-dev has not been whitelisted (yet?)
No more ACE, and the only relevant sudo left in the whole YAML file is this one. See #50.
let's add another shell script for use by Travis that would extract/deduce the time of creation of the latest cache, then invalidate and regenerate it if older than, say, one week
Alternatively, check the latest remote commit against a cached copy, then only invalidate and regenerate the cache if the cached revision is not the latest one. This would need a clone operation (the other option involves an SSH connection per this SO answer), but it could improve times significantly when working with non-tagged dependencies such as OpenRAVE.
Edit: not so straightforward with GitHub, requires adding SSH keys.
Edit 2 (note to self): clone only specific branches (e.g. master and devel, specified in the /scripts/cache-<pkg>.sh) with --depth=1, don't invalidate cache if same revision as the one stored. Extensible to our own repos, e.g. yarp-devices! Keep using wget with tagged releases (e.g. v2.3.72.1).
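The revision check from the note above might be sketched as follows. This variant reads the remote branch tip with git ls-remote instead of a full clone, which is one possible approach and not necessarily what the actual script does; the function names and the revision marker file are hypothetical.

```shell
#!/usr/bin/env bash
# Print the commit hash a remote branch currently points to.
remote_head() {
  local repo_url="$1" branch="$2"
  git ls-remote "$repo_url" "refs/heads/$branch" | cut -f1
}

# Returns 0 when the cached revision differs from the remote one (or is missing).
# rev_file is a hypothetical marker stored next to the cached build.
cache_outdated() {
  local repo_url="$1" branch="$2" rev_file="$3"
  local latest
  latest="$(remote_head "$repo_url" "$branch")"
  [ -n "$latest" ] || return 0     # remote not reachable: treat as outdated
  [ ! -f "$rev_file" ] || [ "$(cat "$rev_file")" != "$latest" ]
}

# Shallow-clone a single branch and remember which revision was fetched.
refresh_clone() {
  local repo_url="$1" branch="$2" dest="$3" rev_file="$4"
  rm -rf "${dest:?}"
  git clone --quiet --depth=1 --branch "$branch" "$repo_url" "$dest"
  git -C "$dest" rev-parse HEAD > "$rev_file"
}
```

A job would call cache_outdated first and only run refresh_clone (followed by a rebuild) when it succeeds, so an unchanged branch costs a single lightweight request.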
Notes:
matrix.include is, apparently, equivalent to jobs.include (blog entry, docs).
    matrix:
      include:
        - if: type = cron
          compiler: gcc
          env: ROBOTOLOGY_CHECKOUT=devel
        - if: type = cron
          compiler: clang
          env: ROBOTOLOGY_CHECKOUT=devel
instead of (will only run one job, assuming there is a compiler: line somewhere else):
    matrix:
      include:
        - if: type = cron
          env: ROBOTOLOGY_CHECKOUT=devel
TL;DR: The compiler attribute is not recognized by the conditions framework (ref).
Conditions and matrix.exclude don't cope well together.
Per https://blog.travis-ci.com/2017-09-12-build-stages-order-and-conditions:
Jobs need to be listed explicitly, i.e. using jobs.include or matrix.include (alias), in order to specify conditions for them. Jobs created via matrix expansion currently cannot have conditions.
I have refactored the caching script (now it's a single .sh file that accepts several options via CLI) and applied it to all kin-dyn's dependencies at https://github.com/roboticslab-uc3m/kinematics-dynamics/commit/fe405d3c0682a01087a93f220f87eb6e24edfc60.
Notes:
Still WIP and Travis builds are failing (ref).
Cron jobs will test YARP's devel branch in addition to what the usual push/pr/api jobs do. Since YARP's master and release jobs are already covered, I'd rather omit them on cron, but it's not that simple (see previous comments).
Said script uses wget to retrieve archived releases, e.g. YARP's v2.3.70.2 tag. I don't know how to make it work with the SSH protocol. SSH is currently used only for interaction with the amor-api repository, but we are fine as long as we pull latest develop (perhaps master in the future AMOR v1.0.0 API) only.
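The split between archived releases (wget) and branches (shallow clone) mentioned in these notes could be sketched like this. The function names are hypothetical, the tag pattern is an assumption based on the tags quoted in this thread, and the actual script's CLI options are not reproduced here.

```shell
#!/usr/bin/env bash
# Build the GitHub archive URL for a tag or branch of owner/repo.
archive_url() {
  local slug="$1" ref="$2"
  echo "https://github.com/$slug/archive/$ref.tar.gz"
}

# Decide how to fetch a ref: tagged releases (v2.3.70.2 and similar) as
# archives, anything else as a shallow clone.
fetch_method() {
  local ref="$1"
  if [[ "$ref" =~ ^v[0-9]+(\.[0-9]+)*$ ]]; then
    echo "wget"
  else
    echo "clone"
  fi
}

# Download and unpack (tags) or shallow-clone (branches) into dest.
fetch_source() {
  local slug="$1" ref="$2" dest="$3"
  mkdir -p "$dest"
  if [ "$(fetch_method "$ref")" = "wget" ]; then
    wget -q -O - "$(archive_url "$slug" "$ref")" | tar -xz -C "$dest" --strip-components=1
  else
    git clone --depth=1 --branch "$ref" "https://github.com/$slug" "$dest"
  fi
}
```

For example, fetch_source robotology/yarp v2.3.70.2 "$HOME/yarp-src" would take the archive path, while passing devel as the ref would trigger a shallow clone.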
Since YARP's master and release jobs are already covered, I'd rather omit them on cron, but it's not that simple (see previous comments).
In fact, YARP's master branch builds would make sense once per week, so keep them. Perhaps I could configure Travis in such a way that release builds (e.g. YARP 2.3.70.2, 2.3.72.1, etc.) are added with matrix.include and if: type != cron, thus they wouldn't be considered for current weekly cron builds.
Edit: however, other dependencies (e.g. KDL, OpenRAVE, our RL projects...) might respond differently to distinct YARP releases. It would make sense to test everything on weekly cron builds, then.
Idea: move Travis-related bash scripts (in scripts/travis/) to a separate repo (perhaps even a gist?) and clone/download it on every build.
Remark: ccache (official page) does not seem to improve build times. In fact, they are significantly worse on the first (cold) run, and perhaps a bit longer than usual on successive (warm) runs.
Some updates on this:
Builds have been fixed and caching works like a charm: mean build times have been reduced from ca. 7m40s to ~2m, overall times from ~15m to ~4m, and aggregated times from ~45m to ~13m: no cache, with cache.
Each job generates and restores its own cache storage, which means that the job YARP_CHECKOUT=master+gcc does not "see" the cache of YARP_CHECKOUT=v2.3.70.2+clang, for instance.
The previous point means that common caches (e.g. latest KDL-master builds on gcc) are not recycled across similar jobs. Instead, respective copies will be stored in the global (per branch) cache and regenerated several times per job on any upstream change.
Cache sizes will eventually grow larger with every dropped YARP release. That is, we support YARP 2.3.70.2 today, and will enhance the build matrix with new YARP 3.x releases. However, after removing the YARP 2.3.70.2 builds in the future, the corresponding cache directories will probably persist. Such cases should be handled manually by invalidating/deleting the whole cache.
Idea: move Travis-related bash scripts (in scripts/travis/) to a separate repo (perhaps even a gist?) and clone/download it on every build.
Alternatively, use git submodules that point to a branch (usually, they point at a specific commit): https://www.activestate.com/blog/2014/05/getting-git-submodule-track-branch.
See also https://git-scm.com/docs/gitmodules.
In recent versions of Git, specific branches can be tracked - there is a convenient branch property in .gitmodules for such purpose. With the --remote option to git submodule update, Git will fetch the latest commit of said branch and check it out in the submodule (or merge it into the current submodule tree if --merge is given; see docs regarding the update command). Still, the hosting repository will track a specific commit instead, which is what I wanted to avoid or circumvent. My original intention was to somehow decouple the revision history of this internal repository and let Travis fetch whatever commit was last at that time. It doesn't make much sense on second thought, though.
On the other hand, gists can be either cloned (full history) or downloaded (specific or latest commit). However, they must live in a user account.
Edit: check my submodule experiments at https://github.com/roboticslab-uc3m/kinematics-dynamics/commit/f7f22f2006430b7451ca320d14d075bf1c364648 (Travis CI).
Looking at https://github.com/roboticslab-uc3m/kinematics-dynamics/commit/f7f22f2006430b7451ca320d14d075bf1c364648 and not fully understanding everything... I guess https://github.com/PeterBowman/travis-scripts is something that existed at some time?
Yes, I just moved the contents of scripts/travis/ in there. The tests proved that Travis will not update the HEAD commit the submodule points to unless changed manually, which I hoped to avoid by setting a tracking branch in the .gitmodules file.
Let's speak about this f2f tomorrow.
As spoken with @jgvictores, and as a better alternative to using Gists, we could move those scripts to a separate repo (just as I did) and treat it as another hard dependency. Then, Travis would process it in the before_install section (either git clone or wget a branch).
Since this is not a priority, (harmless) duplicates of said scripts will be created for now.
For future reference (on using set -e in shell scripts sourced by Travis): https://github.com/roboticslab-uc3m/tools/commit/2f0fddb96202a20feadc9546e8916e86de9dea34.
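One reason set -e needs care in this situation: a sourced script runs inside the calling shell, so the option stays enabled after the script returns. A small demonstration follows; the temporary script is purely illustrative and is not taken from the linked commit.

```shell
#!/usr/bin/env bash
# Illustrative script whose only content is `set -e`.
demo_script="$(mktemp)"
printf 'set -e\n' > "$demo_script"

# Sourced: the option is applied to the shell doing the sourcing,
# so the letter "e" shows up in its option flags ($-).
sourced_flags="$(bash -c 'source "$1"; echo "$-"' _ "$demo_script")"

# Executed as a child process: the caller's options stay untouched.
executed_flags="$(bash -c 'bash "$1"; echo "$-"' _ "$demo_script")"

rm -f "$demo_script"
echo "flags after sourcing:  $sourced_flags"
echo "flags after executing: $executed_flags"
```

In the sourced case any later failing command would abort the calling shell, which is why such scripts are usually either executed as a child process or written to restore the option before returning.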
@jgvictores would you like to have this enabled on xgnitive, too? (30 minutes per job ATOW)
That would be very very nice!
@jgvictores would you like to have this enabled on xgnitive, too? (30 minutes per job ATOW)
Done at https://github.com/roboticslab-uc3m/xgnitive/commit/8c2f26c9724e16a3b159816c583582f6b6a17ed5. BTW the Find<pkg>.cmake scripts were mostly broken and I had to update the installation guides, too: https://github.com/roboticslab-uc3m/installation-guides/commit/1fa60070041afefb183db710a634be2f0bbfe59a. I'm applying the same policy as everywhere else (testing against YARP 2.3.70/2.3.72/master + YARP devel on weekly cron jobs), please adjust it to your needs.
Done on all Travis-enabled repos except gait. Final notes:
[...] scripts/travis/ will do good.
[...] scripts/admin/ (similarly to the doxygen one) might invalidate the cache for said branches periodically.
As mentioned in earlier comments, if support is dropped for previous dependency releases, their associated cache directory will remain on server disks. Therefore, the entire cache should be invalidated (e.g. we no longer want to perform tests on YARP 2.3.70, so we remove that line from .travis.yml). I wonder if Travis itself invalidates the storage from time to time. Otherwise, another cron script could take care of this (see previous point).
The cache is invalidated after 28 days (ref):
Cache archives are currently set to expire after 28 days for open source projects and 45 days for private projects. This means a specific cache archive will be deleted if it wasn’t changed after its expiration delay.
Anyway, that policy is enforced for dangling/old branches, I'm not sure whether the cache files for an active branch (say, develop) would be deleted if not changed - even if used regularly (cron jobs). Manual invalidation is still recommended in the aforementioned case, see https://github.com/roboticslab-uc3m/questions-and-answers/issues/83.
Thanks to #17, weekly cron jobs automatically test latest commits on production against YARP's devel branch. Currently, YARP needs to be cloned and built on every push action, and so do other dependencies (e.g. KDL, PCL, OpenRAVE).
Proposal: for non-cron jobs (perhaps cron too if it makes sense), store YARP (and other dependencies if present) in the Travis cache (docs) in order to speed things up. It's preferable to hold a global variable with the targeted version so that it can be easily bumped in the YAML file with every new release. Speaking of YARP, current master is tagged at 2.3.72.
Caching has been extensively tested on https://github.com/asrob-uc3m/robotDevastation, see .travis.yml and latest builds.
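The proposal (one version variable plus a cached install tree per version) could be sketched as below. The variable names, the cache layout, and the ensure_cached helper are hypothetical, and the build step is left to a caller-supplied function.

```shell
#!/usr/bin/env bash
# Targeted release, kept in one place so it can be bumped easily.
YARP_VER="${YARP_VER:-2.3.72}"
# Hypothetical directory that the CI service is told to cache.
CACHE_ROOT="${CACHE_ROOT:-$HOME/cache}"

# Run build_fn only when no install tree for name/version exists in the cache.
# build_fn receives the install prefix as its only argument.
ensure_cached() {
  local name="$1" version="$2" build_fn="$3"
  local prefix="$CACHE_ROOT/$name-$version"
  if [ -d "$prefix" ]; then
    echo "Using cached $name $version from $prefix"
  else
    echo "Building $name $version into $prefix"
    "$build_fn" "$prefix"
  fi
}
```

Because the version is part of the directory name, bumping YARP_VER automatically triggers a fresh build on the next run while older trees are simply left unused.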