ros-infrastructure / ros_buildfarm

ROS buildfarm based on Docker
Apache License 2.0

Use job-specific ccache directory. #966

Open nuclearsandwich opened 2 years ago

nuclearsandwich commented 2 years ago

Sharing ccache between builds via docker volume mounting was made optional and off by default in #844.

Part of the reasoning behind that was the possibility of deleterious interactions between different jobs via the host's ccache directory.

Long term, I'd like to explore something like sccache to enable caching across multiple hosts, but in the current state that would only widen the blast radius.

I've started with devel jobs, which, along with CI jobs, are most affected by the removal of a shared ccache, since each job builds twice: once to install and once to test. If this reviews well, I'll figure out the best way to extend it to the other job types.

Using a job-specific ccache directory would allow us to re-enable ccache for builds, since each job could only affect its own ccache directory.
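To make the isolation idea concrete, here is a minimal sketch of how per-job `docker run` arguments could be assembled. It is not the code in this PR: the function name, the host cache root, the job name, and the container-side path are all assumptions for illustration. Only the `-v` and `-e` flags of `docker run` and ccache's `CCACHE_DIR` environment variable are taken as given.

```python
from pathlib import PurePosixPath


def ccache_docker_args(host_cache_root, job_name,
                       container_dir="/home/buildfarm/.ccache"):
    """Return `docker run` arguments that mount a per-job ccache directory.

    `host_cache_root`, `job_name` and `container_dir` are hypothetical
    inputs; the real buildfarm templates may name and place these differently.
    """
    # Each job gets its own subdirectory, so one job's cache contents
    # can never be read or corrupted by another job.
    host_dir = PurePosixPath(host_cache_root) / job_name
    return [
        "-v", f"{host_dir}:{container_dir}",
        # CCACHE_DIR tells ccache where to keep its cache inside the container.
        "-e", f"CCACHE_DIR={container_dir}",
    ]


args = ccache_docker_args("/var/cache/ros_buildfarm/ccache", "example_devel_job")
print(" ".join(args))
```

Because the job name is part of the host path, two jobs running on the same agent never share a mount, which is the property that would make it safe to turn ccache back on.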


As an aside, I'd really like to find a way to improve the readability of the shell snippets. Joining an array of strings with newlines is a cute solution to keep things inline, but the mix of quoting, escaping, and conditional logic with empy templates and Python strings is courting disaster. I'd like to take the pulse on a refactor which either moves the script contents into f-strings (now that our oldest supported platform, Focal, supports them) or into separate empy files entirely.
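For readers unfamiliar with the two styles, the following sketch contrasts a list of strings joined with newlines against an f-string passed through `textwrap.dedent`. The shell lines and function names are invented for illustration and are not copied from the buildfarm templates.

```python
import textwrap


def script_from_list(ccache_dir):
    """Joined-list style: every shell line is its own quoted Python string."""
    return "\n".join([
        "set -e",
        'mkdir -p "%s"' % ccache_dir,
        'export CCACHE_DIR="%s"' % ccache_dir,
    ])


def script_from_fstring(ccache_dir):
    """f-string style: the script reads like shell, with one level of quoting."""
    return textwrap.dedent(f"""\
        set -e
        mkdir -p "{ccache_dir}"
        export CCACHE_DIR="{ccache_dir}"
        """).rstrip("\n")


# Both helpers produce the same script text for a hypothetical directory.
example_dir = "/tmp/ccache/example_devel_job"
print(script_from_fstring(example_dir))
print(script_from_list(example_dir) == script_from_fstring(example_dir))
```

The f-string variant still needs care when the interpolated value can contain quotes or shell metacharacters; `shlex.quote` from the standard library is one way to handle that.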


As a further aside, I'd rather not use subdirectories of ~/.ccache: on a non-dedicated host (such as my local workstation), that conflicts with the default use of that directory. I'll propose $XDG_CACHE_HOME/ros_buildfarm/ccache/$job_name instead.


Using rclcpp on my local workstation as a case study:

Build 1

build-and-install colcon build step: 1min 22s

build-and-test colcon build step: 7min 24s

Build 2 (using the same ccache directory)

build-and-install colcon build step: 6.07s

build-and-test colcon build step: 40.4s
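As a quick check on what those numbers mean, the speedup of the warm-cache build over the cold-cache build can be computed directly from the four timings above. The snippet uses only the reported figures; nothing else is measured.

```python
# (cold build seconds, warm build seconds) taken from the timings above.
timings = {
    "build-and-install": (1 * 60 + 22, 6.07),
    "build-and-test": (7 * 60 + 24, 40.4),
}

# Speedup is cold time divided by warm time for each colcon build step.
speedups = {step: cold / warm for step, (cold, warm) in timings.items()}

for step, factor in speedups.items():
    print(f"{step}: {factor:.1f}x faster with a warm ccache")
# build-and-install: 13.5x faster with a warm ccache
# build-and-test: 11.0x faster with a warm ccache
```

This is a single run on one workstation, so the ratios are indicative rather than a benchmark.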

nuclearsandwich commented 2 years ago

> that goes to effectively ~/.cache if not overridden?

Yeah, from the spec:

> $XDG_CACHE_HOME defines the base directory relative to which user-specific non-essential data files should be stored. If $XDG_CACHE_HOME is either not set or empty, a default equal to $HOME/.cache should be used.

jlblancoc commented 2 years ago

Any chances of getting this change approved? :-) It would be a much better solution to build timeouts than https://github.com/ros-infrastructure/ros_buildfarm_config/pull/213

nuclearsandwich commented 2 years ago

> Any chances of getting this change approved? :-)

I don't really think it's ready for review as-is. The discussion ended on making use of XDG_CACHE_HOME, but I have not implemented that yet. I'm caught up trying to find the best place to set the ccache directory on the host and to handle all of the edge cases, such as the directory not yet existing or XDG_CACHE_HOME not being set, and it has turned into a fair bit of work.
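The cases mentioned here (XDG_CACHE_HOME unset or empty, directory not yet existing) can be sketched in a few lines of Python. This is only an illustration of the proposed `$XDG_CACHE_HOME/ros_buildfarm/ccache/$job_name` layout, not the implementation in this PR; the function name and the `environ` parameter are hypothetical, and where such a helper would live in the buildfarm is exactly the open question.

```python
import os
import tempfile
from pathlib import Path


def job_ccache_dir(job_name, environ=None):
    """Resolve and create a per-job ccache directory (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # Per the XDG Base Directory spec, an unset or empty XDG_CACHE_HOME
    # falls back to $HOME/.cache.
    base = env.get("XDG_CACHE_HOME")
    if not base:
        base = os.path.join(env.get("HOME") or os.path.expanduser("~"), ".cache")
    path = Path(base) / "ros_buildfarm" / "ccache" / job_name
    # Create the directory on first use; do nothing if it already exists.
    path.mkdir(parents=True, exist_ok=True)
    return path


# Demonstration against a throwaway HOME so nothing is written to the real one.
with tempfile.TemporaryDirectory() as fake_home:
    unset = job_ccache_dir("example_devel_job", {"HOME": fake_home})
    overridden = job_ccache_dir(
        "example_devel_job",
        {"HOME": fake_home, "XDG_CACHE_HOME": os.path.join(fake_home, "xdg")},
    )
    print(unset.relative_to(fake_home))
    print(overridden.relative_to(fake_home))
```

Passing the environment in as a parameter keeps the helper testable without touching the real host configuration. The sketch does not cover every rule in the spec, for example it does not reject a relative XDG_CACHE_HOME.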