Lambkin base image doesn't cache properly

serraramiro1 commented 1 year ago

Bug description

As the title states, a change to any file in the lambkin repository invalidates the most expensive step of the build process, forcing the user to spend more than an hour rebuilding it (also related to #62). This is caused by the way we're mounting the repository for installation.

Snippet from root Dockerfile:

RUN --mount=type=bind,source=.,target=/tmp/context,readonly pyenv shell 3.9 && \
    cd /tmp/context && ansible-galaxy collection install .ansible && \
    ansible-playbook -u $USERNAME -e "repository_uri=file:///tmp/context" \
    -e "${configuration}" ${ansible_flags} ekumenlabs.lambkin.local_runner

Platform:

OS: Ubuntu focal
Python version: 3.8
lambkin version: 11312538a7ae3fa3155ca28de0fc64f7e6b6517a

How to reproduce

1- Cd to lambkin's root directory.

2- Build some base docker image.

docker buildx bake lambkin-ubuntu-focal

3- Create a random file

touch throwaway

4- Run the build command again

docker buildx bake lambkin-ubuntu-focal

Expected behavior

The build finishes immediately since it hits the cache.

Actual behavior

The build restarts from the step mentioned above.

Additional context

A possible way of tracking changes to package dependencies without tracking everything is to copy only the package.xml files from the repo, ignoring every other file that rosdep doesn't care about. @nahueespinosa for context.
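To make the idea concrete, here is a minimal sketch of staging only the manifests before handing them to the build. The function name stage_manifests and the directory layout are hypothetical and not part of lambkin; the sketch also assumes every package that matters ships a package.xml.

```shell
#!/bin/sh
# Hypothetical helper (not part of lambkin): copy only package.xml manifests
# from a source tree into a staging directory, preserving relative paths.
# The staging directory should live outside the source tree.
stage_manifests() {
    stage_src=$1
    mkdir -p "$2" || return 1
    # Resolve the destination to an absolute path before changing directory.
    stage_dest=$(cd "$2" && pwd) || return 1
    # Work in a subshell so the caller's working directory is left untouched.
    (
        cd "$stage_src" || exit 1
        find . -type f -name package.xml | while IFS= read -r manifest; do
            mkdir -p "$stage_dest/$(dirname "$manifest")"
            cp "$manifest" "$stage_dest/$manifest"
        done
    )
}
```

Calling something like stage_manifests . /tmp/manifests would leave a tree containing nothing but package.xml files, so an unrelated file such as throwaway would never reach the step that resolves dependencies.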

hidmic commented 1 year ago

A possible way of tracking changes to package dependencies without tracking everything is to copy only the package.xml files from the repo, ignoring every other file that rosdep doesn't care about.

That presumes that every package in LAMBKIN will be a ROS package. That's an unnecessary requirement. lambkin-shepherd isn't one, nor does it need to be.

The problem lies, as you point out, in the integration between docker and ansible. We use docker images as our binary distribution format. We use ansible for arbitrarily complex image provisioning. But ansible is bound to a single image layer, and since docker can't know what ansible will do with the package sources, it is forced to invalidate that image layer whenever anything changes.
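A rough model of that invalidation, as a sketch only: treat the cache key of the layer as a digest over every path and file content in the mounted context. The function context_digest is hypothetical, and cksum stands in for whatever checksum docker actually computes; the point is just that any new file changes the key.

```shell
#!/bin/sh
# Hypothetical model (not docker's real algorithm): digest every file path
# and its content under a directory, in a stable order.
context_digest() {
    (
        cd "$1" || exit 1
        find . -type f | LC_ALL=C sort | while IFS= read -r f; do
            printf '%s\n' "$f"   # the path itself is part of the key
            cat "$f"             # so is the file content
        done | cksum | cut -d' ' -f1
    )
}
```

Under this model, running context_digest on the repository before and after touch throwaway gives two different values, even though the new file is empty and ansible never reads it.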

The ideal solution would be to split every ansible task into its own image layer. That's what ansible-bender is supposed to do. Alternatively, we could drop ansible and use a different mechanism for image provisioning. I don't know of any better suited tool.

hidmic commented 1 year ago

We're bringing the big guns (aka @erasmomontes) to solve this fun containerized distribution problem.
