autowarefoundation / autoware

Autoware - the world's leading open-source software project for autonomous driving
https://www.autoware.org/
Apache License 2.0
8.58k stars 2.88k forks source link

feat(ci,docker): change `vcs import` to `COPY src` #4738

Closed youtalk closed 1 month ago

youtalk commented 1 month ago

Description

This PR modifies the process by running vcs import on outside the Dockerfile and then copying the src to the Dockerfile. This change resolves the issues https://github.com/autowarefoundation/autoware/pull/4730#issuecomment-2116788839 and ensures that docker build is correctly rerun when the source code is modified.

I checked the Autoware Documentation and found that no documentation changes were necessary due to this modification. https://autowarefoundation.github.io/autoware-documentation/main/installation/autoware/docker-installation/

Tests performed

Not applicable.

Effects on system behavior

Not applicable.

Interface changes

Pre-review checklist for the PR author

The PR author must check the checkboxes below when creating the PR.

In-review checklist for the PR reviewers

The PR reviewers must check the checkboxes below before approval.

Post-review checklist for the PR author

The PR author must check the checkboxes below before merging.

After all checkboxes are checked, anyone who has write access can merge the PR.

youtalk commented 1 month ago

@mitsudome-r @oguzkaganozt @xmfcx Please review this.

oguzkaganozt commented 1 month ago

Also after merging this PR, we would need to test docker-build-and-push with this PR's branch then we can actually test out the build mechanism with this new caching. Otherwise both are looking great thanks @youtalk !

youtalk commented 1 month ago

@oguzkaganozt Thank you for your review! I'm checking docker-build-and-push https://github.com/autowarefoundation/autoware/actions/runs/9174270335.

youtalk commented 1 month ago

no-cuda was success but cuda was failed.

https://github.com/autowarefoundation/autoware/actions/runs/9174270335/job/25224777460#step:7:16258

ERROR: failed to solve: failed to copy to tar: rpc error: code = Unknown desc = write /tmp/devel.tar: no space left on device

I think it was probably exceeding the disk size when saving the container image as a tar file. I'll consider countermeasures.

youtalk commented 1 month ago

I hope the problem would be fixed. https://github.com/autowarefoundation/autoware/pull/4738/commits/6a9398d3cf43100aea729c4a3eed61147ecbf9ed I'm checking docker-build-and-push again. https://github.com/autowarefoundation/autoware/actions/runs/9183845791

@oguzkaganozt By the way, why is the Upload Artifact step necessary to verify the docker build? It seems sufficient if Build and Save Artifacts is successful. Since more than 10 GB needs to be uploaded, it also seems to be taking time to transfer.

doganulus commented 1 month ago

LGTM, but we need to change the documentation of building containers from scratch section because the user would need to dovcs import before building the containers after merging this PR.

This is exactly why you should not merge this PR. The whole idea of containers is to reduce dependencies to streamline build and deployment. The user must be able to build using standard container tooling only.

Please do not introduce more technical debt for nothing.

mitsudome-r commented 1 month ago

@oguzkaganozt

Please do not introduce more technical debt for nothing.

The key motivation behind this PR is that @youtalk is currently seeking for a method to fasten Docker build process so that we can utilize docker images for developing latest features of Autoware. He's trying to use docker build cache, but since it won't detect the change in the code retrieved from vcs import, he proposed to introduce this workaround.

Therefore, there is a trade-off between docker build time and complexity of docker build procedure at the moment. If we just think vcs import as steps for downloading Autoware source code, I don't think it's that unnatural to ask users to download the necessary codes before running docker build.

Alternatively, we could use Git submodule to simplify the command itself, but that would make a too big of a impact at this stage compared to the benefits we get.

@doganulus If you can think of any good approach, could you share us your idea?

doganulus commented 1 month ago

Hi @mitsudome-r, I have shared a snippet of the solution here. It might be missed in a merged PR: https://github.com/autowarefoundation/autoware/pull/4712#issuecomment-2121881833

Regarding this comment:

Therefore, there is a trade-off between docker build time and complexity of docker build procedure at the moment. If we just think vcs import as steps for downloading Autoware source code, I don't think it's that unnatural to ask users to download the necessary codes before running docker build.

I think otherwise. Autoware should not ask users to install some packages too easily. As a user and developer, I don't want Autoware to mess with my local environment. Hence, the development environment must stay solely inside containers. This PR goes the opposite. That's the whole point.

I oppose setup-dev-env.sh or similar opaque scripts for the same reason. Please evaluate twice when asking users to install things. The same goes for package dependencies. I hope you see what bothers me but it requires an organization-wide understanding rather than individual efforts. I would appreciate it if Autoware became more strict about these matters.

oguzkaganozt commented 1 month ago

I hope the problem would be fixed. 6a9398d I'm checking docker-build-and-push again. https://github.com/autowarefoundation/autoware/actions/runs/9183845791

@oguzkaganozt By the way, why is the Upload Artifact step necessary to verify the docker build? It seems sufficient if Build and Save Artifacts is successful. Since more than 10 GB needs to be uploaded, it also seems to be taking time to transfer.

Uploading of images is necessary because you can test out the produced images without pushing them into the registry. Also it will be the part of whole CI-CD loop in the future by uploading the images into scenario cloud then from there we would test those images with pre-defined scenario simulations.

oguzkaganozt commented 1 month ago

@youtalk can rebase your branch with the latest main branch ?

oguzkaganozt commented 1 month ago

Hi @mitsudome-r, I have shared a snippet of the solution here. It might be missed in a merged PR: #4712 (comment)

Regarding this comment:

Therefore, there is a trade-off between docker build time and complexity of docker build procedure at the moment. If we just think vcs import as steps for downloading Autoware source code, I don't think it's that unnatural to ask users to download the necessary codes before running docker build.

I think otherwise. Autoware should not ask users to install some packages too easily. As a user and developer, I don't want Autoware to mess with my local environment. Hence, the development environment must stay solely inside containers. This PR goes the opposite. That's the whole point.

I oppose setup-dev-env.sh or similar opaque scripts for the same reason. Please evaluate twice when asking users to install things. The same goes for package dependencies. I hope you see what bothers me but it requires an organization-wide understanding rather than individual efforts. I would appreciate it if Autoware became more strict about these matters.

@doganulus the thing is, as we are importing all of the source files with an opaque vcs import then doing rosdep install on top of the imported src, that means that there is no direct way of achieving actual docker-caching mechanism if we keep vcs import inside the dockerfile.

So with the new src-imported layer we now have seperate layer just forvcs-import, but the source changes can't be detected again, so now @youtalk completely excluded the vcs-import and copy source files as directory which effectively enables docker-caching.

I don't really know if this solution of yours would still detect source changes, so if you can create another branch with your solution then we can test out and see if excluding vcs import really needed ?

doganulus commented 1 month ago

if you can create another branch with your solution then we can test out and see if excluding vcs import really needed ?

Please describe how you see a user or a developer to use these containers. How frequently? Then we can understand the thing you try to improve makes sense or not.

What would you need to produce after these efforts? I think these points are not clear.

xmfcx commented 1 month ago

docker image expectations

@doganulus in the first paragraph here you can see the use cases:

  1. The devel image enables you to develop Autoware without setting up the local development environment.
  2. The runtime image contains only runtime executables and enables you to try out Autoware quickly.

I would assume, the devel image workflow should include cloning the autoware repositories and users should be expected to mount the autoware directory. This is to have the changes persistent.

vcstool on the host machine

To me, vcs tool is an alternative to git submodules. Also relatively easy to install: pip install --no-cache-dir vcstool so I see no issues having this much on the host side. git, docker, vcstool

I'm not good at docker caching mechanisms so I cannot comment on if this is the best approach.

At least, it was the 2nd thing that was recommended by chatgpt, I didn't like its first proposal.

What about just changing that run step?

Instead of moving vcs out, would this approach work @youtalk -san?

RUN --mount=type=ssh \
  rm -rf src \
  && mkdir src \
  && vcs import src < autoware.repos \
  && apt-get update \
  && rosdep update \
  && DEBIAN_FRONTEND=noninteractive rosdep install -y --ignore-src --from-paths src --rosdistro "$ROS_DISTRO" \
  && apt-get autoremove -y && apt-get clean -y && rm -rf /var/lib/apt/lists/* "$HOME"/.cache

Unrelated topic

image

@oguzkaganozt I was looking into https://autowarefoundation.github.io/autoware-documentation/main/installation/autoware/docker-installation/#prerequisites and noticed the setup-dev-env.sh doesn't have any "docker" references, so is this documentation wrong or outdated?

youtalk commented 1 month ago

Instead of moving vcs out, would this approach work @youtalk -san?

@xmfcx It has the same cache problem. The RUN command will not be re-executed unless autoware.repos is modified. Note that it will also be re-executed if BASE_IMAGE is changed or if the contents of the COPY files before the RUN command are modified.

youtalk commented 1 month ago

When executing ./setup-dev-env.sh, by removing the --download-artifacts option, there was enough image size to pass the CI. Instead, we will download and mount the artifacts when executing /docker/run.sh. I'm testing on https://github.com/youtalk/autoware/pull/18

youtalk commented 1 month ago

@oguzkaganozt https://github.com/autowarefoundation/autoware/pull/4738#issuecomment-2124790959 I understood. Thank you for your explanation.

https://github.com/autowarefoundation/autoware/pull/4738#issuecomment-2124802226 I've merge the latest main branch https://github.com/autowarefoundation/autoware/pull/4738/commits/da2960b0f0a1aa3d265d6e60d60957ec3cfcb68a and restart the docker-build-and-push-main. https://github.com/autowarefoundation/autoware/actions/runs/9200680271

youtalk commented 1 month ago

Both of no-cuda and cuda were finally succeeded to run docker-build-and-push-main. https://github.com/autowarefoundation/autoware/actions/runs/9200680271/job/25307638386 https://github.com/autowarefoundation/autoware/actions/runs/9200680271/job/25307638224 So, let's merge!