project-copacetic / copacetic

🧵 CLI tool for directly patching container images using reports from vulnerability scanners
https://project-copacetic.github.io/copacetic/
Apache License 2.0

[REQ] Support RPM-based images with valid rpm status but missing tools #602

Open ashnamehrotra opened 2 months ago

ashnamehrotra commented 2 months ago

What kind of request is this?

None

What is your request or suggestion?

https://github.com/project-copacetic/copacetic/blob/d648155f5424a9f4cb13acd7209195846791873b/pkg/pkgmgr/rpm.go#L333

Turning copacetic TODO comments into issues from https://docs.google.com/spreadsheets/d/1XwNj1J6e2FrUhlqaIsV10l8_tgov7WodlkvpNZXYZMU/edit#gid=1386834576.

Are you willing to submit PRs to contribute to this feature request?

MiahaCybersec commented 1 month ago

I've done some of the initial work required for this feature in my fork of the repo, available here.

There are two approaches we can take to this feature, both of which are documented below.

Approach 1 - Reusing The Approach Used For Distroless

This is the approach currently taken in my fork of the repo linked above. While it generally works, there are two roadblocks that prevent it from working flawlessly.

To fix the first issue, we could add third-party yum repos to the CBL-Mariner tooling image. Deciding which repos to use, and how to handle packages that exist in more than one repo, could be a challenge.

The second issue will require further investigation for me to determine why that directory is missing.

Reproducing Above Issues

Clone my git branch (https://github.com/MiahaCybersec/copacetic/tree/valid-rpm-status-no-tools)

With no further changes, using the patch commands below will reproduce the first issue mentioned.

To skip over packages that are missing from the CBL-Mariner package manager repos, add --skip-broken to the yumdownloader commands on lines 482 and 490 in rpm.go. This tells yum to skip any broken or missing packages and continue executing.
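As a rough illustration of that change, here is a minimal sketch of building the command string with the flag toggled on. The helper name `yumdownloaderCmd` and its signature are hypothetical; the real command strings in rpm.go may carry other flags that are not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// yumdownloaderCmd builds a yumdownloader command line for the given packages.
// Hypothetical helper for illustration only; it is not the code in rpm.go.
func yumdownloaderCmd(pkgs []string, skipBroken bool) string {
	args := []string{"yumdownloader"}
	if skipBroken {
		// Skip packages that are broken or missing instead of failing the run.
		args = append(args, "--skip-broken")
	}
	args = append(args, pkgs...)
	return strings.Join(args, " ")
}

func main() {
	fmt.Println(yumdownloaderCmd([]string{"glibc", "zlib"}, true))
}
```

Keeping the flag behind a boolean makes it easy to compare runs with and without it while reproducing the two issues.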

From here, choose an image that is missing RPM tools. There are two images I've been working with below as examples.

patch -r /calico-node-3.23.3-32.json -i docker.io/calico/node:v3.23.3-32-g2b86bba8df1f --debug
patch -r /ubi8-micro8.7.json -i docker.io/redhat/ubi8-micro:8.7 --debug

Once the above changes are made, we can reliably reproduce the second issue.

Approach 2 - Adding a New Function

This approach would allow us to handle images which have a valid RPM status but missing tools differently than a distroless image. While I have started initial work on this locally, I will likely need additional time and help learning LLB and BuildKit.

Because this approach also relies on the CBL-Mariner tooling image, we'd still run into the first issue listed under the first approach. Other issues may arise along the way, but none are known yet.

ashnamehrotra commented 1 month ago

@MiahaCybersec The var/lib/rpmmanifest folder is only present in distroless images. We check for that folder to determine if an image is distroless, and the files container-manifest-1 and container-manifest-2 in that folder contain information about which packages are present in the image.
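To make the rule concrete, here is a minimal sketch of that check, assuming the image filesystem has been unpacked into a local directory. Copa itself inspects images through BuildKit, so `isDistroless` and `demo` are hypothetical helpers that only illustrate the folder test.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Folder named in this thread; it is only present in distroless images.
const rpmManifestDir = "var/lib/rpmmanifest"

// isDistroless reports whether the unpacked image filesystem at root
// contains the rpmmanifest folder. Hypothetical helper, not Copa's code.
func isDistroless(root string) bool {
	info, err := os.Stat(filepath.Join(root, rpmManifestDir))
	return err == nil && info.IsDir()
}

// demo runs the check on a scratch directory before and after the folder exists.
func demo() (before, after bool, err error) {
	root, err := os.MkdirTemp("", "rootfs-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)

	before = isDistroless(root)
	if err = os.MkdirAll(filepath.Join(root, rpmManifestDir), 0o755); err != nil {
		return before, false, err
	}
	after = isDistroless(root)
	return before, after, nil
}

func main() {
	before, after, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(before, after)
}
```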

Since this folder does not exist for non-distroless images, it might be best to add a new function that follows the same workflow as rpm.installUpdates (currently we use this function for non-distroless images with tools installed). However, the new function could use a tooling image, like we do in rpm.unpackAndMergeUpdates, to install the packages. This way we can use yum directly to install the packages in the tooling image, without hitting errors from extracting packages into a folder that doesn't exist. This could also be a modification to the rpm.installUpdates function itself if we reach this case.
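The resulting three-way split could be sketched as below. This assumes rpm.unpackAndMergeUpdates is the path taken for distroless images; the `imageState` type, its fields, and the name `installUpdatesWithToolingImage` are hypothetical placeholders for whatever the new function ends up being called.

```go
package main

import "fmt"

// imageState captures the two facts discussed in this thread.
// The type and field names are hypothetical.
type imageState struct {
	hasRPMManifest bool // var/lib/rpmmanifest exists, so the image is distroless
	hasRPMTools    bool // package manager tooling is installed in the image
}

// choosePatchPath returns the name of the function that would handle the image.
func choosePatchPath(s imageState) string {
	switch {
	case s.hasRPMManifest:
		// Assumed: distroless images go through the tooling-image unpack path.
		return "unpackAndMergeUpdates"
	case s.hasRPMTools:
		return "installUpdates"
	default:
		// Valid rpm status but missing tools: the case this issue is about.
		return "installUpdatesWithToolingImage" // hypothetical name
	}
}

func main() {
	fmt.Println(choosePatchPath(imageState{}))
}
```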

This tool can help you visualize image filesystems to see the distroless vs non-distroless image structure - https://oci.dag.dev/

ashnamehrotra commented 1 month ago

Regarding the missing packages, I'm not sure we can get around that without --skip-broken, since there is no way to download the packages outside of the mariner tooling image or to install a tool into the image directly. A different approach could be mounting yum/tdnf from the tooling image into the current image and running the install commands, but I'm not sure whether this would work or how we would get the diff of the two states via buildkit. @cpuguy83 may have a better understanding of whether this is possible.

MiahaCybersec commented 1 month ago

Thanks for the clarification, Ashna! I've begun implementing the function to mount the necessary tools into the user-supplied image if the tools are missing. I'll do my best to keep the issue updated as progress is made.

MiahaCybersec commented 4 weeks ago

The function is mostly implemented locally, but some additional debugging will be required before it works properly. I'm hoping to have a PR up to close this issue later this week. I'll update the issue if there are any roadblocks that may delay this feature being implemented.

MiahaCybersec commented 2 weeks ago

I've implemented the function, but I'm encountering an issue at line 464 of rpm.go. You can find the current code here.

The program currently throws the following error:

=> => # runc run failed: unable to start container process: exec: "/mnt/usr/bin/yumdownloader": stat /mnt/usr/bin/yumdownloader: no such file or directory

Error: process "/mnt/usr/bin/yumdownloader install -y glibc libcap rpm-libs systemd-libs pam zlib glibc-common glibc-minimal-langpack ncurses-libs openssl-libs rpm ncurses-base shadow-utils systemd-pam" did not complete successfully: exit code: 1

I've tried using different directories for the binaries, but the error persists. To diagnose the issue further, I'm exploring two options:

MiahaCybersec commented 2 weeks ago

I have a somewhat functional build, but it appears that in order to get it fully operational, we will need to deal with dynamically linked libraries. All of the code I'm referring to is on my fork of Copa.

To execute anything, we must invoke /usr/bin/bash -c, which uses the user-supplied image's bash shell. Because we are dealing with a mounted filesystem, some dynamic libraries mismatch and cause issues. To get around the dynamically loaded libraries problem, I've been attempting to use LD_LIBRARY_PATH, which tells the system where to look for libraries.

Once LD_LIBRARY_PATH is set, if we try to execute any of the tooling required to update the user-supplied image (RPM, yumdownloader, tdnf, etc.), we get the following symbol lookup error: /mnt/lib/libc.so.6: undefined symbol: _dl_audit_symbind_alt, version GLIBC_PRIVATE. This appears to be due to a glibc version mismatch.

It's worth noting that these same errors apply to both the user supplied image and tooling image bash shells.

I have been investigating gcc and other methods of loading libraries alongside an executable, but I have not yet been able to execute the tooling required to update the user supplied image.
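One direction that might be worth trying, untested against this branch: glibc's dynamic loader can be run directly with `--library-path`, so a mounted tool starts under its own image's loader and libc rather than mixing them with the other image's loader via LD_LIBRARY_PATH. The sketch below only builds such an argv. The helper `loaderArgv`, the loader path `/lib64/ld-linux-x86-64.so.2`, and the library directories are assumptions and depend on the tooling image and architecture.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// loaderArgv builds an argv that runs a tool from a mounted filesystem
// through that filesystem's own dynamic loader. Hypothetical helper.
func loaderArgv(mount, loader string, libDirs []string, tool string, args ...string) []string {
	dirs := make([]string, 0, len(libDirs))
	for _, d := range libDirs {
		// Resolve each library directory inside the mounted filesystem.
		dirs = append(dirs, path.Join(mount, d))
	}
	argv := []string{
		path.Join(mount, loader),
		"--library-path", strings.Join(dirs, ":"),
		path.Join(mount, tool),
	}
	return append(argv, args...)
}

func main() {
	// All paths below are assumed for illustration.
	argv := loaderArgv("/mnt", "/lib64/ld-linux-x86-64.so.2",
		[]string{"/lib", "/usr/lib"}, "/usr/bin/rpm", "--version")
	fmt.Println(strings.Join(argv, " "))
}
```

A tool that is a script rather than an ELF binary would need its interpreter launched this way instead.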