Open bengland2 opened 3 years ago
Awesome, thank you for putting this together. I've been wanting to work on a base image for a long time. Here is the one I came up with; I would love your thoughts on it: https://gist.github.com/learnitall/9a84c4e035765d5d450c6d01644af654
Do you have the logs for the failed smallfile package though? It would be great to see those, as I'm curious what the error was. It may be related to https://github.com/cloud-bulldozer/benchmark-wrapper/pull/323. Thanks!
@learnitall here is the pastebin (http://pastebin.test.redhat.com/986862) that illustrates the problem I was having. Actually, this suggestion does not really resolve that problem; it just means I don't have to deal with it all the time, only when the run_snafu base image changes. Any suggestions on why the original error is occurring? I thought files.pythonhosted.org would be more robust.
@learnitall Happy to use your snafu base image as long as I can make it work with the benchmarks that I use, will try it out. I'm guessing you looked at more benchmarks than I did.
questions:
Why isn't python3-pip RPM installed? Where do you get pip from?
why python 3.6 specifically?
why this? pip install wheel setuptools.
why this at end? rm -fr /opt/snafu
thx -ben
tried running Ryan's base image above and it didn't succeed, why not?
http://pastebin.test.redhat.com/987114
[MIRROR] perl-podlators-4.11-1.el8.noarch.rpm: Curl error (56): Failure when receiving data from the peer for https://cdn-ubi.redhat.com/content/public/ubi/dist/ubi8/8/x86_64/baseos/os/Packages/p/perl-podlators-4.11-1.el8.noarch.rpm [OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 104]
[MIRROR] groff-base-1.22.3-18.el8.x86_64.rpm: Curl error (56): Failure when receiving data from the peer for https://cdn-ubi.redhat.com/content/public/ubi/dist/ubi8/8/x86_64/baseos/os/Packages/g/groff-base-1.22.3-18.el8.x86_64.rpm [OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 104]
[FAILED] groff-base-1.22.3-18.el8.x86_64.rpm: No more mirrors to try - All mirrors were already tried without success
Is this package download so fragile that an occasional network error can bring it to its knees? This isn't pip failing, it's cdn-ubi.redhat.com! Seems like there should be some sort of option that says wait a few seconds and retry.
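One generic way to get "wait a few seconds and retry" behaviour, independent of any option the package manager itself may offer, is to wrap the download step in a small shell helper. This is only a minimal sketch: the retry function name, its argument order, and the attempt count and delay in the usage note are assumptions for illustration, not something taken from the images discussed here.

```shell
# retry: run a command up to MAX_TRIES times, sleeping DELAY seconds
# between failed attempts. (Hypothetical helper, not part of snafu.)
# Usage: retry MAX_TRIES DELAY command [args...]
retry() {
    local max_tries=$1 delay=$2
    shift 2
    local attempt=1
    # Keep re-running the command until it exits with status 0.
    until "$@"; do
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "retry: '$*' failed after $attempt attempts" >&2
            return 1
        fi
        echo "retry: attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

With this defined, a build step could be run as: retry 5 10 dnf install -y --nodocs git python3-pip. That would make up to five attempts with a ten second pause between them, and fail the build only if every attempt fails.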
@bengland2 I did some testing and I'm still not sure why the network errors were occurring, but I revisited the base image I posted earlier today and made some modifications based on your questions and work. It depends on a PR currently under review (#323), so I created a new branch in my fork from that PR. There I've posted the base image I came up with and modified the smallfile wrapper image to use it (it's located here: https://github.com/learnitall/benchmark-wrapper/tree/feature-add-base-image). When you get a chance, can you try this out for me?
git clone https://github.com/learnitall/benchmark-wrapper snafu-base-image
cd snafu-base-image
git checkout feature-add-base-image
podman build . -t snafu:latest
podman build . -t smallfile:latest -f snafu/smallfile_wrapper/Dockerfile
Here are the answers to your questions above, feel free to let me know if you have any follow-up Qs:
Why isn't python3-pip RPM installed? Where do you get pip from?
My original idea was to use pyenv to build Python from source rather than using an RPM, which would give us finer-grained control over the version of Python that we use in our images. I scrapped that idea, though, and just went with the RPMs, as the build time was crazy long.
why python 3.6 specifically?
No idea honestly, just went for it because it's the minimum version of Python that snafu can use. I upgraded to 3.8 in the base image mentioned above.
why this? pip install wheel setuptools. why this at end? rm -fr /opt/snafu
When we use pip install -e . we are asking pip to install snafu in editable mode, which means keeping the source code from the git repository that was copied into the image. When we install wheel, we get access to the bdist_wheel command within the setup.py file, allowing us to build a minimal distribution of snafu which is then installed. This allows us to scrap the data we copied into the image, which includes a lot of unnecessary stuff like git history, docs, other Dockerfiles, etc. This trims down the image size, improving pull time.
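As a rough sketch of that sequence (not the exact commands in the base image), the steps could look like the following. The install_snafu_wheel and run helper names and the DRY_RUN switch are assumptions added purely for illustration; /opt/snafu is the path mentioned in this thread, and pip3/python3 are assumed to be on the PATH.

```shell
# run: execute a command, or just print it when DRY_RUN is set.
# (Hypothetical helper so the steps can be inspected without running pip.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# install_snafu_wheel: build snafu as a wheel, install the wheel,
# then delete the copied source tree to keep the image small.
install_snafu_wheel() {
    local src_dir=${1:-/opt/snafu}
    run pip3 install wheel setuptools   # wheel provides the bdist_wheel command
    run cd "$src_dir"
    run python3 setup.py bdist_wheel    # writes the built wheel under dist/
    run pip3 install dist/*.whl         # non-editable install of the built wheel
    run cd /
    run rm -fr "$src_dir"               # source tree is no longer needed
}
```

Because the wheel install copies snafu into site-packages, nothing refers back to the source directory afterwards, which is why the final rm is safe here but would break an editable (-e) install.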
Thanks!
@learnitall I'm trying to build your base image for snafu and it keeps blowing up with curl errors. I googled, and curl in the container image is really out of date. It gives me errors like:
[MIRROR] gcc-8.4.1-1.el8.x86_64.rpm: Curl error (56): Failure when receiving data from the peer for https://cdn-ubi.redhat.com/content/public/ubi/dist/ubi8/8/x86_64/appstream/os/Packages/g/gcc-8.4.1-1.el8.x86_64.rpm [OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 104]
and then the container build aborts, but I think this is a lack of robustness in the curl version being used:
[bengland@localhost benchmark-wrapper]$ podman run -it c06e3fa0fce7
[root@897c7ef92b00 /]# curl --version
curl 7.61.1 (x86_64-redhat-linux-gnu) libcurl/7.61.1 OpenSSL/1.1.1g zlib/1.2.11 brotli/1.0.6 libidn2/2.2.0 libpsl/0.20.2 (+libidn2/2.2.0) libssh/0.9.4/openssl/zlib nghttp2/1.33.0
Release-Date: 2018-09-05
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz brotli TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL Metalink
but current version on Fedora 34 is:
[bengland@localhost benchmark-wrapper]$ curl --version
curl 7.76.1 (x86_64-redhat-linux-gnu) libcurl/7.76.1 OpenSSL/1.1.1k-fips zlib/1.2.11 brotli/1.0.9 libidn2/2.3.2 libpsl/0.21.1 (+libidn2/2.3.0) libssh/0.9.5/openssl/zlib nghttp2/1.43.0
Release-Date: 2021-04-14
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli GSS-API HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM NTLM_WB PSL SPNEGO SSL TLS-SRP UnixSockets
and this version appears to have some fixes in it that would make it more robust in the face of download problems. Any suggestions? So I tried fedora:34 instead of ubi8 as the base image to isolate the problem. When I take the base image Dockerfile above and just change the image from ubi8 to fedora:34, it no longer has download problems. It does complain about the peer resetting the connection during download, but it just keeps going and succeeds the first time. With ubi8, it failed every time on a different package. Ideas?
never mind, I moved from -2G wireless network to -5G wireless network in my house and now it builds, will let you know if it works for smallfile.
@learnitall I based smallfile on your image and it worked. Can you include git in the base image? This would save some time for things like smallfile that aren't part of an RPM. Other than that, I'm good with it. It would be good to test fio with it also. So my Dockerfile prototype currently looks like:
FROM quay.io/bengland/snafu:ubi8
RUN dnf install -y git
COPY . /opt/snafu
RUN git clone https://github.com/distributed-system-analysis/smallfile /opt/smallfile
RUN ln -sv /opt/smallfile/smallfile_cli.py /usr/local/bin/
RUN ln -sv /opt/smallfile/smallfile_rsptimes_stats.py /usr/local/bin/
Ah ok awesome. Glad to hear that switching your wireless helped out with the downloads, I was getting worried there. I edited the base image to include git and perform the soft linking of python3 to python. I modified the smallfile Dockerfile within the branch I shared with you yesterday, can you check it out? It matches your prototype almost exactly, just want to double check it works for you. I'll work on the fio Dockerfile later today.
will do, been busy, it's on my list.
@learnitall back to it now, I tried your image out and it's great, let me know when it is part of benchmark-operator so I can start converting storage benchmarks to use it. Or should I add it?
It turns out smallfile is even easier, since none of the remaining RPMs were even necessary to run it; they were more for debugging and I can leave those out of the image. So all that's left is basically git clone. The image build ran in 10 seconds and the image push to quay ran in 5 seconds!!!
For fio, I ran into a problem doing this: I had to install some additional RPMs and it complained that subscription manager was not registered. But the centos8 repo kicked in; it took a little longer but was still under 1 min, and the push to quay.io was under 1 min. Again, wonderful. So I see no reason why this wouldn't work. So my smallfile_wrapper/Dockerfile looked like:
FROM quay.io/bengland/snafu:latest
ADD https://api.github.com/repos/distributed-system-analysis/smallfile/git/refs/heads/master /tmp/bustcache
RUN git clone https://github.com/distributed-system-analysis/smallfile /opt/smallfile
RUN ln -sv /opt/smallfile/smallfile_cli.py /usr/local/bin/
RUN ln -sv /opt/smallfile/smallfile_rsptimes_stats.py /usr/local/bin/
and my fio Dockerfile looked like:
FROM quay.io/bengland/snafu:latest
COPY snafu/image_resources/centos8.repo /etc/yum.repos.d/centos8.repo
RUN dnf install --nodocs -y --enablerepo=centos8 make gcc libaio zlib-devel libaio-devel
RUN dnf clean all
RUN curl -L https://github.com/axboe/fio/archive/fio-3.27.tar.gz | tar xzf -
RUN pushd fio-fio-3.27 && ./configure --disable-native && make -j2
RUN ln -sv /fio-fio-3.27/fio /usr/local/bin/
COPY . /opt/snafu
BTW if you take out --depth 1 from git clone, then you can easily fetch a branch for debugging, so I prefer no --depth 1.
hopefully the base image is resolved by benchmark-wrapper PR #319 and we can rapidly convert benchmarks to use that, if I understand Ryan correctly.
I found a way to speed up fs_drift_wrapper/Dockerfile that results in a very fast image rebuild. It appears that podman caches image layers based on the order of steps in the Dockerfile. By putting the RPM install and pip install steps first, those layers always come from the cache, so the only steps that actually rerun are cloning fs-drift and copying the benchmark-wrapper tree into the image, which takes a few seconds. The Dockerfile I'm using is:
FROM registry.access.redhat.com/ubi8:latest
RUN dnf install -y --nodocs git python3-pip
RUN ln -s /usr/bin/python3 /usr/bin/python
COPY setup.cfg setup.py version.txt /opt/snafu/
RUN pip3 install -e /opt/snafu/
RUN git clone https://github.com/parallel-fs-utils/fs-drift /opt/fs-drift
RUN ln -sv /opt/fs-drift/fs-drift.py /usr/local/bin/
RUN ln -sv /opt/fs-drift/rsptime_stats.py /usr/local/bin/
COPY . /opt/snafu/
Working on integration into our build system (have a couple of tweaks to make to get this to work) and then will start migrating Dockerfiles into using the base image.
@learnitall update?
Hey Ben, been distracted with some other work that has come up since Joe left. I have experimented with creating a base image using the ONBUILD syntax and I think it's the way to go. The current image that I have requires a specific distro to be used by each wrapper image, however using ONBUILD and packing snafu correctly, I can create a base image compatible with any FROM image a wrapper needs to use. See here for an example base image, and here for example usage.
Still working on this issue, it's just been a low priority for me here with some other tasks taking my attention.
I was having problems rebuilding the smallfile image because pip kept failing to download a random Python package. I got really annoyed and started thinking of ways to prevent this from happening, and came up with the idea of a snafu base image that contains just the stuff that run_snafu.py needs, which is quite a lot actually. The individual benchmarks then build off this base, so they don't have nearly as heavy a lift. For example:
and then the smallfile image becomes:
and it rebuilds really fast. Would anyone else like to see this implemented? I could post a PR for this with smallfile and then we can incrementally extend it to other benchmarks if there is interest. Possible benchmarks that could use this would include: