KhronosGroup / DockerContainers

Docker container specifications which package dependencies for building Khronos documentation and software
Apache License 2.0

Extremely slow CI #6

Closed by oddhack 3 years ago

oddhack commented 4 years ago

I'm not sure what repo CI is doing - the Azure files are beyond my pay grade at present - but it is taking approximately forever (35 minutes, at this point). If it's building all the images from scratch, let's not do that!

oddhack commented 4 years ago

Yeah, this is hopeless. The PR you pushed a few days ago timed out after an hour, as did mine. I'm going to go ahead and merge #5, which really just changes the Vulkan Dockerfile. I may be changing it more often going forward, so hopefully we can turn off the Azure stuff for the time being.

rpavlik commented 4 years ago

Yeah, I don't really know what went wrong here. It used to be really quick; now it's hitting timeouts.

rpavlik commented 4 years ago

I've turned it off as a requirement for merge. I assume you're just pushing the updated container yourself?

Glad to see you found that entrypoint script useful - that's based on a marvel of a script I found some years ago and have been continuing to use (and improve) since then :)

oddhack commented 4 years ago

entrypoint is useful, but confusing - I don't know where it's getting $args from, but it appears to be a big hunk of shell code that itself executes a shell, and for some reason it was instantly erroring out when I ran it. So I just hardwired it to bash in my version.
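For reference, a minimal sketch of how an entrypoint usually receives its arguments, assuming a POSIX shell and an exec-form ENTRYPOINT. This is not the script from this repository, and `run_entrypoint` is a hypothetical name. Docker passes the words after the image name in `docker run IMAGE ...` (or the image's CMD when none are given) to the entrypoint as "$@":

```shell
#!/bin/sh
# Hypothetical entrypoint sketch, not the script used in this repository.

run_entrypoint() {
    # With no arguments, fall back to an interactive shell.
    # Assumes bash is installed in the image.
    if [ "$#" -eq 0 ]; then
        set -- bash
    fi
    # Replace the current process so the command receives signals directly.
    exec "$@"
}

# A real entrypoint script would end with:
#   run_entrypoint "$@"
```

With this shape, `docker run IMAGE make html` would run `make html` inside the container, and a bare `docker run -it IMAGE` would drop into bash.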

The very stringent constraints about some special characters in Dockerfiles have also been painful (and not well-documented AFAICT).

rpavlik commented 4 years ago

Yeah, I don't know either. Usually $args is either empty or whatever you pass on the command line. In OpenXR there is a counterpart to that script ( https://github.com/KhronosGroup/OpenXR-Docs/blob/master/open-in-docker.sh ) which mostly just handles setting the env vars that pass the UID and GID in, and making the mount. (It's the "pre-entrypoint".)
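A rough sketch of what such a "pre-entrypoint" wrapper does, not a copy of open-in-docker.sh. The function name, the `USER_ID` / `GROUP_ID` variable names, and the `/workdir` mount point are all assumptions made for illustration. The function only prints the `docker run` command so it can be inspected without Docker installed:

```shell
#!/bin/sh
# Hypothetical wrapper sketch; variable, path, and image names are placeholders.

docker_run_command() {
    image="$1"     # image to start
    src_dir="$2"   # host directory to bind-mount into the container
    # Pass the caller's numeric UID and GID in as environment variables so
    # an entrypoint inside the container can switch to a matching user.
    printf 'docker run --rm -it -e USER_ID=%s -e GROUP_ID=%s -v %s:/workdir -w /workdir %s\n' \
        "$(id -u)" "$(id -g)" "$src_dir" "$image"
}

# Example: docker_run_command example/image "$(pwd)"
```

An entrypoint that reads those two variables and drops privileges before running the command is what keeps files created in the bind mount from being owned by root on the host.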

That said, I have found that Azure Pipelines (and thus presumably also GitHub Actions, but those aren't enabled on this org) doesn't really like having an entrypoint set, which is why the OpenXR images are split into a "-base" variant and a regular one: GitLab CI and manual Docker usage use the regular image, while Azure uses the "-base" one.

oddhack commented 4 years ago

@rpavlik I came across an interesting entrypoint thing just now: when I build the image, some of the components there are locally compiled (the Roswell LISP environment used for the chunked HTML spec generation) and installed to $HOME at build time, e.g. /root. Not only is that not $HOME when running as another user, it isn't even accessible to regular users. ISTM the easiest way to deal with this is to unprotect /root and link from the docker user's $HOME/.roswell to /root/roswell in the entrypoint script, but am wondering if you've already run into this and have a more sophisticated approach?
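The linking step described above could look roughly like this inside an entrypoint script. It is only a sketch: `link_shared_dir` and `BUILD_HOME` are hypothetical names, and it assumes the build-time directory has already been made readable by the runtime user:

```shell
#!/bin/sh
# Hypothetical helper for an entrypoint script; names are placeholders.

link_shared_dir() {
    target="$1"   # directory populated at image build time
    link="$2"     # path expected under the runtime user's $HOME
    # Leave an existing file, directory, or link alone.
    if [ -e "$link" ] || [ -L "$link" ]; then
        return 0
    fi
    ln -s "$target" "$link"
}

# Example, with BUILD_HOME standing in for the build-time home directory:
#   link_shared_dir "$BUILD_HOME/.roswell" "$HOME/.roswell"
```

The symlink alone does not grant access; the permissions on the target directory and its parents still decide whether the runtime user can read it.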

N.b. I was dismayed to find that 'docker run' combined with bind mounts seems to allow an arbitrary user to run an arbitrary image which runs as root in the image, and can create files in my host filesystem owned by root! Maybe that's a docker configuration error on my part? I'm not sure it's exactly a security breach yet, since I can't set the setuid bit on these files (well, I can, but it turns into the sticky bit on the host).

rpavlik commented 4 years ago

Yeah, running Docker is a security issue. Though I don't think you need su to create a root-owned file; I think you can always chown stuff away from yourself.

I think ideally you'd have Roswell install to /usr/local, probably in a two-stage build like the one I have for the spec generation thing: the Dockerfile specifies two stages, and only the first one has the build environment; the second contains only what is needed at runtime.

I'd personally love to move to something a bit like this: https://freedesktop.pages.freedesktop.org/ci-templates I've used it in a bunch of other projects, and it's very handy; see for example https://gitlab.freedesktop.org/monado/monado/-/blob/master/.gitlab-ci.yml It also makes very small images by avoiding intermediate things. (A very complete toolchain for building our runtime in 32 and 64 bit weighs 970 MiB this way: https://gitlab.freedesktop.org/monado/monado/container_registry ) Unfortunately the tools appear to be Red Hat driven and not available (yet) on Debian, so I haven't been able to play with making images in a similar way on my local machine yet. It hasn't been a high enough priority.

(The build process used to only take 5 minutes. I'm not sure why Azure got so slow for this task.)

oddhack commented 4 years ago

$ chown root .login
chown: changing ownership of '.login': Operation not permitted

How much does the two-stage build help - is it mostly storage / bandwidth on downloading the docker image when it changes, or does it benefit startup time significantly? It looks like a PITA to figure out all the special exceptions of files that need to be retained, based on skimming your xr Dockerfiles, so I haven't tried so far.

rpavlik commented 4 years ago

I think it's just storage/bandwidth upon change. It is a bit of a pain, though fortunately most things have standard install locations. The freedesktop system is a much nicer way to achieve a similar result.

oddhack commented 3 years ago

Closing - no action to be taken at present.