Shrink - Githubissues

vsoch commented 2 years ago

Replaces #65

vsoch commented 2 years ago

This looks to be the same error (multiple times)

spaaaaack!

trws commented 2 years ago

More seriously, I'll take a look at it. My binutils external detection PR merged, so that will help. The biggest things left are actually getting a sparse checkout, which I want to talk over with you, and possibly dealing with this: https://github.com/spack/spack/pull/29031

trws commented 2 years ago

Ok, I'm going to add the fixes for the gold issue and some other cleanup here now that it proved out in the clang update.

trws commented 2 years ago

This should now build a good bit faster, since it's no longer building cmake in spack, but it has more in it that downstream packages can use, and configures spack to automatically reuse what's there.

If you don't like any of this, say so and I'll pull it out. I also reworked it to use an environment to manage the view, and configured spack through that environment, so a spack install in a downstream container will automatically install into the environment and update the view. It also updates the paths for man, pkgconfig, aclocal and cmake, so CI or anyone else using non-spack build systems will find all spack-built components without extra work.
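For readers following along, here is a rough sketch of what that kind of setup could look like. It is not the recipe from this PR: the environment directory, the view path `/opt/view`, and the cmake version and prefix are placeholders, and it assumes a Spack recent enough to support the `concretizer: reuse` setting. Note that pkgconfig files can also end up under `lib64/pkgconfig` or `share/pkgconfig`, which this sketch ignores.

```shell
# Hypothetical locations; the thread does not say where the environment or view live.
ENV_DIR="${ENV_DIR:-$PWD/spack-env}"
VIEW_DIR="${VIEW_DIR:-/opt/view}"

mkdir -p "$ENV_DIR"

# Minimal environment file: an initially empty spec list, a view,
# concretizer reuse, and an external cmake so Spack does not build its own.
cat > "$ENV_DIR/spack.yaml" <<EOF
spack:
  specs: []
  view: $VIEW_DIR
  concretizer:
    reuse: true
  packages:
    cmake:
      externals:
      - spec: cmake@3.22.1   # placeholder version
        prefix: /usr         # placeholder prefix
      buildable: false
EOF

# Point non-Spack build tools at the view.
# MANPATH keeps a trailing colon when it was unset, so man still searches its defaults.
export MANPATH="$VIEW_DIR/share/man:${MANPATH:-}"
export PKG_CONFIG_PATH="$VIEW_DIR/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
export ACLOCAL_PATH="$VIEW_DIR/share/aclocal${ACLOCAL_PATH:+:$ACLOCAL_PATH}"
export CMAKE_PREFIX_PATH="$VIEW_DIR${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}"
```

How a downstream container activates the environment is not spelled out in the thread, so that step is left out of the sketch.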

Was arguing with myself about getting rid of the git overhead for grabbing spack. We could grab the commit and unpack it with: curl -L https://github.com/spack/spack/tarball/9e2d78cffc1beaec2c6e4c3a379c17f25a35b31f | tar -xz

That drops the container size by ~100mb, and fetches in less than 5 seconds, but of course the result is not a functioning git repo. If the point is to keep the one pinned commit, then that's probably fine, but if not... Not sure, what do you think?

The follow-on question is, now that this builds in just over 3 minutes in GitHub Actions, what do you think about wiring up the build so that the containers based on this one build against the version for the current day? I have no idea how complex this is, so treat that as a completely naive question.

vsoch commented 2 years ago

Sorry, I didn't see this; I've been so distracted with dwarf and gather.town today lol! Let me take a look.

vsoch commented 2 years ago

> That drops the container size by ~100mb, and fetches in less than 5 seconds, but of course the result is not a functioning git repo. If the point is to keep the one pinned commit, then that's probably fine, but if not... Not sure, what do you think?

I think we probably want a functioning git repo for anyone that wants to use a different spack history commit, etc.
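One possible middle ground between the tarball and a full clone is a shallow fetch of just the pinned commit, which still leaves a working git repo. The sketch below is only an illustration, not what the PR does: `fetch_pinned` is a made-up helper name, the destination path in the example call is a placeholder, and it assumes the server allows fetching a commit by its SHA.

```shell
# fetch_pinned URL COMMIT DEST
# Create a git repo at DEST that contains only COMMIT (shallow, depth 1).
fetch_pinned() {
    url="$1"
    commit="$2"
    dest="$3"
    git init -q "$dest" &&
    git -C "$dest" remote add origin "$url" &&
    # Fetch only the requested commit, without the rest of the history.
    git -C "$dest" fetch -q --depth 1 origin "$commit" &&
    # Check the fetched commit out as a detached HEAD.
    git -C "$dest" checkout -q FETCH_HEAD
}

# Example call, using the commit from the tarball URL earlier in the thread
# (the destination directory is a placeholder):
# fetch_pinned https://github.com/spack/spack 9e2d78cffc1beaec2c6e4c3a379c17f25a35b31f /opt/spack
```

Anyone who later wants a different commit could fetch it into the same repo the same way, or run `git fetch --unshallow` to pull in the full history.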

vsoch commented 2 years ago

> The follow-on question is, now that this builds in just over 3 minutes in GitHub Actions, what do you think about wiring up the build so that the containers based on this one build against the version for the current day? I have no idea how complex this is, so treat that as a completely naive question.

You mean the other matrix builds? They will already use whatever is latest for the base images. Or do you mean something else?

trws commented 2 years ago

> > The follow-on question is, now that this builds in just over 3 minutes in GitHub Actions, what do you think about wiring up the build so that the containers based on this one build against the version for the current day? I have no idea how complex this is, so treat that as a completely naive question.

> You mean the other matrix builds? They will already use whatever is latest for the base images. Or do you mean something else?

You've mentioned a couple of times that there's a day delay, I assume because when the build starts each one picks up the base that exists when it's created rather than the one that would be generated that day. Perhaps I misunderstood?

vsoch commented 2 years ago

> You've mentioned a couple of times that there's a day delay, I assume because when the build starts each one picks up the base that exists when it's created rather than the one that would be generated that day. Perhaps I misunderstood?

That can happen, yes, but the base images workflow triggers before the matrix images, so I think they could be used the same day, though I cannot guarantee it. What does have a day delay is a non-base image (e.g., a matrix build) that uses another matrix build, since they build at the same time.
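To make the ordering issue concrete: an image only picks up same-day changes if everything it is built from finishes first. A small sketch with `tsort` shows the order a dependency-aware build would need. The image names and edges here are invented for illustration (only ubuntu and clang come up in this thread), and this is not how the repository's workflows are actually wired.

```shell
# Print hypothetical images in an order where every base comes before its dependents.
build_order() {
    # Each input line is "<base> <image>": <base> must be built before <image>.
    tsort <<'EOF'
ubuntu clang
ubuntu cuda
clang clang-tools
EOF
}

build_order
```

In this made-up graph, `clang-tools` is a matrix image built on another matrix image, so building all matrix images at the same time would leave it one run behind `clang`.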

trws commented 2 years ago

Ah, got it, so probably nothing to do then. Great!

If you're happy with this behavior, I can propagate it to the other Ubuntu base images. What do you think?

vsoch commented 2 years ago

I think that would be ok with me - ping @davidbeckingsale to check! I'm wondering, if we have a lot of redundancy in the chunks that we run, whether we should have scripts to copy and run instead? :thinking:

vsoch commented 2 years ago

okay looks like I can see GitHub on my other computer! That's super weird.

Okay - so for next steps do we want to update the PR here with the other containers, and then we can see the recipes side by side and decide if there is any redundancy to put in one place?

trws commented 2 years ago

We have a lot of redundancy right now, especially after what I added. I was actually planning to ask you about scripts: shall I add a scripts directory, either at the ubuntu level or at the top? At least the spack setup and the cmake setup could easily be scripts; I'll take a look at the others.

vsoch commented 2 years ago

@trws we can remove the autamus cache; it's going to be refactored into something different, and it's probably mostly broken anyway given all the hash changes.

trws commented 2 years ago

It seems like the clang builds are failing with a bus error again. I think that means they are not pulling the up-to-date ubuntu images, since those configure spack to re-use the system python. 🤔

trws commented 2 years ago

Aside from the comments, which I'll work on in a second, I need to figure out how to get the clang builds to use the ubuntu builds from this PR, or how to get them out of this entirely. They depend on things from the new base, and because they're using an older version as their base they consistently fail even if they build. I'm trying a revert of the clang dockerfile for now.

vsoch commented 2 years ago

@trws in a PR, any file you change will be built, so you are 100% correct! Just remove the clang changes and we can open a PR for that separately after the base ubuntu images are merged/deployed (and then the matrix ones won't fail). It's definitely not a perfect system having base containers --> used by --> matrix containers, but if you have an idea to make it better we can definitely try!

btw if the lab had peer bonuses I'd totally give you one for this work, I am extremely grateful for the time/attention you are putting into these improvements!

trws commented 2 years ago

Cool! And thanks, I'm happy to help get some of this stuff going again. We need a solid base to use for our testing and deployment stuff, so it's well worth it, and I like how you set this up. My biggest gripe with our old way was that getting newer or "latest" versions was a pain; having uptodate in the mix should make that a lot nicer.

As a side-note, while we do not have peer bonuses (or real bonuses in general) comp does do "SPOT awards" nominations that one can put in for other employees. Not fishing here, but I've heard group leads asking for more nominations so it may be something to keep in mind.

trws commented 2 years ago

Ok, I think this one is cleaned up and ready to go. There will need to be at least one downstream PR to do clang and the other ubuntu downstream containers, and if you would like I can apply this to the alpine base as well.

vsoch commented 2 years ago

They are deployed! :partying_face:

rse-ops / docker-images

Shrink #66