epi-project / brane

Programmable Orchestration of Applications and Networking
Apache License 2.0
10 stars 8 forks source link

Broken make.py #172

Open DanielVoogsgerd opened 1 week ago

DanielVoogsgerd commented 1 week ago

After the changes introduced in the cargo-deny merge (#140). We missed that these changes to Cargo.toml are not supported by the custom Cargo.toml parser in make.py. So currently, make.py does not work any more. How do we like to resolve this? I love writing parsers as much as the next guy, but having a custom non-spec compliant parser is a big pain with respect to maintainability.

I have to ask, how much of make.py really necessary for us. It contains ~5% of our lines of code at this point. Are we better off with something a lot more simple at this point?

EDIT: I might be incorrect, but I get the feeling that if we drop cross compilation / in container compilation (which I tend to run with -f, for caching reasons) and automatic installation of containers, we might be able to fit the whole making procedure into a ~100 line Makefile. I am not the biggest fan of the Makefile syntax, but it is obiquitous and fairly transparent. I think we can do this because almost everybody that is going to develop Brane is going to build on a machine that is capable of running it, at least if they want to test it. The release images can be build using GitHub actions on a native platform.

Lut99 commented 1 week ago

So, all the way back at the beginning of time, I introduced the make.py mainly to build Brane outside of a container with normal Cargo caching and then push the results in a container (actually there was make.sh first but at some point this became Python for my sanity). This because a) I hadn't discovered Buildx' RUN --cache-directive yet, and b) I was running this on an old macbook and the performance of in-container building was truly awful with the old VM drivers that Docker used to use. So I spend a lot of energy in making this "build outside of container, then put in container" process with OpenSSL building and all of that.

Needless to say, I think that we can move to just using the off-the-shelf Buildx --cache nowadays and then either create a deadly simple Makefile or Python script, whichever has your fancy. Just like Brane itself, the make.py is riddled with a thousand features that are unmaintained :) so going back to something simpler is not just a luxury.

Just a few things to note:

EDIT: I might be incorrect, but I get the feeling that if we drop cross compilation / in container compilation (which I tend to run with -f, for caching reasons) and automatic installation of containers, we might be able to fit the whole making procedure into a ~100 line Makefile.

I'm not exactly sure what you're suggesting to drop here with "in container compilation", but if you're suggesting to compile Brane from scratch everytime then plz don't. My heart couldn't take it :joy: Even on very beefy machines that means that every change takes ten minutes to compile, and I don't have the patience haha.

So what caching issues are you running into when compiling Brane in-container?

DanielVoogsgerd commented 1 week ago

So, all the way back at the beginning of time, I introduced the make.py mainly to build Brane outside of a container with normal Cargo caching and then push the results in a container (actually there was make.sh first but at some point this became Python for my sanity). This because a) I hadn't discovered Buildx' RUN --cache-directive yet, and b) I was running this on an old macbook and the performance of in-container building was truly awful with the old VM drivers that Docker used to use. So I spend a lot of energy in making this "build outside of container, then put in container" process with OpenSSL building and all of that.

Makes sense

Needless to say, I think that we can move to just using the off-the-shelf Buildx --cache nowadays and then either create a deadly simple Makefile or Python script, whichever has your fancy. Just like Brane itself, the make.py is riddled with a thousand features that are unmaintained :) so going back to something simpler is not just a luxury.

Yeah I think so as well.

Just a few things to note:

* IIRC, doing something like "ensure there exist a Buildx builder and build it if it doesn't" is very ugly in Makefile. This is actually another reasoner why I once switched to Bash/Python, because it used to start Brane as well. Luckily, that's now done by `branectl`.

* The script also has (very unmaintained) features for downloading images and binaries, which were always of great help when many students were using the framework (so they don't have to build everything). It may be worth to port this somehow when talking about ease-of-installation. We can fix this either by pushing Brane images to docker hub next release (not a bad idea) or extending the once-started features of `branectl` to download the images instead.

Alternatively we can also just upload the packages to the registry native to GitHub (GitHub Packages it is called iirc)

* There used to be a Makefile way back 'yonder. You can go into the archives (commit history) to find it, maybe it's a good starting point in case you want to get started.

I will take a look.

EDIT: I might be incorrect, but I get the feeling that if we drop cross compilation / in container compilation (which I tend to run with -f, for caching reasons) and automatic installation of containers, we might be able to fit the whole making procedure into a ~100 line Makefile.

I'm not exactly sure what you're suggesting to drop here with "in container compilation", but if you're suggesting to compile Brane from scratch everytime then plz don't. My heart couldn't take it 😂 Even on very beefy machines that means that every change takes ten minutes to compile, and I don't have the patience haha.

So what caching issues are you running into when compiling Brane in-container?

Okay a couple of things here. I am not entirely sure when I run into the container cache build issues, I haven't really had the time to debug it, whenever it refused to build I just jot an extra -f in there and it works, which is fairly often.

So my idea for the new build environment is to remove it from docker entirely and attempt to use the native cargo/rustc caches (so no full compilation) and just move the binary into the container after building it. To address building on non-native architectures, I think using Cargo cross and then moving it into containers still is a better way to go, in those circumstances where cargo can't do it itself. This way we just existing infrastructure which is very well maintained and has very low complexity. But if this does work for Brane for some reason we can still just buildx I suppose.

Lut99 commented 1 week ago

Alternatively we can also just upload the packages to the registry native to GitHub (GitHub Packages it is called iirc)

Oh right! Yes that's even better.

Okay a couple of things here. I am not entirely sure when I run into the container cache build issues, I haven't really had the time to debug it, whenever it refused to build I just jot an extra -f in there and it works, which is fairly often.

Haha yeah that's OK. Just, I'm sorry for your build times... :sweat_smile:

So my idea for the new build environment is to remove it from docker entirely and attempt to use the native cargo/rustc caches (so no full compilation) and just move the binary into the container after building it. To address building on non-native architectures, I think using Cargo cross and then moving it into containers still is a better way to go, in those circumstances where cargo can't do it itself. This way we just existing infrastructure which is very well maintained and has very low complexity. But if this does work for Brane for some reason we can still just buildx I suppose.

I see. Well this is why my Makefile grew some complex :)

I remember that the first issue you'll run into on my laptop (and yours, too) is that Brane depends on linking GLIBC and that on rolling releases like ours the one it's linked to is usually too new for whatever the container runs. But you can use musl to statically link it instead (which is a Rust target so should be easy). Then there are some libraries which also do dynamic linking, chief of which used to be OpenSSL which, again, can differ significantly between platforms (openssl 1 VS 3). I think that's no longer an issue, though, because I told it to link statically IIRC; but maybe there are more, you'd have to check.

Then I think that there was a caching thing too, where the Makefile uses timestamps to check for outdated files but Cargo always updates the binaries regardless of whether source files have changed (I think is what used to happen, anyway). No necessarily a big deal, because Docker should use hashes on the binaries to decide if an image rebuild is necessary, but on my ol' macbook this was always an additional minute to my compilation because the hash of the binary takes so long :joy: Hence me parsing the Cargo files myself and checking if any update occurred to conditionally call Cargo.

It's definitely doable, but to get it working very smoothly it may require some effort. Still, it'll be loads faster if the caching of Cargo just works, so probably worth it. The only blocker I see is if you need some wizardry to do the system deps between native and Docker; that would very much ruin our goal of long-term, out-of-the-box compilation. So then continuing to build in Docker would be my pref.

DanielVoogsgerd commented 1 week ago

With respect to the issues with native compilation: Ah, that does indeed sound quite nasty. I assume those dependencies were taken with some consideration, and alternatives like rustls were considered, but that is a problem for future me to tackle.

It's definitely doable, but to get it working very smoothly it may require some effort. Still, it'll be loads faster if the caching of Cargo just works, so probably worth it. The only blocker I see is if you need some wizardry to do the system deps between native and Docker; that would very much ruin our goal of long-term, out-of-the-box compilation. So then continuing to build in Docker would be my pref.

I agree. I think being pragmatic here is the most important here. We should first move to a simple Makefile that basically calls the same docker buildx commands the main.py script calls now. This can be very basic, I think. From that point we can start adding the complexity back in a manner that we prefer. That could be focussing on static linking and going native, or keeping the containerization and optimize stuff there, or not.

You probably have a better mental model of what the downsides are of that "stupid" Makefile, but on the other hand, right now, we have an overly complex make.py with a Cargo.toml parser that cannot handle all the features we are currently using, and the last thing I want to do is dig that hole any deeper, by trying to extend that parser.

Sorry if that comes across a bit crass. I think it makes perfect sense from a historic perspective, but the maintenance of such a tool right now is simply too expensive.

Lut99 commented 1 week ago

Ah, that does indeed sound quite nasty. I assume those dependencies were taken with some consideration, and alternatives like rustls were considered, but that is a problem for future me to tackle.

...yes... :# :joy:

No haha, I don't see why rustls wouldn't work either. But right now it also works with the static compilation, so my point is more for other potential system deps (there was also libgit for a while etc).

I agree. I think being pragmatic here is the most important here. We should first move to a simple Makefile that basically calls the same docker buildx commands the main.py script calls now. This can be very basic, I think. From that point we can start adding the complexity back in a manner that we prefer. That could be focussing on static linking and going native, or keeping the containerization and optimize stuff there, or not.

You probably have a better mental model of what the downsides are of that "stupid" Makefile, but on the other hand, right now, we have an overly complex make.py with a Cargo.toml parser that cannot handle all the features we are currently using, and the last thing I want to do is dig that hole any deeper, by trying to extend that parser.

Sorry if that comes across a bit crass. I think it makes perfect sense from a historic perspective, but the maintenance of such a tool right now is simply too expensive.

Haha no worries, I get your point! Indeed, let's for now just get something simple going and see what we're then actually missing. Let's do this iteratively instead of preemptively :innocent:

Lut99 commented 6 days ago

Lol, I was checking the branches to see if there were any dependabot ones to delete, and check https://github.com/epi-project/brane/tree/new-make?

No idea what I did there, it's also like a 100 commits behind. But maybe I already made a start?

Lut99 commented 6 days ago

(P.S. If irrelevant, feel free to delete the branch)

DanielVoogsgerd commented 3 days ago

I wasn't particularly relevant, this was the zig cross compile experiment, with a written in rust build system. How meta!

Thanks for the headsup though :)

I went with a regular makefile instead. I was intrigued by Cargo-make for a second, but I went with Makefile because it is "simple" and ubiquitous.