flatcar / Flatcar

Flatcar project repository for issue tracking, project documentation, etc.
https://www.flatcar.org/
Apache License 2.0

[RFE] Need a good way to add complex Python apps #1470

Open dragonpaw opened 2 weeks ago

dragonpaw commented 2 weeks ago

Current situation

I'm trying to install SaltStack into Flatcar as I'll be using that to schedule reboots across the fleet, etc.

Looking over the different ways to extend Flatcar, it seemed like using the sdk container and adding the salt version from Gentoo's upstream repo would be the most supported way to do this. (I would kinda prefer to do this with a sysext, but that doesn't really seem plausible for a Python app with a bunch of library dependencies and such, plus it's already packaged in Gentoo.)

I came up with this script so far: https://gist.github.com/dragonpaw/1a22cbcc3ae05ed6ff5aade1dc2b8b29

This gets me a lot of the way there, but then I hit a wall with it complaining that there's no version of gnutls available:

there are no ebuilds to satisfy "app-crypt/gnupg[nls]" for /build/amd64-usr/.

(Other attempts end up in some strange loop where it tries to rebuild systemd, which is definitely not anything I want to touch.)

Impact

Right now, I can't find any way to install salt at all. The sysext stuff is cool, but seems to only work for static compiled binaries that can exist out of step with Flatcar releases.

Ideal future situation

Some supported way to add Salt, and other non-static compiled tools like most python apps.

Implementation options

I don't really have a preference for how this is made to work, be it a better way to create a sysext that involves maybe running something inside a chroot with an overlay, or just changing whatever is in the gentoo system to allow emerge to find a solution to the dependencies. Maybe some way to just boot the system, run the install commands, then clear out the machine state with flatcar_reset or the like.

tormath1 commented 2 weeks ago

Hello, and thanks for your issue. Adding Salt packages and friends to the SDK or the generic image is not ideal: it would increase the size and footprint of the image, and it would add Python to the base image, which we want to avoid because it is not directly related to the container runtime and it increases the attack surface.

The sysext stuff is cool, but seems to only work for static compiled binaries that can exist out of step with Flatcar releases.

This is actually the opposite: sysext images can use dynamic libraries (e.g. the built-in sysexts for Docker and containerd).

We recently added a Python sysext image (available in the next alpha) - you could try to leverage this image and add the missing pieces for Salt in another image in the sysext-bakery (https://github.com/flatcar/sysext-bakery). It's possible to use multiple sysext images on the same system.

dragonpaw commented 2 weeks ago

I'm certainly not asking you to add Salt to the base distro of Flatcar. But rather to make it possible for me to add it to my custom version. Because if I can't use Salt to manage it, I can't use the distro. (Changing the fleet management tool we use is a non-starter for me wanting to bring a new distro like Flatcar into the mix.)

I did see the sysext-bakery repo you pointed me at, but as for it being "the opposite" of being for static binaries, the first sentence of the documentation is that sysext-bakery is for "allowing to extend the base OS with custom (static) binaries. " As for it having a python sysext there, the search for 'python' in that repo turns up 0 code hits, and only 1 issue hit: https://github.com/search?q=repo%3Aflatcar%2Fsysext-bakery%20python&type=code

tormath1 commented 2 weeks ago

Yes, that's true: sysext-bakery provides images with static binaries.

This is actually the opposite: sysext images can use dynamic libraries (e.g. the built-in sysexts for Docker and containerd).

I was talking about sysext in a generic way, with Flatcar there are three ways to provide sysext images:

I'm not super familiar with Salt, but is it not possible to run the agent from a Docker image?

dragonpaw commented 2 weeks ago

Again, I have the problem that what you say is possible is in contradiction to the published documentation.

Sysext, for example, supports 3 file formats:

"System extension images may be provided in the following formats:

Plain directories or btrfs subvolumes containing the OS tree

Disk images with a GPT disk label, following the Discoverable Partitions Specification

Disk images lacking a partition table, with a naked Linux file system (e.g. erofs, squashfs or ext4) "

None of which is a Docker container. So according to the manual for sysext, supplying a Docker container is not an option.
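
For reference, the first and third formats in that list can be sketched roughly as follows. This is only an illustration, not something taken from the Flatcar docs: the extension name `example`, the placeholder script, and the scratch directory are hypothetical, and the `ID=flatcar` / `SYSEXT_LEVEL=1.0` values are assumptions modelled on what sysext-bakery appears to write.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical extension name and scratch build directory, for illustration only.
name="example"
build_root="$(mktemp -d)"
tree="${build_root}/${name}"

# A sysext is an OS tree under /usr plus an extension-release file that
# systemd-sysext compares against the host's os-release.
mkdir -p "${tree}/usr/bin" "${tree}/usr/lib/extension-release.d"
printf '#!/bin/sh\necho hello from %s\n' "${name}" > "${tree}/usr/bin/${name}-hello"
chmod +x "${tree}/usr/bin/${name}-hello"

# Assumed values; adjust to whatever the target OS actually expects.
printf 'ID=flatcar\nSYSEXT_LEVEL=1.0\n' \
    > "${tree}/usr/lib/extension-release.d/extension-release.${name}"

# Optionally pack the plain directory into a squashfs image
# (the "naked Linux file system" format), if mksquashfs is installed.
if command -v mksquashfs >/dev/null 2>&1; then
    mksquashfs "${tree}" "${build_root}/${name}.raw" -all-root -noappend >/dev/null
fi

# Show what ended up in the tree.
find "${tree}" -type f | sort
```

The hard part for something like Salt is not this layout but getting a consistent set of Python packages and C libraries into the `usr/` tree in the first place.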

(You also mention that docker's binaries are compiled dynamically, but the Flatcar documentation says the opposite: "The Docker releases publish static binaries")

But the goal of this ticket isn't to argue about how Docker is compiled, as it's not in Python and irrelevant really to my question of "Big, big Python package w/ many C libraries: How?"

Flatcar's own documentation provides this by way of how to customize the image: https://www.flatcar.org/docs/latest/reference/developer-guides/sdk-modifying-flatcar/#making-changes which says the supported way of adding to the image is to clone gentoo's repo and build from that if what you want is already in gentoo's repo. (Which it is and this is exactly what I want.)

As for the link you provide to the "Flatcar Release Extensions", I have no idea what I am supposed to learn from this link other than somehow, by undocumented means, a zfs plugin sysext is available. There's not even a single sentence for how this is accomplished or how I might accomplish a similar trick. If this is another path to customization of Flatcar, it doesn't seem to be documented.

The link you provide to "extra_sysexts.sh", has no comments, documentation, or anything to help me understand what this file does, how I would call it, how customizing it works, what possible values might go into it, etc. At first glance it seems to be gentoo package names, which puts me back at where I started. I am trying to add the gentoo package 'app-admin/salt' but it will not compile because it needs to install over 50 different packages to do so, including an incompatible version of gnutls.

As for if Salt can run inside a docker container, I have no idea. Salt is a very large tool that makes use of libssl, libzmq, libmsgpack, compiled crypto libraries, and a ton of other C libraries, plus 50+ python packages from pip, and it needs to run with full write access to the filesystem (outside of /usr which is of course readonly in Flatcar.) If I try to compile those c libs, I don't know if they'll end up being kernel dependent, depends what's in their ./configure scripts... But if it's running in docker, how would it trigger an OS reboot? (Which as you'll recall was one of my major use cases for having Salt on the system.) It'd also need /etc, /var, /opt, ... all bind mounted so it could configure the system. But the reboot thing seems like a problem. Maybe if systemd's socket was also bindmounted into the container? This is getting weird, but it might work...

tormath1 commented 2 weeks ago

None of which is a Docker container. So according to the manual for sysext, supplying a Docker container is not an option.

I suppose I was unclear on this point. When I talk about the "built-in sysext for Docker", I mean this:

$ systemd-sysext status
HIERARCHY EXTENSIONS         SINCE
/opt      none               -
/usr      containerd-flatcar Fri 2024-06-14 11:21:27 UTC
          docker-flatcar
          oem-qemu

By default containerd and docker software are installed in Flatcar as systemd sysext images.

(You also mention that docker's binaries are compiled dynamically, but the Flatcar documentation says the opposite: "The Docker releases publish static binaries")

Correct, but the Docker sysext image provided by the sysext-bakery is not the one built into Flatcar (it's an alternative in case you want to use a different Docker version from the one shipped by Flatcar).

There's not even a single sentence for how this is accomplished or how I might accomplish a similar trick. If this is another path to customization of Flatcar, it doesn't seem to be documented.

This is because it is a low-level contribution - there is no straightforward way to do this. You could have a look at these pull requests to see how things are done:

From here I see a few ways:

till commented 2 weeks ago

We use Ansible, which may be similar to Salt. For that, I use an Ansible role to bootstrap pypi and pip.

Maybe pypi could be sysext'd? I tried a few times, but didn't get anywhere. For now we extract pypi to /opt and invoke everything there.

dragonpaw commented 1 week ago

From here I see a few ways:

We tried doing both of these and failed both times to get it to do what we needed. (This was done and failed before I opened the ticket, and was the reason I opened this ticket.)

In the end we just gave up on using Flatcar. All of us combined couldn't come up with a viable alternative to apt install salt-master or emerge app-admin/salt that works. So we went with another OS.

P.S. I feel like no one took the time to really read my initial request. Because I documented in there quite clearly that I'd already tried both of the listed approaches and why both had failed for us. So to just tell me to try to do what I already tried to do, isn't very helpful to me. This is frustrating and leaves a bad experience. I came here for help, not to be told to go read documentation that I'd clearly already read multiple times and followed until I hit a wall.

chewi commented 1 week ago

I'm sorry this didn't work out for you. I think @tormath1 tried his best to help, but between him lacking knowledge on SaltStack and you lacking knowledge on Flatcar, there was a bit of a disconnect. This is a little more complex than the requests we usually get. I did initially find the thread a little difficult to follow myself. I would also say that the sysext stuff is still relatively new, so we haven't quite established a good method of building them for software with complex dependencies. Being a Gentoo developer, I would love to see better use of Gentoo packages in this area. We did an experiment last year to fill this gap using Gentoo Prefix, but it may still need some finessing. I was new to Flatcar then, and I've only recently returned to it, so I haven't yet had a chance to explore this further. I imagine we'll see more requests like this though, so I'd like to address it before long.

chewi commented 1 week ago

I now remember we discussed this on Matrix. My fault entirely for missing your later replies. Maybe it's too late now, but to answer your question on nls, create /etc/portage/profile/package.use.mask/salt in the SDK with this:

net-libs/gnutls -nls
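
As a rough sketch, creating that file could look like the script below. The `write_use_mask` helper and its `ROOT` parameter are hypothetical and exist only so the example can be run against a scratch directory. Inside the SDK container the root would be empty and the command would need root privileges.

```shell
#!/usr/bin/env bash
set -euo pipefail

# write_use_mask ROOT: create the package.use.mask entry under ROOT.
# ROOT is a hypothetical parameter for dry runs; in the SDK it would be "".
write_use_mask() {
    local root="$1"
    local mask_dir="${root}/etc/portage/profile/package.use.mask"
    mkdir -p "${mask_dir}"
    # In package.use.mask, a leading "-" lifts a mask inherited from the profile,
    # so this line makes the nls USE flag available for gnutls again.
    printf '%s\n' 'net-libs/gnutls -nls' > "${mask_dir}/salt"
}

# Dry run against a scratch directory so nothing on the host is touched.
mask_root="$(mktemp -d)"
write_use_mask "${mask_root}"
cat "${mask_root}/etc/portage/profile/package.use.mask/salt"
```

Whether the rest of Salt's dependency chain resolves after this is something only a fresh emerge run in the SDK can tell.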

t-lo commented 1 week ago

Hello @dragonpaw ,

I feel like no one took the time to really read my initial request. Because I documented in there quite clearly that I'd already tried both of the listed approaches and why both had failed for us. So to just tell me to try to do what I already tried to do, isn't very helpful to me. This is frustrating and leaves a bad experience. I came here for help, not to be told to go read documentation that I'd clearly already read multiple times and followed until I hit a wall.

Please note that, in accordance with our mission statement, Flatcar is an immutable minimal footprint distro for running container workloads. While we understand this to be a rather opinionated approach, it implies that having a lean base image that is not runtime extensible is by design, and adding complex tools to even a customised base image can be challenging. That said, we will happily support you if you're still interested - but given the complexity of your request please don't expect it to be a low-hanging fruit.

Let's not discuss packaging and shipping at this point (this would include sysext, prefix builds, and similar options) as that can be easily solved later (based on a custom base image). For now we should focus on successfully building a custom base image that ships Salt. For this, let's follow the documented process for adding new packages to the base image; this also works for complex packages like Salt.

A few comments on the current approach in https://gist.github.com/dragonpaw/1a22cbcc3ae05ed6ff5aade1dc2b8b29:

Carefully following the "adding a package" section in our developer docs should get you there; please let us know if you're still interested in distro work or if we should rather close this issue.

chewi commented 1 week ago

To be fair, this specific case of a USE flag being masked in the profile is not documented, and it is not something I would expect even most Gentoo users to know how to handle.

till commented 1 week ago

@tormath1 where exactly is the python sysext? It seems relevant to my interests. 😁 You mentioned it but I don't see it in the list or in the bakery repo.

tormath1 commented 1 week ago

@till it's about to be released next week in the Alpha channel - you can try it with the nightly build in the meantime: https://bincache.flatcar-linux.net/images/amd64/4006.0.0+nightly-20240619-2100/ Same as ZFS, you need to provision the instance with python in /etc/flatcar/enabled-sysext.conf to pull the image at boot. Let us know how it goes!
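
A minimal sketch of what ends up in that file, assuming it holds one extension name per line. The `enable_sysext` helper and its `ROOT` parameter are hypothetical, added only so this can be tried against a scratch directory. How the file actually gets onto the instance (for example through the provisioning config) is left open here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# enable_sysext ROOT NAME: list NAME in ROOT/etc/flatcar/enabled-sysext.conf
# unless it is already there. ROOT is a hypothetical parameter for dry runs.
# Assumption: the file holds one extension name per line.
enable_sysext() {
    local root="$1" name="$2"
    local conf="${root}/etc/flatcar/enabled-sysext.conf"
    mkdir -p "$(dirname "${conf}")"
    touch "${conf}"
    grep -qxF "${name}" "${conf}" || printf '%s\n' "${name}" >> "${conf}"
}

# Dry run against a scratch directory.
conf_root="$(mktemp -d)"
enable_sysext "${conf_root}" python
enable_sysext "${conf_root}" python   # second call leaves the file unchanged
cat "${conf_root}/etc/flatcar/enabled-sysext.conf"
```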

till commented 1 week ago

Can I add that file and reboot too? Or is this only for new instances?

dragonpaw commented 1 week ago

Please note that, in accordance with our mission statement, Flatcar is an immutable minimal footprint distro for running container workloads. While we understand this to be a rather opinionated approach, it implies that having a lean base image that is not runtime extensible is by design, and adding complex tools to even a customised base image can be challenging. That said, we will happily support you if you're still interested - but given the complexity of your request please don't expect it to be a low-hanging fruit.

Yes, and I read that documentation. Multiple times. I fully understand the idea of keeping base as small as possible. And yes I fully agree with the idea that what I am doing is not something many/most will want in their base image. But it is what I need. Schooling me on the goals of the project will not help with the fact that I have a real and inflexible business need, and was attempting to use this cool project to meet that need. It is a need however that I cannot just walk away from and say "Oh, nevermind managing the fleet..." Salt is something that gets used at scale to manage very large fleets, and companies that use it need it. (First company I introduced it at was back in ~2003 and we managed over 200,000 servers of many different Unix (and Linux) flavors with it.)

So please, believe me when I say that I read all the documentation, and I fully, completely, 100%, utterly and without exception understand both the published documents, and the goal you set out to create. Nothing I've said in any of my long replies has, I believe, ever indicated that I've simply not read the docs, or didn't understand what a containerized distro is about.

But I need Salt. So it's find a way to do that, or use a different distro. There is no possible third outcome to my situation.

Let's not discuss packaging and shipping at this point (this would include sysext, prefix builds, and similar options) as that can be easily solved later (based on a custom base image). For now we should focus on successfully building a custom base image that ships Salt. For this, let's follow the documented process for adding new packages to the base image; this also works for complex packages like Salt.

Yes, this is exactly what I did. And it didn't work. And that's why I opened the ticket. Like I said, I fully and completely understand the documentation on how to extend the base as it is written, and I followed those docs literally and completely. More than once to be sure it wasn't an error on my part. So again, please, please, please believe me that I read the documentation and no amount of pointing me at links to the documentation as it is today will achieve the need I have.

A few comments on the current approach in https://gist.github.com/dragonpaw/1a22cbcc3ae05ed6ff5aade1dc2b8b29:

I copied the class only when, as indicated in the written instructions, the class wasn't resolving properly. In this case, app-admin/salt references a newer version of Python than is in the Python class files, and so it was impossible to progress without this copy statement.

As outlined in the developer documentation, please only add the target package first (app-admin/salt in our case) and successively add only eclasses actually required, resolving [XXXXX].eclass could not be found by inherit() errors individually. This will prevent us from pulling in unrelated dependencies which would unnecessarily complicate our work on Salt.

Yes, this is exactly what I did.

  • The error message you've raised in the issue summary, there are no ebuilds to satisfy "app-crypt/gnupg[nls]" for /build/amd64-usr/., is discussed in the developer docs (as emerge: there are no ebuilds to satisfy "<group>/<package>:=" for /build/amd64-usr/.) and resolving it (by carefully and successively adding all dependencies) is a regular part of adding complex packages.

Yes, I read that documentation too. The issue is that there is already gnupg in the base image. But in some way that your documentation doesn't cover, it is incompatible with the version that the dependencies of salt need. I am not a gentoo dev, but from what I could tell, the issue is that the bundled version of gnupg was built without nls support, and the salt dependencies somehow trace down to needing one with nls. At which point trying to add that triggers a massive chain that tries to recompile systemd and everything else. As I believe @chewi explains, this has to do with decisions made to mask nls from Flatcar, and the solving of this issue is most certainly beyond both the documentation, and my abilities. And hence, I am at the wall that brought me to open the ticket.

Carefully following the "adding a package" section in our developer docs should get you there; please let us know if you're still interested in distro work or if we should rather close this issue.

Yes, carefully following those instructions did get me here. I didn't just add 40 extra copy statements to the script by picking random package names from the air... All of those packages were discovered, one at a time, by me trying to run the build, finding the next dependency that was missing, adding that 1, repeat, repeat, repeat.

That's why I'm here, opening this ticket, to explain that there's a problem that cannot be solved by any number of people telling me to go read the documentation. I realize no one seems to believe me that I've read the documentation, but I did. Over and over. I followed the instructions to the letter, more than once. No amount of links to the same docs I already followed is going to get this issue resolved.

And alas, no, I am no longer interested in using Flatcar. Multiple times in this ticket I explained that I did indeed both read and follow all documentation on how to extend flatcar and followed every possible route to get my need met before resorting to filing this ticket. And while everyone did reply quite politely, they didn't seem willing to believe that I completely and 100% understood what a small base image meant and that I really, truly, had indeed actually read the documentation. I've gotten several very polite people pointing me again and again at the documentation I had already exhausted before I came here. And while I appreciate the politeness of everyone who found various ways to tell me to go read the documentation, ultimately that wasn't helpful. Salt is still uncompiled. And I'm waving the white flag and moving on.

I suspect like most projects, y'all get a lot of issues from folks who don't read the docs. So it can become very quick to assume the user hasn't and jump to conclusions and start cut'n'pasting links at them. But, sometimes, we really have read them.

dragonpaw commented 1 week ago

P.S. If you wanted to make building sysext's easier on yourself and everyone else, I suspect something like this would help a LOT of use cases:

#!/usr/bin/bash 

SYSEXT=$1

# Start qemu with overlay fs to save changes
qemu $LOTS_OF_OPTIONS -o backing_file=$SYSEXT.raw,backing_fmt=raw -f qcow2 flatcar.cow
# Let the user install things in the usual way, with pip install, go get, cargo, etc, whatever. /usr appears writable within the VM, but is actually writing to the overlay's raw file instead
shutdown

# Now just convert raw to cramfs_or_whatever
...

At which point you've uncoupled the logic for installing whatever, from the automation of capturing the changes those commands make, and you can use this for any kind of installer script. Pass the installer script name as an argument to the wrapper above, and it would stamp out sysexts fairly well I suspect.

Or maybe not. Was just a thought I had that I never got around to trying.

t-lo commented 1 week ago

Hi @dragonpaw,

Yes, and I read that documentation. Multiple times. I fully understand the idea of keeping base as small as possible. And yes I fully agree with the idea that what I am doing is not something many/most will want in their base image. But it is what I need.

That's fine - all I'm saying is that we're not talking about apt-get install salt level of difficulty, and this is mostly about expectation management. In order to reach our goal of shipping Salt in a custom image, we need to

What we're doing here is slightly outside of the regular scope of the distro - as stated, we focus on minimal footprint, and python alone is a deal-breaker here - so there is no explicit documentation for this. That said, several maintainers tuned in here to help, as our challenges with this work are not unlike the objectives regular distro maintainers pursue to keep Flatcar up to date (though usually focusing on different packages than Python or Python apps).

Yes, this is exactly what I did. And it didn't work. And that's why I opened the ticket.

I feel we still have a disconnect regarding the perceived level of difficulty of our chosen task. But I also believe we can overcome it, as stated before, if you're still interested.

I copied the class only when, as indicated in the written instructions, the class wasn't resolving properly. In this case, app-admin/salt references a newer version of Python than is in the Python class files, and so it was impossible to progress without this copy statement.

Wildcard copies will create more issues down the road because eclasses unrelated to the Salt dependencies introduce more dependencies on other eclasses. In the case at hand, we learned that we will need to ship an upgraded Python version to make Salt work - that's fine. Let's pick the new version, and iteratively add dependencies.

Yes, I read that documentation too. The issue is that there is already gnupg in the base image. But in some way that your documentation doesn't cover, it is incompatible with the version that the dependencies of salt need.

There is no documentation about this because it's distro maintenance. You read ebuild files, try emerge builds, fix errors, and go on. Once again: we're developing slightly outside of the distro's scope. Putting in this level of effort should be expected. And we're here to support you.

In the case at hand, we either ship a better version of gnutls, with NLS enabled (which will grow the size of your base image and pull in even more dependencies) or we disable NLS as suggested by Chewi (which is the path forward I too would recommend). Since this initiative is led by you, it's your choice to make.

Yes, carefully following those instructions did get me here. I didn't just add 40 extra copy statements to the script by picking random package names from the air... All of those packages were discovered, one at a time, by me trying to run the build, finding the next dependency that was missing, adding that 1, repeat, repeat, repeat. [...] And alas, no, I am no longer interested in using Flatcar. Multiple times in this ticket I explained that I did indeed both read and follow all documentation on how to extend flatcar and followed every possible route to get my need met before resorting to filing this ticket.

There's not much I can say to this except "welcome to distro maintenance". Because that's what our chosen task entails. Which, if you ask me, is fine. You're chatting to Flatcar distro maintainers here, we feel your pain because what we do here is core maintenance work (albeit, as stated, we usually focus on different packages). It's been rather hard to tell from your description what exactly you tried, how far you got, and what stopped you (except for the NLS issue, which by now should be resolved). It's obvious that it was a frustrating experience for you, and we deeply understand because, as stated, as distro maintainers we do the exact same work - but we do it as a team.

So yes, it can be very frustrating to work alone on this - but as you might have noticed from this ticket, we'll happily work more closely with you to help you succeed if you choose to pursue this further. The only thing we can offer at this point is to look forward - the NLS issue is solved, let's see how far this can be pushed now. If you're still interested.