includeos / IncludeOS

A minimal, resource efficient unikernel for cloud services
https://www.includeos.org
Apache License 2.0

nix: create a new scope for IncludeOS packages #2236

Closed bjornfor closed 1 month ago

bjornfor commented 1 month ago

It's Eurovision Song Contest tonight, so I had to hack a bit in IncludeOS :-)

Commits:

alfreb commented 1 month ago

Nice! I need to read up on both overlays and scopes - but it looks like it has the effect of insulating the IncludeOS dependencies from picking up different package sets / stdenvs. So we will be less prone to mistakes when we add new dependencies.
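For readers new to the two concepts, here is a minimal sketch of how an overlay can define a scope, assuming a hypothetical `overlay.nix` (the file name, the `example-dep` package, and its contents are illustrative and not taken from the PR). `lib.makeScope` and `newScope` are standard nixpkgs functions; the stdenv is the one named later in this thread.

```nix
# overlay.nix: hypothetical file name and contents, not the PR's actual code
final: prev: {
  pkgsIncludeOS = final.lib.makeScope final.newScope (self: {
    # Packages called through self.callPackage below receive this stdenv
    # instead of the default nixpkgs stdenv.
    stdenv = final.pkgsStatic.llvmPackages_16.libcxxStdenv;

    # Placeholder dependency, defined inline so the sketch is self-contained.
    example-dep = self.callPackage
      ({ stdenv }: stdenv.mkDerivation {
        pname = "example-dep";
        version = "0.0.0";
        dontUnpack = true;
        installPhase = "mkdir -p $out";
      })
      { };
  });
}
```

Because `self.callPackage` resolves arguments from the scope first and falls back to the rest of nixpkgs only afterwards, a dependency added this way should not silently pick up a different stdenv.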

If I now want to build a bootable binary that links against IncludeOS I would want that to be built with the same stdenv as the IncludeOS static libs. Would I use this overlay for that and do stdenv = pkgsIncludeOS.stdenv in the bootable's default.nix?

bjornfor commented 1 month ago

> If I now want to build a bootable binary that links against IncludeOS I would want that to be built with the same stdenv as the IncludeOS static libs. Would I use this overlay for that and do stdenv = pkgsIncludeOS.stdenv in the bootable's default.nix?

Yes, I think so. But I'm not sure what the end goal is wrt. musl-includeos. Because that's not currently part of pkgsIncludeOS.stdenv.

Also, I would continue to add packages in the overlay now, not default.nix.
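A rough sketch of what such a bootable's default.nix could look like, assuming an overlay file that defines pkgsIncludeOS is available at the hypothetical path `./overlay.nix`; the package name `my-bootable` is also made up. As noted above, this does not yet bring in musl-includeos.

```nix
# default.nix of a hypothetical bootable; ./overlay.nix is an assumed path
{ pkgs ? import <nixpkgs> { overlays = [ (import ./overlay.nix) ]; } }:

# Build with the same stdenv as the IncludeOS static libs.
pkgs.pkgsIncludeOS.stdenv.mkDerivation {
  pname = "my-bootable"; # hypothetical name
  version = "0.0.0";
  src = ./.;

  # The IncludeOS libraries the bootable links against.
  buildInputs = [ pkgs.pkgsIncludeOS.includeos ];
}
```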

alfreb commented 1 month ago

> > If I now want to build a bootable binary that links against IncludeOS I would want that to be built with the same stdenv as the IncludeOS static libs. Would I use this overlay for that and do stdenv = pkgsIncludeOS.stdenv in the bootable's default.nix?

> Yes, I think so. But I'm not sure what the end goal is wrt. musl-includeos. Because that's not currently part of pkgsIncludeOS.stdenv.

> Also, I would continue to add packages in the overlay now, not default.nix.

Sounds good! The end goal with musl-includeos is that to produce a bootable binary with IncludeOS you have to link against musl-includeos, not vanilla musl from pkgsStatic. This is because we patch musl so that it calls IncludeOS system calls as plain functions, instead of doing syscall and sysret instructions directly, which is what musl does by default (this is the x86 case - other arches probably use other instructions). syscall switches into ring 0, and from there invokes the kernel. sysret switches "back" to ring 3 - which is a problem for IncludeOS, which is always in ring 0. So any sysret instruction will cause a kernel panic. It's also unnecessary overhead and instruction cache thrashing; IncludeOS is just another library and system calls are just function calls.

For reference, the IncludeOS system calls are implemented in the poorly named directory IncludeOS/src/musl/ - it should probably be called musl_backend. Going through these is really not the best use of IncludeOS; as you will see from the implementations (some are stubs), a native C++ application would be better off using the native IncludeOS C++ API directly. But we do this to become somewhat Linux compatible and to be able to use musl (we used to use the more limited newlib from Red Hat).

So yes, it makes perfect sense that we now only add packages to pkgsIncludeOS for anything that needs to be a part of the bootable binary in the end (let's call such applications "IncludeOS bootables" or just bootables). But the intention is that those bootables will be created by users as standalone applications, which should depend on IncludeOS, but not necessarily be a part of it. And these will definitely also have to link against musl-includeos.

Another snag is that bootables will need to be linked with a linker script, calling ld directly (lld didn't support linker scripts when I last tried - and you couldn't use linker scripts via the clang frontend). I've gotten this to work (I think) in a hack, by having a bootable's default.nix expose an environment variable with the path to musl-includeos, which cmake picks up and uses as it explicitly sets up the linking order. I'll try to get a PR with that up for reference. It's very hacky, so I'm sure we can develop something much nicer over time. Maybe an IncludeOS stdenv inside pkgsIncludeOS that uses musl-includeos. But I don't see how we can get away from using a linker script, so it's not going to be entirely straightforward to go that route either.
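The environment-variable hack described above could look roughly like this. It is only a sketch: the variable name `INCLUDEOS_MUSL_PATH` and the package name are invented for illustration, and the function is assumed to be called with a callPackage that can supply `includeos` and `musl-includeos`.

```nix
# Hypothetical package function for a bootable, meant for callPackage
{ stdenv, cmake, includeos, musl-includeos }:

stdenv.mkDerivation {
  pname = "example-bootable"; # hypothetical name
  version = "0.0.0";
  src = ./.;

  nativeBuildInputs = [ cmake ]; # build tool, runs on the build machine
  buildInputs = [ includeos ];   # linked into the final image

  # Extra string attributes of mkDerivation are exported to the builder as
  # environment variables, so the build can read the store path from here.
  # The variable name is an assumption made for this sketch.
  INCLUDEOS_MUSL_PATH = "${musl-includeos}";
}
```

On the CMake side the value would then be available as `$ENV{INCLUDEOS_MUSL_PATH}` when the link order is set up.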

alfreb commented 1 month ago

@bjornfor see #2237 for an example bootable that links. We're still some ways away from booting - for one we need to reintegrate the vmbuild tooling, which is not built by nix. But at least this shows how we might get to a fully linked binary that might boot when getting a bootloader attached.

This example is in tree, which I think we should have and build by default in order to make sure we don't break dependencies all the way to a fully linked and bootable binary. But the default case should be that applications are developed for IncludeOS in separation, as for instance the load balancer https://github.com/includeos/microlb or the firewall https://github.com/includeos/nacl . All suggestions welcome. And if @MagnusS agrees to the approach in this PR I'm happy to try to integrate my changes on top of yours. The question is then if we should add the example inside the overlay or outside. I think it's ok to have some examples in-tree and others out of tree, and that there are two different approaches to using nix for those examples in and out of tree.

bjornfor commented 1 month ago

Ok, so I guess pkgsIncludeOS should have the IncludeOS musl-based stdenv, not pkgsStatic.llvmPackages_16.libcxxStdenv?

And which of the IncludeOS deps actually need to be built statically? I found two deps that were needlessly built from pkgsStatic that I fixed in this PR.

I wonder if the end goal is pkgsIncludeOS having two attributes:

  • stdenv -- a custom musl-based IncludeOS stdenv
  • IncludeOS example app

The deps get built with pkgsStatic, including musl-includeos. And third party apps use pkgsIncludeOS.stdenv.mkDerivation to build their apps.

alfreb commented 1 month ago

> Ok, so I guess pkgsIncludeOS should have the IncludeOS musl-based stdenv, not pkgsStatic.llvmPackages_16.libcxxStdenv?

> And which of the IncludeOS deps actually need to be built statically? I found two deps that were needlessly built from pkgsStatic that I fixed in this PR.

> I wonder if the end goal is pkgsIncludeOS having two attributes:

>   • stdenv -- a custom musl-based IncludeOS stdenv
>   • IncludeOS example app

> The deps get built with pkgsStatic, including musl-includeos. And third party apps use pkgsIncludeOS.stdenv.mkDerivation to build their apps.

I'm not sure we need to rebuild all the dependencies with musl-includeos. It should be enough that they are built against the same version of musl, and then that we inject musl-includeos only for linking. We don't patch the musl interface, only its system call layer, which usually faces Linux. I think it would save us a lot in build times and cache space if we could keep that as it is. We picked pkgsStatic.llvmPackages_16.libcxxStdenv because it already uses musl - musl-includeos is an older version, but that should be fixed. If we could ensure that we kept those two musls at the same version I think we might be ok.

To answer your question; there is no dynamic linking happening in IncludeOS (inside a vm or on bare metal). Every library used by the third party application has to be statically linked. But not the tools used for building, such as cmake, i.e. the nativeBuildInputs. IncludeOS is a single application OS and will only run a single program. The idea is simply to make an arbitrary program bootable as a VM (it also boots on bare metal x86, but we have almost no device drivers, so you can't do a whole lot with that as it stands).

So, should we merge this and then I integrate the example into the overlay, similar to how the deps are built?

bjornfor commented 1 month ago

> I'm not sure we need to rebuild all the dependencies with musl-includeos.

Ok, then I think this PR is heading in the right direction. In the future we should move the deps out of pkgsIncludeOS, or introduce a new package set with the final stdenv and example app.

alfreb commented 1 month ago

> > I'm not sure we need to rebuild all the dependencies with musl-includeos.

> Ok, then I think this PR is heading in the right direction. In the future we should move the deps out of pkgsIncludeOS, or introduce a new package set with the final stdenv and example app.

Right, initially I thought part of the point of having the dependencies in pkgsIncludeOS was to ensure that others who want to use e.g. uzlib and GSL use compatible versions. But maybe your point is that it's cleaner if users only use what's upstream and that we don't pollute the package set with nonstandard stuff. I guess that makes sense. Botan, for example, is provided in nixpkgs already, but we have to use a pinned version because it takes work on our TLS implementation to adopt the recent changes. So we'll be behind. Other users might want to roll their own TLS connections and can then decide to do that using an unpinned botan. In that case we'd want pkgsIncludeOS to be packages we know can work with our libc and libc++, which should be anything built with our stdenv.

MagnusS commented 1 month ago

Thanks @bjornfor ! I think this looks like a better approach to make sure that we stay in the same environment for all packages. I guess this may also make it easier to write packages/applications that depend on includeos' build environment. I merged #2237 so we can test against what we have so far. This PR will need an update though as it introduced a conflict.

MagnusS commented 1 month ago

To rebase against master we have to move the example somewhere else - e.g. into example/default.nix. We could then build it via callPackage, but it would also be nice if we could just depend on IncludeOS from the example itself.

Do you think it would be possible with the approach from this PR to e.g. use pkgs ? import ../default.nix { } to build the example? I tried it this morning, but as we only return pkgsIncludeOS.includeos from ../default.nix I guess we don't have the full environment.

MagnusS commented 1 month ago

I submitted #2238 to fix the conflicts and add support for vmbuild