**spietras** opened this issue 1 year ago
Hey there, original author of the patch which broke your use case here; apologies for that. We tested it over nixpkgs, but no one had use cases like yours, so I am discovering it now.

My intuition is the following: `make-disk-image` is responsible for that assert error. It has no way to know you want a GPT partition table, because you create it in a string fashion in the initrd, and `useDefaultFilesystems = false` makes the infrastructure pass `none` as the partition table.
Obviously, the semantic question now is: should `useDefaultFilesystems = false;` still take care of the default partition table?

If the answer is yes, then we probably want a `useDefaultPartitionTable` option in the future, and we should make it clear that the partition table setup is handled by the QEMU test infrastructure; you only have to deal with partitions.

If the answer is no, I am not really certain what the best way is. Clearly, the user would need to provide the system image in that case and fill out all the blanks for the exact use case they want to have. It might be much more complicated, because some of those blanks rely on knowledge of things like closure information. But it would provide maximum flexibility for those use cases.
https://github.com/NixOS/nixpkgs/pull/228734 is the implementation of the "the answer is yes" branch of my explanation.
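With that PR, I would expect a configuration like yours to keep the default table while still defining its own filesystems; a rough sketch (the option name comes from the PR, the exact semantics here are my assumption):

```nix
virtualisation = {
  useDefaultFilesystems = false;
  # Hypothetical usage: let the QEMU test infrastructure still create
  # the default partition table, while the filesystems stay user-defined.
  useDefaultPartitionTable = true;
  fileSystems."/" = {
    device = "/dev/disk/by-label/root";
    fsType = "ext4";
  };
};
```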
Tested with:

```nix
{
  inputs.nixpkgs = {
    type = "github";
    owner = "NixOS";
    repo = "nixpkgs";
    rev = "84966c085e2b8fe55748959f4f2fc5957f937d28";
    # it works with the commit below
    #rev = "13ea5dc163f5abde5ed5954b75179eee7c420a8e";
  };

  outputs = inputs: {
    nixosConfigurations = {
      foo = let
        system = "x86_64-linux";
        pkgs = import inputs.nixpkgs {inherit system;};
      in
        inputs.nixpkgs.lib.nixosSystem {
          system = system;
          modules = [
            {
              boot.loader.systemd-boot.enable = true;
              system.stateVersion = "23.05";
              users.users.root.password = "";

              virtualisation.vmVariantWithBootLoader = {
                boot.initrd.postDeviceCommands = ''
                  ${pkgs.parted}/bin/parted --script /dev/vda -- mklabel gpt mkpart root 0% 100%
                  ${pkgs.e2fsprogs}/bin/mkfs.ext4 -L root /dev/disk/by-partlabel/root
                '';

                virtualisation = {
                  fileSystems."/" = {
                    device = "/dev/disk/by-label/root";
                    fsType = "ext4";
                    neededForBoot = true;
                  };
                  useDefaultFilesystems = false;
                  useEFIBoot = true;
                };
              };
            }
          ];
        };
    };
  };
}
```
This is probably going too far, but if I were to take a shot at redesigning the whole virtual machine story around a NixOS configuration from an ease-of-use point of view, something like this would give users a lot of flexibility:
```nix
{
  outputs = inputs: {
    nixosConfigurations = {
      foo = inputs.nixpkgs.lib.nixosSystem {
        system = "x86_64-linux";

        # Target system configuration
        systemModules = [
          (
            # Pass only the system configuration as an input
            {config, ...}: {
              # Concerning virtualisation possibilities inside the target system
              virtualisation = {
                docker.enable = true;
              };
            }
          )
        ];

        # Virtual machine variant of the system for outside usage
        vmModules = [
          # You can override the system configuration here
          (
            # Pass both system and vm configurations as inputs
            {
              systemConfig,
              vmConfig,
              ...
            }: {
              networking.hostName = "${systemConfig.networking.hostName}-vm";
            }
          )

          # And you can define the virtual machine configuration inside the 'vm' attribute
          {
            vm = {
              memorySize = 2048;

              # Each disk image can be created with qemu-img.
              # Then parted can be used to create partitions on the image.
              # Then we can mount the image inside a temporary virtual machine
              # and run a script to set up the filesystems.
              # And finally we can attach the images as virtio drives in qemu
              # for the target virtual machine.
              diskImages = [
                # Position in the list is the drive index, so this is /dev/vda
                # (the next one would be /dev/vdb)
                {
                  # Options to use in 'qemu-img create'
                  size = "1G";
                  format = "qcow2";
                  options = {
                    preallocation = "full";
                  };

                  # This is the path on the host system.
                  # It is persisted between runs, but if the configuration
                  # changes, it will be replaced with a new image.
                  file = "disk.qcow2";

                  # As in 'mklabel' in parted
                  partitionTable = "gpt";

                  # Each one translates to a 'mkpart' in parted.
                  # They will be executed in order, so each one gets its own
                  # predictable number.
                  partitions = [
                    # We know this is /dev/vda1
                    # (and also /dev/disk/by-partlabel/boot, since we're using GPT)
                    {
                      name = "boot";
                      filesystem = "fat32";
                      start = "1MB";
                      end = "512MB";
                      flags = ["boot" "esp"];
                    }
                    # And this is /dev/vda2
                    # (and also /dev/disk/by-partlabel/root, since we're using GPT)
                    {
                      name = "root";
                      filesystem = "ext4";
                      start = "512MB";
                      end = "100%";
                    }
                  ];
                }
              ];

              # Additionally, make it possible to attach any host drives
              hostDrives = [
                # CD-ROM drive from the host
                {
                  # Translates to '-drive file=/dev/cdrom,media=cdrom' in qemu
                  file = "/dev/cdrom";
                  media = "cdrom";
                }
              ];

              # Run a script inside a temporary virtual machine to set up the
              # filesystems. I guess we can use vmTools.runInLinuxVM for this
              # and attach the disk images as virtio drives (same as for the
              # target virtual machine).
              # We could use a more static configuration instead of a script,
              # but it's hard to cover all cases (e.g. ZFS).
              # Using a script, we can just run whatever commands we want.
              filesystemsSetup = ''
                mkfs.fat -F 32 -n boot /dev/vda1 # or /dev/disk/by-partlabel/boot
                mkfs.ext4 -L root /dev/vda2      # or /dev/disk/by-partlabel/root
              '';

              # I don't know much about bootloaders,
              # but I guess this is enough info to install one.
              bootDevice = "/dev/vda";
              bootPartition = "/dev/vda1";
            };
          }
        ];
      };
    };
  };
}
```
And then just be able to use this to run the virtual machine:

```
nix run .#nixosConfigurations.foo.vm
```
This is not strictly necessary, but the reason I moved the virtual machine configuration outside of the usual modules is so that you could run `#nixosConfigurations.foo.vm` instead of `#nixosConfigurations.foo.config.system.build.vm`. (And I think that should go deeper than just `vm`: why is `system.build` inside `config`? The word `config` implies you can find static options there which are the input to some process, and what is in `system.build` is obviously an output of that process.)

This would require a lot of changes, though, and I probably overlooked many issues with this approach. But for sure, there needs to be a way to give users more flexibility with partitioning the disk(s).
For now, I just dropped the bootloader and started to use `vmVariant` instead of `vmVariantWithBootLoader`. The system boots directly and I can set up the partitions and filesystems in the initrd. I can't test the bootloader this way, but it's not that much of a loss for me.
There's a lot to unpack in your message; apologies if I don't answer everything.

Firstly, I don't use flakes, and they are experimental, so your top-level API is really specific to `nixosSystem`, I believe; I think you'd have to report that separately to get it this way.
> Virtual machine configuration is not a part of the system configuration, it's on another layer.

This is complicated: do we want to have `makeItAVM :: NixOSConfig -> NixOSConfig` or `makeItAVM :: NixOSConfig -> VMConfig`, etc.? Modelling this properly is still an open problem IMHO.
> `diskImages`, `hostDrives`

These first two are list-driven APIs, which is unfortunately a bad idea IME. NixOS modules can perform spooky interaction at a distance: you cannot predict the order of your disks, and therefore you cannot have reliable tests (or anything else). I am in the process of killing and deprecating such APIs; we should rather use attrset-driven APIs:

```nix
diskImages."root" = ...
partitions."vda" = ...
```

Look at how https://github.com/nix-community/disko works, for example.
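For illustration, an attrset-driven variant of your sketch could look like this (all option names here are hypothetical, reusing the vocabulary from your proposal):

```nix
{
  # Disks are referenced by name rather than by list position,
  # so module merging cannot silently reorder them.
  vm.diskImages."main" = {
    size = "1G";
    format = "qcow2";
    partitionTable = "gpt";
    partitions."root" = {
      filesystem = "ext4";
      start = "1MB";
      end = "100%";
    };
  };
}
```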
> `filesystemsSetup`

Ideally, we should avoid string-driven APIs, because they do not carry any structured information that internal libraries could use to do smart things. I'd much prefer more structured things: why not plug disko into the filesystems and let it drive the partitioning/mounting correctly, for example?
> For now, I just dropped the bootloader and started to use `vmVariant` instead of `vmVariantWithBootLoader`. The system boots directly and I can set up the partitions and filesystems in the initrd. I can't test the bootloader this way, but it's not that much of a loss for me.

Did you try my PR? If that's enough for you, and you are not interested in it, let's close this issue, because it's not actionable anymore IMHO.
> This is complicated: do we want to have `makeItAVM :: NixOSConfig -> NixOSConfig` or `makeItAVM :: NixOSConfig -> VMConfig`, etc.? Modelling this properly is still an open problem IMHO.
I agree. But the way it works now just feels kinda messy to me.
> NixOS modules can perform spooky interaction at a distance: you cannot predict the order of your disks, and therefore you cannot have reliable tests (or anything else). I am in the process of killing and deprecating such APIs; we should rather use attrset-driven APIs:
>
> ```nix
> diskImages."root" = ...
> partitions."vda" = ...
> ```
I would be happy with whatever works. I just tried to make a data model that enforces the order of items. When using `virtio` drives in `qemu`, we can't specify what a device will be called; it's based on the order, so with ordered items we always know which devices they map to (e.g. the first item will be `/dev/vda`). I guess that with attrsets there would be no way to enforce that a given disk ends up at `/dev/vda`; it would be random, or based on some sorting of the keys.
> I'd much prefer more structured things: why not plug disko into the filesystems and let it drive the partitioning/mounting correctly, for example?
I have never used `disko`, but it seems to be able to deal with a lot of different configurations, so if it's somehow possible to use it to set up partitions and filesystems for the VM image, that would be awesome. However, I'm afraid it assumes that the user knows beforehand what the devices are called (viewing them from inside a NixOS installer), and with virtual machines there are no pre-existing devices; they need to be created based on our configuration. And even if declarative management can deal with a lot of situations, there will always be some exceptions. So I think there should always be a possibility for the user to do things manually, their own way. "Simple things should be simple, complex things should be possible."
> Did you try my PR? If that's enough for you, and you are not interested in it, let's close this issue, because it's not actionable anymore IMHO.
I tried it, but it only boots me to the `UEFI Interactive Shell` with `useDefaultPartitionTable = true`. One way or another, though, I can't really use the default partition layout, because my target layout is different. And with `useDefaultPartitionTable = false` it's the same story as before.
I guess we can close this for now. I'm sure someone will pick it up in the future because it's very useful to be able to reproduce your custom system as close to 1:1 as possible in the virtual machine. But it seems that a lot needs to be changed, discussed and agreed upon to make it possible.
> > This is complicated: do we want to have `makeItAVM :: NixOSConfig -> NixOSConfig` or `makeItAVM :: NixOSConfig -> VMConfig`, etc.? Modelling this properly is still an open problem IMHO.
>
> I agree. But the way it works now just feels kinda messy to me.
Unfortunately, untangling all of this requires time.
> > NixOS modules can perform spooky interaction at a distance: you cannot predict the order of your disks, and therefore you cannot have reliable tests (or anything else). I am in the process of killing and deprecating such APIs; we should rather use attrset-driven APIs:
> >
> > ```nix
> > diskImages."root" = ...
> > partitions."vda" = ...
> > ```
>
> I would be happy with whatever works. I just tried to make a data model that enforces the order of items. When using `virtio` drives in `qemu`, we can't specify what a device will be called; it's based on the order, so with ordered items we always know which devices they map to (e.g. the first item will be `/dev/vda`). I guess that with attrsets there would be no way to enforce that a given disk ends up at `/dev/vda`; it would be random, or based on some sorting of the keys.
It's not desirable to depend on the ordering of your disks: when you have multiple layers of abstraction, something can insert a disk before or after yours, and all of your tests that depend on vda, vdb, vdc, vdd, vde need to be shifted by one disk or more. Unless you have a compelling case, I have never found any use for ordering disks; you want to give them names and reference them through those names.
> > I'd much prefer more structured things: why not plug disko into the filesystems and let it drive the partitioning/mounting correctly, for example?
>
> I have never used `disko`, but it seems to be able to deal with a lot of different configurations, so if it's somehow possible to use it to set up partitions and filesystems for the VM image, that would be awesome. However, I'm afraid it assumes that the user knows beforehand what the devices are called (viewing them from inside a NixOS installer), and with virtual machines there are no pre-existing devices; they need to be created based on our configuration. And even if declarative management can deal with a lot of situations, there will always be some exceptions. So I think there should always be a possibility for the user to do things manually, their own way. "Simple things should be simple, complex things should be possible."
>
> > Did you try my PR? If that's enough for you, and you are not interested in it, let's close this issue, because it's not actionable anymore IMHO.
>
> I tried it, but it only boots me to the `UEFI Interactive Shell` with `useDefaultPartitionTable = true`. One way or another, though, I can't really use the default partition layout, because my target layout is different. And with `useDefaultPartitionTable = false` it's the same story as before.
Can you detail your use case more? Your example code mentioned a GPT partition table and UEFI boot. Are you trying to test a legacy protective MBR partition table, or something like that?
> I guess we can close this for now. I'm sure someone will pick it up in the future because it's very useful to be able to reproduce your custom system as close to 1:1 as possible in the virtual machine. But it seems that a lot needs to be changed, discussed and agreed upon to make it possible.
I would appreciate any help anyway. :)
> It's not desirable to depend on the ordering of your disks: when you have multiple layers of abstraction, something can insert a disk before or after yours, and all of your tests that depend on vda, vdb, vdc, vdd, vde need to be shifted by one disk or more. Unless you have a compelling case, I have never found any use for ordering disks; you want to give them names and reference them through those names.
I agree, it would be best to be able to explicitly name the disks. But if we are using `qemu` for the virtual machine and creating virtual disks, then I think we have no way to enforce the disk name; it's not our choice, it's `qemu`'s design. But I'm not 100% sure about that. If it is possible to bind the name, then I'm 100% in favour of relying only on names.
> Can you detail your use case more? Your example code mentioned a GPT partition table and UEFI boot. Are you trying to test a legacy protective MBR partition table, or something like that?
It's not that much different from the default setup. I just want to have one disk and possibly more partitions. The simplest example would be a `boot` partition (for `/boot`), a `root` partition (for `/`), a `home` partition (for `/home`) and a `swap` partition (for swap); a sketch of the initrd commands for that layout is below. The reason I want more partitions inside the virtual machine is that I use more partitions in my physical machine setup, and I want to match the two configurations as closely as possible.
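For concreteness, extending the `postDeviceCommands` from my earlier flake to that layout might look something like this (an untested sketch; the sizes are placeholders):

```nix
boot.initrd.postDeviceCommands = ''
  # Create the partition table and four named GPT partitions.
  ${pkgs.parted}/bin/parted --script /dev/vda -- \
    mklabel gpt \
    mkpart boot fat32 1MB 512MB \
    set 1 esp on \
    mkpart root ext4 512MB 20GB \
    mkpart home ext4 20GB 28GB \
    mkpart swap linux-swap 28GB 100%
  # Create the filesystems, referencing partitions by GPT partition label.
  ${pkgs.dosfstools}/bin/mkfs.fat -F 32 -n boot /dev/disk/by-partlabel/boot
  ${pkgs.e2fsprogs}/bin/mkfs.ext4 -L root /dev/disk/by-partlabel/root
  ${pkgs.e2fsprogs}/bin/mkfs.ext4 -L home /dev/disk/by-partlabel/home
  ${pkgs.util-linux}/bin/mkswap -L swap /dev/disk/by-partlabel/swap
'';
```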
> > It's not desirable to depend on the ordering of your disks: when you have multiple layers of abstraction, something can insert a disk before or after yours, and all of your tests that depend on vda, vdb, vdc, vdd, vde need to be shifted by one disk or more. Unless you have a compelling case, I have never found any use for ordering disks; you want to give them names and reference them through those names.
>
> I agree, it would be best to be able to explicitly name the disks. But if we are using `qemu` for the virtual machine and creating virtual disks, then I think we have no way to enforce the disk name; it's not our choice, it's `qemu`'s design. But I'm not 100% sure about that. If it is possible to bind the name, then I'm 100% in favour of relying only on names.
It's not a QEMU limitation: you never use disks before `udev` has kicked in, therefore you can always use `udev` to rename them. It's our choice not to have those APIs.
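For instance (a sketch of my own, using the existing `virtualisation.qemu.options` escape hatch; the file name and serial are made up): giving a virtio disk a serial makes udev's persistent-storage rules expose it under a stable name, independent of bus order.

```nix
{
  virtualisation.qemu.options = [
    "-drive if=none,id=extra0,file=extra.qcow2,format=qcow2"
    # The serial shows up as /dev/disk/by-id/virtio-mydata,
    # regardless of whether the disk ends up as vda, vdb, ...
    "-device virtio-blk-pci,drive=extra0,serial=mydata"
  ];
}
```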
> > Can you detail your use case more? Your example code mentioned a GPT partition table and UEFI boot. Are you trying to test a legacy protective MBR partition table, or something like that?
>
> It's not that much different from the default setup. I just want to have one disk and possibly more partitions. The simplest example would be a `boot` partition (for `/boot`), a `root` partition (for `/`), a `home` partition (for `/home`) and a `swap` partition (for swap). The reason I want more partitions inside the virtual machine is that I use more partitions in my physical machine setup, and I want to match the two configurations as closely as possible.
You are mentioning partitions, but I still don't see the need for a partition table at the moment. Note also that the "test VM with bootloader" is not devised to test your filesystem mapping; we cannot do that, because we do not have declarative information about your partitioning. Some projects existed in the past to achieve this (nixpart, for example).
Right now, the best bet is `disko`: if it becomes part of NixOS, it is possible to invent a mode where we partition a layout, install NixOS into it, and test your VM completely.

In the meantime, this use case (testing your filesystem mapping) is unsupported, and writing custom code yourself to make it work is probably not a good idea: you are not testing the filesystem mapping, you are testing that you wrote the correct code to put your filesystems in the right situation, and nothing guarantees that when you reinstall this machine you will type the same commands.

Thus, `disko` seems the way forward for this use case.
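To give a rough idea, a disko layout for the one-disk boot/root/home/swap setup you described could look something like this (a sketch based on disko's documented GPT examples; sizes are placeholders and the exact schema may differ between disko versions):

```nix
{
  disko.devices.disk.main = {
    device = "/dev/vda";
    type = "disk";
    content = {
      type = "gpt";
      partitions = {
        boot = {
          size = "512M";
          type = "EF00"; # EFI system partition
          content = {
            type = "filesystem";
            format = "vfat";
            mountpoint = "/boot";
          };
        };
        swap = {
          size = "4G";
          content = { type = "swap"; };
        };
        home = {
          size = "20G";
          content = {
            type = "filesystem";
            format = "ext4";
            mountpoint = "/home";
          };
        };
        root = {
          # Assumption: disko orders "100%" partitions last by default;
          # the 'priority' option can control ordering explicitly.
          size = "100%";
          content = {
            type = "filesystem";
            format = "ext4";
            mountpoint = "/";
          };
        };
      };
    };
  };
}
```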
### Issue description
I'm using a custom filesystem layout (`tmpfs` on root + `ZFS` for persistence, but that doesn't really matter). When installing my configuration on a new machine, I use an installer script that creates the partitions and filesystems. But before that, I want to test my configuration inside a qemu virtual machine on my development computer. To do that, I followed the tests in #178531 with slight modifications: I defined my filesystems, set `useDefaultFilesystems` to `false`, and added commands to `postDeviceCommands` that basically mimic what the installer script does to set up partitions and filesystems, except for the boot partition.

This approach worked for quite some time. The last commit it works on is 13ea5dc163f5abde5ed5954b75179eee7c420a8e. If you use the configuration I provided below, the system boots correctly, and you can run

```
lsblk -o NAME,FSTYPE,PARTLABEL,LABEL,SIZE,MOUNTPOINTS
```

to get something similar to:

I don't really know the internals of what's happening, but it seems that two disks are available in this situation, with the boot partition on the second disk. This is not the same situation as my target configuration, because I have only one disk and the boot partition is on the same disk as the other partitions, but it allows you to test almost everything.
Now, with the newest commit, things don't work anymore. I get an error saying:
I investigated a little, and the commit that introduced the change producing the error is 76c7b656bfa9b20a4172f7901285560db4c2c695 by @RaitoBezarius. It seems that the whole virtual machine image-building workflow got an overhaul, and my approach doesn't work anymore.

I feel that it's not a bug in the internal logic, but that my approach was wrong. Can someone point me in the right direction: what should I do to be able to use my custom filesystems inside a virtual machine with a bootloader?
### Steps to reproduce
This is a minimal reproducible example. Create `flake.nix`:

and run: