Closed copumpkin closed 9 years ago
If we come up with a nice way to do this, it could also subsume the current ec2-data.nix we have in NixOS to get the SSH host key (which probably shouldn't be there either).
On the other hand, 16kb is actually quite a bit of space, and people who need more could probably incorporate their own fetchurl calls to import.
I'm willing to experiment with making this better, but would appreciate some guidance on how to develop on the EC2 images. I'm currently calling ./create-ebs-amis.py --region us-east-1 --hvm in the nixos maintainers scripts, and that gives me an AMI, but I don't think it necessarily spawns one from my <nixpkgs> (it takes a channel argument) so I'm at a bit of a loss as to how to test changes incrementally. Anyone have any hints? This stuff isn't really documented anywhere as far as I can tell...
cc @rbvermaa @edolstra (not sure who else knows about this stuff)
Yes, EBS creation currently is done 'on ec2' and uses a channel. As Amazon EC2 now supports uploading an image for EBS (previously only possible for S3 backed images), we should use this import facility and rewrite the create-ebs-amis.py script to use that.
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/importing-your-volumes-into-amazon-ebs.html
I see, thanks! How did you test this stuff as you were writing it? I mostly want to build an image from a custom nixpkgs clone and don't have an easy setup with a channel.
— Reply to this email directly or view it on GitHub https://github.com/NixOS/nixpkgs/issues/6662#issuecomment-85392361.
My current status is that I've amended the service in ec2-data.nix to inject a file imported from the default configuration.nix from user-data. It then populates the channel and runs nixos-rebuild switch.
Unfortunately, I can't sensibly run nixos-rebuild switch inside a service, because doing so shuts down running services and thus kills itself mid-operation.
I'm at a bit of a loss as to how to make it work nicely. It might make sense to have everything happen during boot.postBootCommands, but I'm not sure if that'll cause other issues. Anyone have any ideas? @shlevy?
@copumpkin This kind of thing was one of the reasons motivating exploration of a NixOS alternative: https://github.com/zalora/defnix/issues/12
I should have some slides/notes available going into detail about defnix soon.
(to be clear: I think the NixOS all-or-nothing activation model + all-or-nothing evaluation model makes this kind of thing very difficult to add in)
That's interesting, thanks, but I don't yet see where the deep incompatibility comes in. What pain would I encounter by running nixos-rebuild switch (or an equivalent if that fails due to unsatisfied assumptions) during postBootCommands?
Are you going to modify configuration.nix? At the very least, that will break nixops, since there is no local configuration.nix by default and if there is one, it's not at all kept in sync with the nixops config. And automatic modification of manually maintained configuration files is generally tricky, what if you have multiple tools doing this and they step on each other, and your injection almost certainly will be ad-hoc.
When exactly should these changes happen? At activation time, at boot time? If at activation time, this will break nixos-rebuild test, since that activates the new config and is supposed to not permanently switch, yet you'll run nixos-rebuild switch and the new config will be activated permanently.
What about settings and environment of nixos-rebuild? If I run nixos-rebuild with NIX_PATH such that I use my local nixpkgs checkout instead of the channel, will your postBootCommands pick that up? How?
This is just off the top of my head, there may be more specific issues. In general, I think the "evaluate the entire system statelessly" + "activate the entire system at once" + a strict separation between stages makes this kind of thing very difficult.
Yeah, the scheme I currently have: configuration.nix is a modified amazon-config.nix:
{
imports = [ "amazon-image.nix" "/etc/nixos/amazon-init.nix" ];
}
and ec2-data.nix changes its interpretation of user data to write it out to /etc/nixos/amazon-init.nix and call nixos-rebuild switch. Since the service is a one-shot persistent unit, it should only happen during first boot (like the current ec2-data behavior). If I switch to postBootCommands, I'd emulate the "one-shot" behavior with a touched file or similar.
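The "touched file" idea can be sketched in shell roughly as follows. This is only an illustration, not the code in the NixOS modules: the function name run_once and the marker path in the usage comment are hypothetical.

```shell
# run_once MARKER CMD [ARGS...]
# Runs CMD only if MARKER does not exist yet, and creates MARKER
# only when CMD succeeds, so a failed first attempt is retried on
# the next boot. Name and behaviour are assumptions for this sketch.
run_once() {
  marker=$1
  shift
  if [ -e "$marker" ]; then
    # Already done on an earlier boot: nothing to do.
    return 0
  fi
  "$@" && touch "$marker"
}

# Hypothetical usage from a boot-time script:
#   run_once /var/lib/amazon-init.done nixos-rebuild switch
```

Creating the marker only after success is a design choice: it trades "exactly once" for "until it has worked once", which seems closer to what a first-boot configuration step wants.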
When exactly should these changes happen? At activation time, at boot time? If at activation time, this will break nixos-rebuild test, since that activates the new config and is supposed to not permanently switch, yet you'll run nixos-rebuild switch and the new config will be activated permanently.
I'm not sure I understand. Following most distro user-data conventions, the script is supposed to run during first boot only and I just want it to behave as if I'd just typed in nixos-rebuild switch by hand. Subsequent calls to nixos-rebuild in the running system can do what they want.
At the very least, that will break nixops
The plan was to adjust how nixops works (since I can subsume the current behavior) or just fork the images so I can use autoscaling AMIs myself.
And automatic modification of manually maintained configuration files is generally tricky, what if you have multiple tools doing this and they step on each other, and your injection almost certainly will be ad-hoc.
I'm not sure what you mean about automatic modification of manual config, either. This is a freshly booted AMI with user-data specifying what people want to run on it. My goal is just to have the system match what the user-data asks for and do so on an ongoing basis (so if someone logs into the system and types nixos-rebuild switch, the config shouldn't change from what got set at first boot unless someone changed configuration.nix).
In effect, I want my boot-from-user-data to act like a (quick) brand new installation of NixOS. It doesn't seem that conceptually weird or counterintuitive, and I don't know how I could be getting conflicts with user-modified config with what I described.
Ah! I misunderstood. Yeah, OK if this is just about initial boot this is in principle fine, but you'll still need a way to get that information to systems like nixops that use NixOS without configuration.nix.
Yeah, NixOps currently assumes it can just dump a public/private (temporary, I hope!) SSH host key into user data with a certain format. To support the new image it would just need to change the format a little, although I'm still dubious about the practice of putting a host key into the user data in the first place. I don't think anything else would need to change.
Oh, I wasn't even thinking about that :smile: nixops also completely ignores configuration.nix, which is the issue I meant.
Oh, I see. We should talk on IRC sometime :smile:
@copumpkin The SSH host key sent via the user data is temporary: https://github.com/NixOS/nixops/commit/f6663b456a5eef3da5d5e5baa7e46ab33b236b04
@copumpkin It should be simple to distinguish between an old format and a new format for the userdata, so that both are supported without breaking backward compatibility with e.g. nixops.
Why should we change the format at all? It's just a bunch of name/value pairs, so it's possible to add new fields without breaking anything.
Great, even easier to keep backwards compatible then.
I'd just prefer not to deal with escaping, newlines, and stuff like that when we have a perfectly good format to use called nix :) but anyway, first I'll try to get it working and then we can figure out what format works best.
Also, I realized the host key is temporary but it still feels slightly wrong so someday I'd like to see if I can come up with a way to avoid putting it there. There's also the fact that autoscaling machines typically won't need NixOps to even SSH into them promptly so the two uses of user data are unlikely to overlap.
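One way to tell an old name/value user-data blob from a Nix expression, as suggested above, is a simple line-shape heuristic. The sketch below is hypothetical: the thread does not spell out the exact old format, so the NAME:value / NAME=value pattern and the function name guess_userdata_format are assumptions.

```shell
# guess_userdata_format FILE
# Prints "pairs" if every non-blank line starts with NAME: or NAME=
# (an assumed shape for the old name/value format), otherwise "nix".
# An empty file is reported as "pairs", i.e. nothing new to evaluate.
guess_userdata_format() {
  # Drop blank lines, then look for any line that is NOT a name/value pair.
  if grep -v '^[[:space:]]*$' "$1" | grep -qv '^[A-Za-z_][A-Za-z0-9_]*[:=]'; then
    echo nix
  else
    echo pairs
  fi
}
```

A heuristic like this keeps existing name/value user data working unchanged, at the cost of being guessable wrong for unusual inputs; an explicit marker line would be more robust if the format is ever revised.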
As a reminder, this is the logic I want to put somewhere:
- Write out a file (amazon-init.nix) from userdata
- nixos-rebuild switch
I have a bit of a conundrum: if I do this in a service (which is where ec2-data.nix puts its current userdata processor), running nixos-rebuild switch kills the service running nixos-rebuild.
It seems like what I need is some notion of a "fire-and-forget" systemd service that can depend on ip-up and also run nixos-rebuild without killing itself. I haven't used systemd much but I'm wondering if I can somehow detach or nohup the actual nixos-rebuild call from within it.
As always, I'm open to ideas or suggestions to save myself from going down the wrong path!
You can prevent a unit from being restarted by setting restartIfChanged = false.
Oh! I tried stopIfChanged = false and that didn't work, but didn't notice restartIfChanged! I'll give that a go, thanks :smile:
Nope, restartIfChanged = false still doesn't help:
$ systemctl cat fetch-ec2-data.service | grep RestartIfChanged
X-RestartIfChanged=false
$ journalctl -u fetch-ec2-data.service | tail -n3
systemd[1]: Stopping Fetch EC2 Data...
fetch-ec2-data-start[1617]: /nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs/nixos/modules/installer/tools/nixos-rebuild.sh: line 1: 11484 Terminated $pathToConfig/bin/switch-to-configuration "$action"
systemd[1]: Stopped Fetch EC2 Data.
:frowning:
I think the issue there is that the skip logic for X-RestartIfChanged (in switch-to-configuration.pl) is guarded behind checking if the path my $prevUnitFile = "/etc/systemd/system/$baseUnit"; exists, and it doesn't exist yet at the point I'm running.
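To check that diagnosis on a running instance, the guard condition can be reproduced by hand. The following is a small sketch under assumptions: the helper names and the UNIT_DIR override are hypothetical, and the unit name is used directly as the base unit, so template instances such as foo@bar.service are not handled.

```shell
# Directory that the $prevUnitFile path quoted above points into.
UNIT_DIR=${UNIT_DIR:-/etc/systemd/system}

# prev_unit_file_exists UNIT
# Succeeds if a unit file for UNIT is already present in UNIT_DIR.
prev_unit_file_exists() {
  [ -e "$UNIT_DIR/$1" ]
}

# report_guard UNIT
# Prints whether the previous unit file exists for UNIT.
report_guard() {
  if prev_unit_file_exists "$1"; then
    echo "$1: previous unit file found"
  else
    echo "$1: no previous unit file"
  fi
}

# Hypothetical usage:
#   report_guard fetch-ec2-data.service
```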
Great success! It's currently hacked up and I don't have time to clean it up, but I have a basic proof-of-concept working and will polish it up and put it up somewhere for someone to dissect :smile: :smile: :grinning:
@copumpkin Did you put this up somewhere?
@nyarly the EC2 images support it now, but we need to wait for a release (any day now!) until they go live.
@rbvermaa is there a clean way for hydra to publish AMIs for its builds? it's a "slightly impure" operation, but would be pretty cool.
@copumpkin - I looked at the AMIs for 15.09 and they seem to have the amazon-image.nix file in their pages, but not their configuration.nix - is this coming soon? Is there an easy way to build my own userdata-capable image?
@nyarly I don't think amazon-image.nix needs to be in the configuration.nix. Have you tried booting the 15.09 images with userdata? I haven't tested the feature in a while but it worked last time I tried.
@copumpkin Sorry for the late followup: I just tried using an existing configuration.nix as the userdata to a fresh 15.09 instance, and no dice. The new instance comes up with the stock configuration.nix, with the intended one appearing in /root/user-data. Am I doing something wrong?
@nyarly it might have broken since I merged it. I had a VM test for it but it broke and got turned off, and I haven't had time to fix it. Will check again at some point. Sorry!
I know that lots of people use NixOps, but it'd also be nice to support EC2 autoscaling groups properly. To do that, we'd probably want some way to inject configuration.nix and channels into the machine via EC2 user-data. The obvious thing to do would be to just dump configuration.nix into the user-data, but unfortunately the field is limited to 16kb.
Anyone have ideas on nice ways to do this? I'm putting it here because it seems largely like a NixOS feature independent of NixOps.
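Whether a given configuration.nix fits can be checked before launching an instance. The sketch below assumes the limit means 16384 bytes of raw (not base64-encoded) user data; the function name fits_user_data is hypothetical.

```shell
# Assumed size limit for raw EC2 user data: 16 KB, taken as 16384 bytes.
USER_DATA_LIMIT=16384

# fits_user_data FILE
# Succeeds if FILE is small enough to be passed as user data.
fits_user_data() {
  # Arithmetic expansion strips any padding some wc implementations emit.
  size=$(( $(wc -c < "$1") ))
  [ "$size" -le "$USER_DATA_LIMIT" ]
}

# Hypothetical usage:
#   fits_user_data configuration.nix || echo "too large for user data" >&2
```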