NixOS / nixpkgs

AWS stage-1 xen modprobe errors #46881

Open coretemp opened 6 years ago

coretemp commented 6 years ago

Issue description

[    0.245401] stage-1-init: modprobe: ERROR: could not insert 'xen_blkfront': No such device
[    0.247765] stage-1-init: loading module xen-netfront...
[    0.258296] stage-1-init: modprobe: ERROR: could not insert 'xen_netfront': No such device
[    0.260731] stage-1-init: loading module fuse...

The machine works otherwise.

Steps to reproduce

Use a t3.nano machine as a deployment target.

Technical details

stale[bot] commented 4 years ago

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

marcosrdac commented 1 year ago

I have the same problem. Did you manage to solve it?

arianvp commented 4 months ago

The module loads fine on aarch64 but fails to load on x86_64. Odd.

First of all, xen-blkfront has a proper alias set up, so udev should load it automatically anyway. There is no need for us to load it explicitly.

filename:       /run/booted-system/kernel-modules/lib/modules/6.6.31/kernel/drivers/block/xen-blkfront.ko.xz
alias:          xenblk
alias:          xen:vbd
alias:          block-major-202-*
license:        GPL
description:    Xen virtual block device frontend
depends:
retpoline:      Y
intree:         Y
name:           xen_blkfront
vermagic:       6.6.31 SMP preempt mod_unload
parm:           max_indirect_segments:Maximum amount of segments in indirect requests (default is 32) (uint)
parm:           max_queues:Maximum number of hardware queues/rings used per virtual disk (uint)
parm:           max_ring_page_order:Maximum order of pages to be used for the shared ring (int)
parm:           trusted:Is the backend trusted (bool)
parm:           feature_persistent:Enables the persistent grants feature (bool)
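
To illustrate why the aliases matter: when a device appears, udev asks for whichever module has an alias pattern matching the device's modalias string. The following is only a minimal sketch of that glob-style matching, not the actual kmod implementation. The alias list is copied from the modinfo output above; the function name and the sample modalias strings are made up for illustration.

```python
from fnmatch import fnmatchcase

# Alias patterns copied from the modinfo output above.
XEN_BLKFRONT_ALIASES = ["xenblk", "xen:vbd", "block-major-202-*"]


def module_matches(modalias: str, aliases: list[str]) -> bool:
    """Return True if any alias pattern glob-matches the modalias string."""
    return any(fnmatchcase(modalias, pattern) for pattern in aliases)


if __name__ == "__main__":
    # Hypothetical modalias values, for illustration only.
    for modalias in ["xen:vbd", "block-major-202-0", "pci:example-device"]:
        print(modalias, module_matches(modalias, XEN_BLKFRONT_ALIASES))
```

If a Xen virtual block device announces itself as `xen:vbd`, a match like this is what should let udev pull in xen_blkfront without an explicit modprobe in stage 1.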

arianvp commented 4 months ago

Digging into the Linux source code, the kernel driver returns -ENODEV here:

https://github.com/torvalds/linux/blob/c760b3725e52403dc1b28644fb09c47a83cacea6/drivers/block/xen-blkfront.c#L2597

That check either always returns true or has a real implementation, depending on CONFIG_XEN_PVHVM:

https://github.com/torvalds/linux/blob/c760b3725e52403dc1b28644fb09c47a83cacea6/include/xen/platform_pci.h#L52

That implementation, in turn, exists only for x86:

https://github.com/torvalds/linux/blob/c760b3725e52403dc1b28644fb09c47a83cacea6/arch/x86/xen/platform-pci-unplug.c#L107

arianvp commented 4 months ago

I don't really understand why the kernel driver loads successfully on aarch64 but fails to load on x86.

But I don't think modern AWS even uses Xen for disk access anymore. The Nitro hypervisor exposes EBS volumes as NVMe devices to the instance. /dev/xvd* are aliases for /dev/nvme*.
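
One way to check the alias claim on a given instance is to resolve the device path and look at the name it ends up at. This is a rough sketch only: the function name is made up, and guessing the driver family from the `nvme` / `xvd` name prefix is an assumption, not something the kernel guarantees.

```python
import os


def classify_block_device(path: str) -> tuple[str, str]:
    """Resolve symlinks, then guess the driver family from the device name."""
    real = os.path.realpath(path)
    name = os.path.basename(real)
    if name.startswith("nvme"):
        kind = "nvme"
    elif name.startswith("xvd"):
        kind = "xen"
    else:
        kind = "other"
    return real, kind


if __name__ == "__main__":
    # /dev/xvda is an assumed device name; adjust for the instance at hand.
    print(classify_block_device("/dev/xvda"))
```

If /dev/xvda is a symlink to an NVMe device, the second element of the result is "nvme"; if it is a real Xen block device, it is "xen".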

I think we can just remove this kernel module completely.

arianvp commented 4 months ago

Okay, so there are still instances that use Xen (t2.micro, for example). However, I am 99% sure that udev will load the kernel driver for us and that we don't need to load it manually at all. I'll make a PR and verify it on a t2.micro instance.
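
For the verification step, a small helper along these lines could confirm after boot whether the driver ended up loaded without the explicit modprobe. It is a sketch under assumptions: the function names are made up, and it only parses the text of /proc/modules, so a driver built into the kernel would not show up.

```python
def loaded_modules(proc_modules_text: str) -> set[str]:
    """Parse /proc/modules content; the module name is the first field of each line."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def is_loaded(module: str, proc_modules_text: str) -> bool:
    """Compare names with '-' normalised to '_', as the kernel reports them."""
    return module.replace("-", "_") in loaded_modules(proc_modules_text)
```

On the instance, read /proc/modules and pass its contents in, for example `is_loaded("xen-blkfront", open("/proc/modules").read())`.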