scaleway / image-builder

:triangular_ruler: build server images on Scaleway

BTRFS #12

Open faddat opened 8 years ago

faddat commented 8 years ago

Hi,

1) Been having some issues with Docker lately.
2) Would like to use BTRFS in my images, because of #1 and because BTRFS = better.

There are many ways I could do this; unfortunately, most have failed. Do you have a recommendation on the best way to do it?

moul commented 8 years ago

Hi @faddat,

You can try this:

  1. service docker stop
  2. update the docker configuration in /etc/default/docker
  3. rm -rf /var/lib/docker
  4. service docker start

(based on https://docs.docker.com/engine/userguide/storagedriver/selectadriver/)
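
For step 2, a minimal sketch of the configuration change, assuming Docker on this image reads DOCKER_OPTS from /etc/default/docker (note that the btrfs storage driver also requires /var/lib/docker itself to live on a btrfs filesystem, which is why the next comment attaches a dedicated volume):

    # /etc/default/docker -- sketch only, exact variable depends on the init system
    DOCKER_OPTS="--storage-driver=btrfs"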

moul commented 8 years ago
  1. stop your server
  2. attach a secondary nbd device
  3. boot
  4. mkfs.btrfs /dev/nbd1
  5. service docker stop
  6. update the docker configuration in /etc/default/docker
  7. rm -rf /var/lib/docker
  8. mount /dev/nbd1 /var/lib/docker
  9. service docker start
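
Roughly, for steps 4-9, a sketch in shell (assuming the secondary volume shows up as /dev/nbd1 and that Docker reads DOCKER_OPTS from /etc/default/docker; the /etc/fstab line is an extra step so the mount survives reboots):

    mkfs.btrfs /dev/nbd1
    service docker stop
    echo 'DOCKER_OPTS="--storage-driver=btrfs"' >> /etc/default/docker
    rm -rf /var/lib/docker
    mkdir -p /var/lib/docker
    echo '/dev/nbd1 /var/lib/docker btrfs defaults 0 0' >> /etc/fstab
    mount /var/lib/docker
    service docker start
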
faddat commented 8 years ago

Manfred,

Sorry, but this misses the point of what we wish to do, which is specifically to format / as btrfs and use it without a second disk.

Thoughts? Everything I've tried has blown up on me.

faddat commented 8 years ago

so, this is kinda many months later

but -- I'd still like to know how to go about doing this. I'm going to try a debootstrap on the secondary disk after formatting it btrfs and then saving it as an image. Think that will work?

moul commented 8 years ago

It should work. However, your bare debootstrap will probably miss some important scripts, see https://github.com/scaleway/image-ubuntu/blob/master/16.04/Dockerfile#L2 for examples of how we customize the debootstrap
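
For reference, a bare-bones debootstrap of the kind discussed here could look like this (a sketch only: the device, mount point and mirror are assumptions, and it deliberately skips the Scaleway-specific customizations applied in the Dockerfile above):

    mkfs.btrfs /dev/nbd1
    mount /dev/nbd1 /mnt
    debootstrap --arch amd64 xenial /mnt http://archive.ubuntu.com/ubuntu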


In my opinion, it should be a lot easier to use the helpers we added in the initrd; we already have a "live" mode that works like a "live cd", and it supports BTRFS (see https://github.com/scaleway/initrd/blob/master/Linux/tree-common/boot-live).

@QuentinPerez can you provide to @faddat a scw command that runs a server, formats the rootfs in BTRFS and live-installs a rootfs tarball?

faddat commented 8 years ago

Wow, that'd be quite a command, if it existed... I'm trying your "livecd" now. That is a pretty cool concept.

faddat commented 8 years ago

Also.... I keep seeing references made to MIPS..... should this be taken as a hint?

QuentinPerez commented 8 years ago

Hi @faddat,

The following commands start a fresh xenial server on a VC1S, drop a shell in the initrd, back up your rootfs, format /dev/vda as BTRFS, and restore your rootfs.

$ scw run -d --name test-btrfs -e "INITRD_DROPBEAR=1" xenial
$ scw exec -w test-btrfs "mkdir -p /tmp/cpy; cp -pr /newroot /tmp/cpy; chroot /tmp/cpy/newroot/ /usr/local/sbin/scw-sync-kernel-modules ; cp -r /tmp/cpy/newroot/lib/modules/ /lib; depmod -a ; mknod /dev/btrfs-control c 10 234 ; umount /dev/vda; mkfs.btrfs -f /dev/vda; mount /dev/vda /newroot/; cp -rp /tmp/cpy/newroot/* /newroot/; continue-boot"
$ scw exec -w test-btrfs "mount | grep btrfs"
/dev/vda on / type btrfs (rw,relatime,space_cache,subvolid=5,subvol=/)
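
For readability, here is the same sequence broken out with comments (just an annotated restatement of the one-liner above; it assumes the same initrd helpers are present):

    mkdir -p /tmp/cpy
    cp -pr /newroot /tmp/cpy                  # back up the current rootfs
    chroot /tmp/cpy/newroot/ /usr/local/sbin/scw-sync-kernel-modules
    cp -r /tmp/cpy/newroot/lib/modules/ /lib  # expose the kernel modules to the initrd
    depmod -a
    mknod /dev/btrfs-control c 10 234         # device node needed by mkfs.btrfs
    umount /dev/vda
    mkfs.btrfs -f /dev/vda                    # reformat the root volume as BTRFS
    mount /dev/vda /newroot/
    cp -rp /tmp/cpy/newroot/* /newroot/       # restore the rootfs onto the new filesystem
    continue-boot                             # resume the normal boot sequence
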
stuart12 commented 7 years ago

@QuentinPerez this appears to work. I can create and log in to the new server test-btrfs with / on btrfs, but when I reboot, the server cannot mount /. It appears that the kernel/initrd after the reboot does not have btrfs available. Have you tried a reboot after running the above commands? Thanks!

firegurafiku commented 7 years ago

@moul @QuentinPerez

Let me bump this question up. I've just tried to live-convert the root partition from Ext4 to Btrfs with the pivot_root approach (following the instructions from this great answer on StackExchange). I managed to convert the filesystem, but got the following error message in the boot log:

>>> Initializing 'local' root file system...
>>> Attaching nbd0...
>>> Mounting nbd0 root...
>>> Mounting /newroot...
>>> 'mount /dev/nbd0 /newroot' failed

Are there any options to work around the issue?

Update:

I changed the bootscript to armv7l 4.10.8 docker #1 in the machine's advanced settings, and everything started up like a charm!

@faddat

If your problem is still open, this setting may solve it too.

berezins commented 6 years ago

No luck on a VC1S though, with several bootscripts tried (mainline 4.4, 4.14, and docker 4.10 at least). I also tried to mount a btrfs-formatted disk from the initrd shell, but it just refuses to mount with a strange "No such device" or "Invalid parameter" error, even when the disk was formatted with mkfs.btrfs directly from that same initrd shell. Meanwhile, ext4 disks mount successfully from the same shell when the same disk is formatted there with mkfs.ext4. It looks like something btrfs-related is simply not included in the x86_64 initrd. I haven't tried armv7l yet for comparison, though that's not a solution for me, since I need to run x86_64 images on Scaleway with BTRFS.

berezins commented 6 years ago

Just tested on armv7l with the docker 4.10 bootscript and it booted perfectly with a btrfs root. So it looks like this only works with the armv7l docker initrd and apparently needs to be fixed on x86_64. BTW, earlier (more than a year ago or so) it also booted for me on x86_64 with some of the bootscripts, I remember that exactly. I haven't used it since then, so I'm not sure when it stopped working, but now I need it again.

berezins commented 6 years ago

I've now figured out that the armv7l docker 4.10 kernel simply has btrfs built into the kernel image itself (i.e. into the vmlinuz file), not as a separate btrfs.ko module file, as can be seen from the line CONFIG_BTRFS_FS=y in zcat /proc/config.gz | less when booted with the armv7l docker 4.10 kernel. None of the x86_64 kernels do (CONFIG_BTRFS_FS=m when booted with any x86_64 kernel, i.e. btrfs ships as a btrfs.ko module file that is not loaded during a normal boot).
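
A quick way to check this on a running server, assuming the kernel exposes /proc/config.gz:

    zcat /proc/config.gz | grep CONFIG_BTRFS_FS
    # CONFIG_BTRFS_FS=y -> built into the kernel image (armv7l docker 4.10)
    # CONFIG_BTRFS_FS=m -> shipped as btrfs.ko, which the initrd does not load (x86_64 kernels)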

So I'm wondering: since BTRFS builds into the armv7l docker 4.10 kernel image without any problem, why not build it into all the other kernel images too? It is quite lightweight in size, should not create any problems or side effects, and would provide a lot of benefits for many Scaleway customers, judging by the questions and requests around BTRFS root/boot.

I also tried to fix the problem as below (which could be an alternative solution, of course): https://github.com/tetatetit/initrd/commit/c4b3d88493f9d64204e0d94f68da6ce2ac1499fe5. IMO, though, this is a much less reliable solution than building btrfs into the kernel image as the armv7l docker kernel does, since different kernels have different sets and names of btrfs.ko dependencies. For example, for mainline 4.15 the dependencies loaded in load_btrfs_ko are missing or incorrect, so this approach needs extra kernel-version checks and different dependency sets, which requires ongoing maintenance, whereas simply setting CONFIG_BTRFS_FS=y for all kernels is a one-line change with no maintenance.
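
The rough idea of that initrd-side workaround, as a sketch (assuming depmod/modprobe are available in the initrd and the module tree has been copied in, e.g. with scw-sync-kernel-modules as in the scw exec example above; the real load_btrfs_ko in the commit loads the .ko files explicitly):

    depmod -a
    modprobe btrfs || echo "btrfs.ko or one of its dependencies is missing for this kernel"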

So I'm eagerly looking forward to either of my fixes being applied to the Scaleway kernels/initrds, unless there are reasons or arguments not to do that.

P.S. Rather than waiting for btrfs root/boot to be fixed itself, I have found a completely different, and in my opinion much better, solution that allows using BTRFS on the first/boot volume of any Scaleway instance:

  1. Set the tags INITRD_PRE_SHELL=1 and root=/dev/vda1 or root=/dev/nbd0p1 (depending on whether the instance is virtual or physical) when creating the new instance. Yes, the root device can be set this way via a tag.
  2. In the shell you are dropped into, execute the following commands:
    source /functions
    attach_nbd_device 0 #only if baremetal instance with NBD root device
    mount /dev/vda /newroot # or /dev/nbd0p1 if baremetal instance
    cd /newroot
    tar -cpzvf /rootfs.tar.gz ./
    cd /
    umount /newroot
    fdisk /dev/vda # or /dev/nbd0 if a physical instance
    # Create a root partition of the desired size, e.g. 2GB (or 5GB, 10GB, or however much you need
    # for just the OS with all necessary packages, without the actual data/payload), leaving the
    # rest of the space for a data partition, which can be added any time later once the OS is
    # booted and formatted as BTRFS, LVM, or whatever anybody needs
    mkfs.ext4 /dev/vda1 # or /dev/nbd0p1
    mount /dev/vda1 /newroot # or /dev/nbd0p1
    cd /newroot
    tar -xpzvf /rootfs.tar.gz
    cd /
    umount /newroot
    exit
  3. Now the instance should boot successfully. You can remove the INITRD_PRE_SHELL=1 tag, but keep root=/dev/vda1 or root=/dev/nbd0p1, and it will keep working across reboots without any further care. We end up with a small root partition and plenty of space left on the first/boot volume for any partition(s) of your choice (see the sketch after this list). So there is no need for an extra volume, which is impossible on a VC1S anyway and is an obvious waste of the boot volume on all other instance types, especially given Scaleway's quota on the total number of volumes per account. This should solve the problem for the many Scaleway customers who have asked for the ability to partition the first/boot volume and use LVM, BTRFS, XFS or cryptsetup/dm-crypt on it. It is also tested and works with https://github.com/stuffo/scaleway-ubuntukernel/, at least for virtual instances.
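
Once the OS is booted, the leftover space can be turned into a data partition whenever needed. A sketch, assuming the new partition ends up as /dev/vda2 and /data is just an example mount point (use the nbd equivalent on a baremetal instance):

    fdisk /dev/vda               # create a second partition in the remaining space
    mkfs.btrfs /dev/vda2
    mkdir -p /data
    mount /dev/vda2 /data
    echo '/dev/vda2 /data btrfs defaults 0 0' >> /etc/fstab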

Also, for those who asked for a dm-crypt encrypted root on Scaleway with LVM/BTRFS/XFS inside: I was able to boot this way with my own RancherOS fork, https://github.com/tetatetit/os, made specifically for this, using the KEXEC_KERNEL, KEXEC_INITRD and KEXEC_APPEND tags. It still needs to be polished for wider customer needs and I lack the time for that right now, but if somebody is interested I can provide more info to help get everything done.

I also have an idea, which I could even implement on my own, for allowing Scaleway to boot full-grade images of any Linux and/or BSD (FreeBSD, NetBSD, OpenBSD) with their own native kernels installed and upgraded, and with all the useful things that come with that (e.g. dkms, which allows installing any distro and the external packages that require it, like zfs, drbd 9, extra kernel drivers as in Ubuntu, etc.). Everything would work just like e.g. Google/AWS instances, but here it would work on baremetal too. The only requirement is that the first partition be a boot partition containing the native kernel and initrd of the installed OS.

However, this requires close cooperation with the Scaleway team, since it adds an extra step before booting the instance: mount the first partition of its first volume, take the native kernel from there plus all the modules from the initrd and the modules necessary for the NBD boot, store them somewhere temporarily, and expose the kernel to iPXE and the modules to Scaleway's own initrd. In other words, we boot the native kernel of whatever Linux/BSD is installed, with its own modules, using Scaleway's own initrd. It should be pretty easy to find the kernel and initrd location/path for most known OSes and to extract the modules and their dependency/symbol files from the initrd, which live in the standard location under /lib/modules/. So we end up keeping the same Scaleway architecture and approach of diskless instances (their volumes stay in network storage), but with full-grade OS images, partitioning of the first volume, and native kernels with all the advantages that brings.

dominic-p commented 6 years ago

@tetatetit I've recently been struggling with setting up an encrypted root fs on Scaleway, and I would love to hear the details on how you achieved it with kexec, if you have time.

There's been a long standing issue to support this functionality officially, but it looks like you found a way to get it running now.

mpiscaer commented 6 years ago

@tetatetit I had a little issue with booting after doing this procedure on Ubuntu Xenial. But after that I used parted to resize the root volume, then reformatted the volume and untarred the files back. Everything works fine.