firecracker-microvm / firecracker

Secure and fast microVMs for serverless computing.
http://firecracker-microvm.io
Apache License 2.0

[Bug] Concurrency Issues with Mounting via devtool build_rootfs #3058

Closed sondavidb closed 2 years ago

sondavidb commented 2 years ago

Describe the bug

Using tools/devtool build_rootfs multiple times concurrently results in the rootfs failing to mount.

To Reproduce

  1. Create a loop that runs many times concurrently (I did this 100 times, but I don't believe this many is needed). In each iteration:
     a. Make a new directory
     b. Clone firecracker in said directory
     c. Run tools/devtool build_rootfs
  2. Watch for mount errors, which are reported silently in the output. (In my case, 61 out of 100 runs produced mount: /firecracker/build/rootfs/mnt: failed to setup loop device for /firecracker/build/rootfs/bionic.rootfs.ext4.)

I compiled a shell script for the convenience of anyone reading this who wants to replicate this behavior. Please be advised this will likely require a fair bit of space, though all non-log files will be deleted at the end.

Make an empty directory with this script and run it, and it will replicate the steps above. Logs will be in the logs folder, with each instance having its own log (logs/logsx.txt, where x is the number of the iteration), and a master log of all of the logs combined (logs/logs.txt). Running it on an Amazon EC2 baremetal instance took me 25 minutes, but about 93% of the tests ran within the first half of this timeframe.
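
The script itself is not reproduced in this text. The following is only a minimal sketch of the steps and log layout described above, not the original script. The repository URL, the run<i> directory names, and the function names run_iteration and run_all are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: concurrent "clone + build_rootfs" runs, one log file per run.
# REPO_URL, the run<i> directory names, and both function names are assumptions.

REPO_URL="${REPO_URL:-https://github.com/firecracker-microvm/firecracker.git}"

# Run one iteration in its own directory; stdout and stderr go to logs/logs<i>.txt.
run_iteration() {
    local i="$1" workdir="run$1"
    mkdir -p "$workdir" logs
    (
        cd "$workdir" || exit 1
        git clone --quiet "$REPO_URL" firecracker &&
        cd firecracker &&
        tools/devtool build_rootfs
    ) > "logs/logs${i}.txt" 2>&1
}

# Start COUNT iterations in the background, wait for all of them,
# then concatenate the per-iteration logs into logs/logs.txt.
run_all() {
    local count="$1" i
    for i in $(seq 1 "$count"); do
        run_iteration "$i" &
    done
    wait
    cat logs/logs[0-9]*.txt > logs/logs.txt
}
```

Calling run_all 100 from an empty directory would correspond to the 100 concurrent iterations mentioned above. Cleanup of the cloned directories is left out of this sketch.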

Each attempt to mount can be found by searching for "Mounting". Each error can be found in the log files by searching for "loop device".
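
As a rough illustration of those two searches, the sketch below counts mount attempts and the number of per-iteration logs that contain a loop device error. The function name summarize_logs is hypothetical, and the logs<i>.txt naming follows the layout described above.

```shell
# Sketch only: summarize_logs is a hypothetical helper, not part of devtool.
# Counts lines containing "Mounting" and log files mentioning "loop device".
summarize_logs() {
    local dir="${1:-logs}" attempts failures
    # Total number of "Mounting" lines across the per-iteration logs.
    attempts=$(cat "$dir"/logs[0-9]*.txt | grep -c "Mounting" || true)
    # Number of per-iteration logs with at least one "loop device" line.
    failures=$(grep -l "loop device" "$dir"/logs[0-9]*.txt | wc -l | tr -d ' ')
    echo "mount attempts: $attempts, logs with loop device errors: $failures"
}
```

Running summarize_logs logs after the iterations finish gives a quick failure count without reading each log by hand.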

Expected behaviour

No issues mounting each rootfs

Environment

- Firecracker version: v1.1 (currently commit e8e79564975c5bd9651eec47d73a6801f10ae493)
- Host and guest kernel versions: Host - 5.10.112-108.499.amzn2.x86_64; Guest - Unsure (whatever the default rootfs uses)
- Rootfs used: Default
- Architecture: Host: x86_64
- Any other relevant software versions: None as far as I'm aware

Additional context

This bug makes pipeline testing unreliable, as tests can fail if two of them try to mount the rootfs at the same time.

If this behavior is unintentional, I believe a simple fix would be to let the user specify a Firecracker rootfs directory and/or a mount directory, overriding the variables used in tools/devtool. (Currently, the rootfs directory is hardcoded, which in turn makes the mount directory hardcoded.) This would avoid collisions, as multiple instances of build_rootfs would no longer try to mount to the same directory.
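
One possible shape for such an override is sketched below. It assumes two hypothetical environment variables, FC_ROOTFS_DIR and FC_ROOTFS_MNT_DIR, and a hypothetical helper resolve_rootfs_dirs. None of these exist in tools/devtool as far as this report shows. The defaults mirror the paths seen in the error message (build/rootfs and build/rootfs/mnt).

```shell
# Sketch only: FC_ROOTFS_DIR, FC_ROOTFS_MNT_DIR, and resolve_rootfs_dirs are
# hypothetical names, not existing devtool variables or functions.
# Sets ROOTFS_DIR and MNT_DIR, preferring user-supplied values over defaults.
resolve_rootfs_dirs() {
    local build_dir="$1"
    # Default rootfs directory lives under the build directory.
    ROOTFS_DIR="${FC_ROOTFS_DIR:-$build_dir/rootfs}"
    # Default mount directory follows the rootfs directory unless overridden.
    MNT_DIR="${FC_ROOTFS_MNT_DIR:-$ROOTFS_DIR/mnt}"
}
```

With this approach, each concurrent run could export a distinct FC_ROOTFS_DIR so that no two runs share a mount directory, while runs that set nothing keep the current paths.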


luminitavoicu commented 2 years ago

Hi @sondavidb, thank you for bringing this to our attention. Indeed, the fix here seems to be adding a way to specify a rootfs directory. If you would like to tackle this issue, we encourage contributions and we will be able to provide you with guidance if needed.