rpodgorny / unionfs-fuse

union filesystem using fuse

random-ish I/O errors when using unionfs with nfs #96

Open mrvn opened 3 years ago

mrvn commented 3 years ago

I have a kind of convoluted setup for building boot images for a high performance computing cluster consisting of 4 layers:

  1. a unionfs-fuse over three branches: a pristine chroot, a dir with stuff for the image, and a dir with test cases only used during the build. This also has a plugin that logs every accessed file.
  2. the unioned filesystem is then exported via NFS kernel server
  3. KVM with a mini initramfs to set up step 4 and pivot_root to the unionfs
  4. unionfs over a tmpfs and NFS

The KVM instance boots up and runs a bunch of test cases for all the tools that belong in the boot image and every accessed file is logged by step 1. This gives us a list of files needed in the boot image allowing us to create minimal boot images.
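
To illustrate the last step, here is a minimal sketch of turning such an access log into a file list for a minimal image. The log format is an assumption (one absolute path per line, as seen inside the image); the actual plugin output is not shown in this thread, and `needed_files` is a hypothetical helper name.

```python
import os


def needed_files(log_lines, root):
    """Return the sorted, de-duplicated logged paths that are files under root.

    Assumption: each log line holds one absolute path as seen inside the image.
    """
    seen = set()
    for line in log_lines:
        path = line.strip()
        if not path:
            continue  # skip blank lines
        rel = path.lstrip("/")  # make the path relative to the image root
        full = os.path.join(root, rel)
        # Keep regular files and symlinks; drop directories and missing paths.
        if os.path.isfile(full) or os.path.islink(full):
            seen.add(rel)
    return sorted(seen)
```

The resulting relative paths could then be fed to whatever tool copies files into the image.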

Now the problem is that the test cases randomly get an I/O error. This either causes a Bus Error in an application itself or makes reading some file fail. At the moment this is fatal to ~80% of build attempts for one specific image and one user, as it hits an essential systemd service file. It works fine for another user. It works better when the build server is freshly booted and seems to get slightly worse over time. Something fishy is going on there.

Are there any known random failures with either tmpfs or nfs as branches? Or do you have tips for debugging this without getting a billion lines of strace output?
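
One possible way to narrow this down without strace is to read every file through a given mount point and record only the failures. The sketch below is an illustration, not something from the unionfs-fuse project; `find_read_errors` is a hypothetical helper, and the mount point to scan is whatever layer is under test.

```python
import errno
import os


def find_read_errors(root, chunk_size=1 << 20):
    """Read every regular file under root; return [(path, errno_name)] for failures."""
    failures = []

    def record(exc):
        # Map the numeric errno to its symbolic name, e.g. 5 -> "EIO".
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        failures.append((exc.filename, name))

    # onerror captures directory listing failures that os.walk would otherwise ignore.
    for dirpath, _dirnames, filenames in os.walk(root, onerror=record):
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            if not os.path.isfile(path):  # skip dangling symlinks, sockets, etc.
                continue
            try:
                with open(path, "rb") as fh:
                    while fh.read(chunk_size):  # read to EOF, discarding the data
                        pass
            except OSError as exc:
                record(exc)
    return failures
```

Running this in a loop against each layer in turn (the branches, the first unionfs mount, the NFS mount, the second unionfs mount) might show at which layer the errors first appear. It only exercises plain `read()` calls, so failures that surface as a Bus Error through memory-mapped files would not be caught this way.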

rpodgorny commented 3 years ago

hi!

...unfortunately, i don't know of any bugs that would be specific to nfs or tmpfs. also, your setup seems too complicated to draw any conclusion or to give better advice than "try to remove some of the layers" (just for testing purposes)... :-(

anyway, if you manage to find the problem and it's really caused by unionfs, i'd love to hear back from you, thanks!

rpodgorny commented 2 years ago

@mrvn hi! i've just released v3.2 with some nfs fixes (among others) -> could you please check whether this fixes your problems? thanks!