Open btalbot opened 7 years ago
Thanks for the report. We could use some help narrowing down which version of Container Linux actually broke (that should make it easier to pinpoint what we changed that may have broken this). Are you able to walk through the Alpha releases and figure out which was the latest working and oldest broken release?
> Are you able to walk through the Alpha releases and figure out which was the latest working and oldest broken release?
Yes. On these images, the reproducer completes successfully.
What doesn't work? Anything after 1465, which seems to include both kernel 4.12 and Ignition 0.17. The reproducer hangs while copying files to the share.
Tentatively assuming this is a kernel problem.
This hang still occurs using 1520.1.0 (with kernel 4.13) released today.
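For anyone who wants to help narrow this down, the walk through releases could be scripted as a binary search. The following is only a sketch: `is_good` is a hypothetical placeholder that must be replaced with a real check (booting the given release and running the reproducer), and `BAD_FROM` exists purely so the sketch can run on its own. It also assumes the releases are sorted oldest first and that every release after the first broken one is broken too.

```shell
#!/usr/bin/env bash
# Sketch: binary-search a sorted list of releases for the oldest broken one.

# is_good VERSION -> exit 0 if the reproducer completes on VERSION.
# HYPOTHETICAL stand-in: compares the major version against BAD_FROM.
# Replace the body with a real boot-and-copy check.
is_good() {
  local version="$1"
  [ "${version%%.*}" -lt "${BAD_FROM:-1465}" ]
}

# first_bad V1 V2 ... -> prints the oldest version for which is_good fails,
# or prints nothing if every version passes. Input must be sorted oldest first.
first_bad() {
  local versions=("$@")
  local lo=0 hi=${#versions[@]} mid
  while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi) / 2 ))
    if is_good "${versions[$mid]}"; then
      lo=$(( mid + 1 ))   # still working here: the break is later
    else
      hi=$mid             # broken here: the break is here or earlier
    fi
  done
  if [ "$lo" -lt "${#versions[@]}" ]; then
    echo "${versions[$lo]}"
  fi
}

# Example with only the versions named in this thread; a real run would pass
# the full list of Alpha releases instead.
first_bad 1409.7.0 1465.2.0 1478.0.0 1520.1.0
```

The release immediately before the one printed is then the latest working release.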
Issue Report
Bug
In development environments using VirtualBox and Vagrant on OSX, writing to an NFS share hangs under some write loads. The hang occurs with the current alpha (1478.0.0) and beta (1465.2.0) but NOT with the current stable (1409.7.0).
The procedure shown below is just one way to hang the process, and it works 100% of the time for me. Anything that writes to the NFS share can hang: untarring a tarball, running Ruby's `bundle install`, etc. The hang happens both with no containers and when writing from Docker containers.
Container Linux Version
Environment
Expected Behavior
The write completes successfully.
Actual Behavior
The write process hangs and cannot be killed. If the write is from a container process, the container cannot be stopped.
Reproduction Steps
1. git clone https://github.com/coreos/coreos-vagrant
2. cd coreos-vagrant
3. echo '$share_home=true' > config.rb
4. vagrant up
5. vagrant ssh -c "cd $PWD && mkdir hangme && cp -rv /usr/share/ hangme/"
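Because the final step blocks indefinitely when the bug triggers, it may be convenient to run it under a deadline so a hang is reported rather than tying up the terminal. This is a minimal sketch, assuming GNU coreutils `timeout` is available on the host (on OSX it is commonly installed as `gtimeout`, which can be selected via the `TIMEOUT_BIN` variable introduced here). The function name and the 300-second limit in the usage comment are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run a command with a deadline and report whether it finished.

# Assumption: GNU `timeout`; override with TIMEOUT_BIN=gtimeout if needed.
TIMEOUT_BIN="${TIMEOUT_BIN:-timeout}"

# run_with_deadline SECONDS CMD [ARGS...]
# Prints a one-line verdict and returns the exit status from timeout.
run_with_deadline() {
  local seconds="$1"; shift
  "$TIMEOUT_BIN" "$seconds" "$@"
  local status=$?
  case "$status" in
    0)   echo "completed" ;;
    124) echo "hung (no exit within ${seconds}s)" ;;  # GNU timeout's timed-out status
    *)   echo "failed with status ${status}" ;;
  esac
  return "$status"
}

# Usage (not executed here):
#   run_with_deadline 300 vagrant ssh -c "cd $PWD && mkdir hangme && cp -rv /usr/share/ hangme/"
```

Note that the deadline only ends the host-side `vagrant ssh` session; the stuck process inside the VM may still be unkillable, as described under Actual Behavior.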
Other Information
If the NFS mount options are tuned to be "soft" with a timeout, I've seen the writing process time out while trying to close the file. Other workarounds for similar OSX/Vagrant NFS permissions issues seem to have no effect; for example, `ls -alR > /dev/null` makes no difference.
The hang happens with older versions of VirtualBox and Vagrant as well, but seemingly not with older versions of CoreOS.
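For reference, a "soft" NFS mount with a timeout might look roughly like the command built below. This is a sketch under assumptions: the exact options used in the experiment above are not stated, so the `timeo` and `retrans` defaults, the server address, and the paths are all illustrative. Per nfs(5), `timeo` is measured in tenths of a second, and with `soft` the client returns an error to the application after `retrans` retries instead of retrying indefinitely.

```shell
#!/usr/bin/env bash
# Sketch: print an NFS mount command that uses "soft" semantics.
# All values are illustrative assumptions, not the reporter's settings.

# nfs_soft_mount_cmd SERVER EXPORT MOUNTPOINT [TIMEO_DECISECONDS] [RETRANS]
nfs_soft_mount_cmd() {
  local server="$1" export_path="$2" mountpoint="$3"
  local timeo="${4:-50}" retrans="${5:-3}"   # 50 deciseconds = 5 s per attempt
  printf 'mount -t nfs -o soft,timeo=%s,retrans=%s %s:%s %s\n' \
    "$timeo" "$retrans" "$server" "$export_path" "$mountpoint"
}

# Example with a documentation-range address and hypothetical paths:
nfs_soft_mount_cmd 192.0.2.1 /Users/me /mnt/share
```

The function only prints the command; running it inside the VM would require root and any other options the existing mount already uses.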
The same steps, when run using the current stable branch, complete successfully. This can be tried by adding a "3.5" step to the above: `sed -i '' -e 's/alpha/stable/' Vagrantfile`.