`vagrant-spk up` hangs on "default: Running: inline script"

dwrensha commented 8 years ago

On OS X 10.10.5 with Vagrant 1.7.2, if I run `vagrant-spk setupvm uwsgi` and then `vagrant-spk up`, I see the following output, and then the script hangs:

Calling 'vagrant' 'up' in /Users/dwrensha/Desktop/test-vagrant-spk/.sandstorm
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'debian/jessie64'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'debian/jessie64' is up to date...
==> default: A newer version of the box 'debian/jessie64' is available! You currently
==> default: have version '8.1.0'. The latest is version '8.2.0'. Run
==> default: `vagrant box update` to update.
==> default: Setting the name of the VM: test-vagrant-spk_sandstorm_1444573630
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 6080 => 6080 (adapter 1)
    default: 22 => 2222 (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2222
    default: SSH username: vagrant
    default: SSH auth method: private key
    default: Warning: Connection timeout. Retrying...
    default: 
    default: Vagrant insecure key detected. Vagrant will automatically replace
    default: this with a newly generated keypair for better security.
    default: 
    default: Inserting generated public key within guest...
    default: Removing insecure key from the guest if its present...
    default: Key inserted! Disconnecting and reconnecting using new SSH key...
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Mounting shared folders...
    default: /opt/app => /Users/dwrensha/Desktop/test-vagrant-spk
    default: /vagrant => /Users/dwrensha/Desktop/test-vagrant-spk
    default: /host-dot-sandstorm => /Users/dwrensha/.sandstorm
==> default: Running provisioner: shell...
    default: Running: inline script
==> default: Created symlink from /etc/systemd/system/multi-user.target.wants/sandstorm.service to /etc/systemd/system/sandstorm.service.
==> default: Running provisioner: shell...
    default: Running: inline script

The problem goes away if I revert this commit: https://github.com/sandstorm-io/vagrant-spk/commit/116f9434c2b6c058dedc2f83763b799aa05330b4

dwrensha commented 8 years ago

I observed the same problem with the "lemp" and "static" stacks, but not with the "meteor" stack.

zarvox commented 8 years ago

Interesting. In a test I did with the lemp stack, setup.sh completes running (because I can see that the last change it makes is present on the FS), but never exits.

Reverted until we find another solution.

paulproteus commented 8 years ago

Fascinating. Was just going to investigate.

Drew, it seems like you did just revert it. Many thanks.

It also seems that you did a push to master without a pull request to close this, which means I didn't know you were fixing it. Pull requests cause email notifications, which are helpful for me. Can I convince you to do pull requests plus self-merges?

Also this suggests I should get on that task of making a test suite for vagrant-spk so we catch this stuff, not users.

zarvox commented 8 years ago

@paulproteus yeah, I can do the PR/self-merge thing in the future. Sorry 'bout that.

Also, I misspoke: `setup.sh` does exit; it's just that Vagrant never finishes provisioning, for whatever reason.

paulproteus commented 8 years ago

Fascinating. (And no huge deal about the self-merge; the important thing is that we're delivering better software thanks to you doing the revert.)

zarvox commented 8 years ago

The ssh process terminates and enters zombie state, but ruby-mri never reaps the child.

zarvox commented 8 years ago

No, that's only when I ^C the provisioning process and it doesn't kill the children properly, never mind. (Perhaps we need to set some option to subprocess to kill children when the leader exits?)
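
For reference, here is a rough sketch of the "kill children when the leader exits" idea, assuming a POSIX host and a Python wrapper that launches the command with `subprocess`. The helper name `run_in_own_group` and its `timeout` parameter are made up for illustration; this is not what vagrant-spk actually does. It starts the child in its own session, so the child and everything it spawns share one process group, and signals that whole group on ^C or timeout:

```python
import os
import signal
import subprocess


def run_in_own_group(argv, timeout=None):
    """Run argv as the leader of a new session/process group (hypothetical helper).

    If we are interrupted (^C) or the optional timeout expires, send SIGTERM
    to the whole group so grandchildren (e.g. an ssh process) are not orphaned.
    Returns the child's exit status (negative signal number if it was killed).
    """
    # start_new_session=True calls setsid() in the child, making it a
    # session and process-group leader, so its pgid equals its pid.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except (KeyboardInterrupt, subprocess.TimeoutExpired):
        try:
            os.killpg(proc.pid, signal.SIGTERM)
        except ProcessLookupError:
            # The group already exited between wait() and killpg().
            pass
        # Reap the child so it does not linger as a zombie.
        return proc.wait()
```

Something like `run_in_own_group(["vagrant", "up"])` would be the call site. One trade-off: because the child is in a separate session, a terminal ^C no longer reaches it directly, so the wrapper has to forward the signal as above.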

ruby-mri just sits there calling futex() every 4 seconds or so indefinitely.

Anyway, thanks to @dwrensha for bisecting the root cause.

sandstorm-io / vagrant-spk

`vagrant-spk up` hangs on "default: Running: inline script" #75