tmatilai / vagrant-proxyconf

Vagrant plugin that configures the virtual machine to use proxies
MIT License

Docker provider: Guest-specific operations were attempted on a machine that is not ready for guest communication. #181

Closed. h-no closed this issue 5 years ago.

h-no commented 5 years ago

When vagrant-proxyconf is installed, I get an error message (see below) when starting a container by running vagrant up for the first time. It seems to work if I run vagrant up a second time. I initially filed an issue with the vagrant project, but later discovered that it worked after uninstalling the plugin, so I'm filing the issue here instead. (That's why the issue follows their template.)

Vagrant version

Vagrant 2.2.3

vagrant-proxyconf version

vagrant-proxyconf (2.0.0, global)

Host operating system

macOS Mojave Version 10.14.2

Guest operating system

Docker provider, https://hub.docker.com/_/hello-world

Docker Community Edition Version 2.0.0.2 (30215) Docker Engine version 18.09.1, build 4c52b90

Vagrantfile

Vagrant.configure("2") do |config|
  config.vm.provider "docker" do |d|
    d.image = "hello-world"
  end
end

Debug output

First run: https://gist.github.com/h-no/d17bab1f77ffbe06f27689935e4d50a4
Second run: https://gist.github.com/h-no/7f877971dd533bce9d1374db953b8aa2

Expected behavior

The container should have run on the initial vagrant up command, regardless of whether the vagrant-proxyconf plugin is installed.

Actual behavior

Running vagrant up doesn't start the container; instead it gives an error. Running vagrant up again immediately after produces different output.

❯ vagrant up
Bringing machine 'default' up with 'docker' provider...
==> default: Creating the container...
    default:   Name: test_default_1547634778
    default:  Image: hello-world
    default: Volume: /Users/hannesna/vagrant/test:/vagrant
    default:
    default: Container created: b2769a85f40c5532
==> default: Starting container...
Guest-specific operations were attempted on a machine that is not
ready for guest communication. This should not happen and a bug
should be reported.

❯ vagrant up
Bringing machine 'default' up with 'docker' provider...
==> default: Vagrant has noticed that the synced folder definitions have changed.
==> default: With Docker, these synced folder changes won't take effect until you
==> default: destroy the container and recreate it.
==> default: Starting container...
==> default: Provisioners will not be run since container doesn't support SSH.

Steps to reproduce

  1. Create Vagrantfile with contents above
  2. Run vagrant up and notice that the container doesn't start and the error message
  3. Run vagrant up again, the container starts with different message
  4. Run vagrant destroy
  5. Run vagrant plugin uninstall vagrant-proxyconf
  6. Run vagrant up and notice that container starts ok

codylane commented 5 years ago

I apologize for the problems. I hope to have time to take a look into this in the next few days. Thank you for reporting and for providing all those details.

codylane commented 5 years ago

I've not had time to test this yet, but I noticed a few minor things that may contribute to the behavior you are seeing.

The first is that I've not tested this plugin on VirtualBox 6.x or Vagrant 2.2.3, and I noticed this in your debug log. The plugin may work with those versions; however, I've not been able to test them. It's possible there is a problem there as well, but I cannot say for certain.

Disclaimer: these are my words and I'm just a contributor, but I feel that this plugin is hard to grasp and understand when looking at the code, and I believe the issue you are seeing is buried deep in the code, possibly in areas of it that I just don't understand well enough to sort out. I suspect the problem manifests in is_enabled.rb, where the container is being configured while not in the :running state; but as I said, I'm not sure whether making a change there would cause different problems down the road.

When vagrant-proxyconf 2.0.0 is installed as a plugin, it will by default attempt to configure a proxy, which means the plugin enables all of the proxy providers by default; this may cause issues with some setups.

If you don't want to go through the hassle of uninstalling vagrant-proxyconf every time you want to disable the plugin, you can do something like this in the Vagrantfile.

FYI: this should be a workaround for the behavior you are seeing, and I would highly recommend updating all your Vagrantfiles to use this pattern whenever the vagrant-proxyconf plugin is installed:

Vagrant.configure("2") do |config|

    if Vagrant.has_plugin?("vagrant-proxyconf") && ENV['HTTP_PROXY'].nil? == false
      config.proxy.http      = "#{ENV['HTTP_PROXY']}"
      config.proxy.https     = "#{ENV['HTTPS_PROXY']}"
      config.proxy.no_proxy  = "#{ENV['NO_PROXY']}"
      config.apt_proxy.http  = "http://#{ENV['HTTP_PROXY']}"
      config.apt_proxy.https = "http://#{ENV['HTTPS_PROXY']}"
    else
      # If we get here either the vagrant-proxyconf is not installed or the current USER environment 
      # does not have the HTTP_PROXY environment variables set.
      # so we disable all vagrant-proxyconf plugins.
      config.proxy.enabled = false
    end

    config.vm.provider "docker" do |d|
      d.image = "hello-world"
    end
end
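
If you never want the plugin active for a given project, a shorter variant of the same idea (using only the config.proxy.enabled setting shown above) is:

Vagrant.configure("2") do |config|
  # Unconditionally disable vagrant-proxyconf for this project; the guard
  # makes this a no-op when the plugin isn't installed.
  config.proxy.enabled = false if Vagrant.has_plugin?("vagrant-proxyconf")

  config.vm.provider "docker" do |d|
    d.image = "hello-world"
  end
end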

We've attempted to highlight this in the documentation. I agree the plugin needs some more cleanup as well as a documentation overhaul, but my time is limited. If you have time to troubleshoot and implement a fix, we will accept a pull request.

Thanks for reporting this issue.

tmatilai commented 5 years ago

I'm not familiar with the Docker-specific parts of the code base either, as they were contributed by others, but could this be related to #81?

codylane commented 5 years ago

I've determined that this message

Guest-specific operations were attempted on a machine that is not
ready for guest communication. This should not happen and a bug
should be reported.

is caused by a race condition that happens when using Docker and the container is in the :stopped state, so there needs to be some new logic in is_enabled.rb to wait for the "guest" to be in the :running state before it can be configured.
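
For illustration, here is a minimal sketch of such wait logic (not the plugin's actual code; wait_for_running is a hypothetical helper built on Vagrant's machine.state API):

def wait_for_running(machine, timeout: 60, interval: 1)
  # Poll the provider until it reports the :running state, giving up after
  # `timeout` seconds so a dead container can't hang vagrant up forever.
  deadline = Time.now + timeout
  until machine.state.id == :running
    return false if Time.now > deadline
    sleep interval
  end
  true
end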

codylane commented 5 years ago

Hey there,

I'm not familiar with the Docker-specific parts of the code base either, as they were contributed by others, but could this be related to #81?

The result of the first-time run noted here is related to this issue. The root cause is that is_enabled.rb#plugin_enabled? returns true by default when the vagrant-proxyconf plugin is installed and there is no configuration for the plugin. This causes a race condition that happens only sometimes, because the container is not in the running state before this line attempts to "communicate" with the guest and hands off to base.rb#call.

The easiest first solution that I can see right now is to change this line to something like this:

        def plugin_enabled?(config)
          # Disable the plugin if there is no configuration for it, to prevent
          # the guest from being configured at all.
          disabled = config.enabled == false || config.enabled.nil? || config.enabled == '' || config.enabled == {}
          return false if disabled

          # If we get here, we assume the proxy plugin is enabled; however, we still need to
          # make sure the guest is in the :running state. Perhaps
          # https://github.com/josenk/vagrant-vmware-esxi/blob/21cd179cde2f9c2fc9aa77da74e72df679719f19/lib/vagrant-vmware-esxi/action/wait_for_state.rb
          # will help.
          true
        end

FWIW, this is the reason I commented out the conditional to "skip" configuring the plugin here. However, when that is uncommented, we set ourselves up for failure if we hit the race condition of the guest not being in the proper state, AND we lose the ability to disable the proxy after it has been configured.

However, the race condition of "waiting" for the guest to be in a state where we can communicate with it is the bigger challenge and will require more refactoring. I'm not sure the time to implement a fix is worth it, as there are much easier ways to run docker without using Vagrant.

tmatilai commented 5 years ago

Sorry, too tired now to dive into the debug logs and code, but Vagrant has a WaitForCommunicator action that should run earlier in the middleware stack. So there really is something strange (or a regression) there.

codylane commented 5 years ago

No problem @tmatilai. I agree it feels like there might be an edge condition and/or a regression that caused this behavior, but it's hard to troubleshoot since it only happens every once in a while. It's also hard to find examples of how to correctly configure the middleware stack without sifting through tons of Vagrant plugins and providers.

I do wonder how hard it would be to hook into WaitForCommunicator there. I'll take a look and respond back if I come up with something.
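
For reference, here is a hedged sketch of how a plugin can position an action relative to WaitForCommunicator in Vagrant's middleware stack (the plugin and action class names are hypothetical, not vagrant-proxyconf's actual registration):

class MyConfigureAction
  def initialize(app, env)
    @app = app
  end

  def call(env)
    # The guest is reachable at this point; proxy configuration would go here.
    @app.call(env)
  end
end

class MyProxyPlugin < Vagrant.plugin('2')
  name 'my-proxy-plugin'

  # Attach to the `up` action and run our configuration action only after
  # Vagrant's own WaitForCommunicator has confirmed the guest is reachable.
  action_hook(:wait_then_configure, :machine_action_up) do |hook|
    hook.after(Vagrant::Action::Builtin::WaitForCommunicator, MyConfigureAction)
  end
end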

h-no commented 5 years ago

@codylane Let me know if you need help reproducing; I manage to reproduce it 100% of the time on two different computers.

codylane commented 5 years ago

@h-no - It's all good, I can also reproduce this. I'm fairly certain I know what is causing this, and it's buried deep in the Vagrant middleware stack.

Basically, it boils down to a problem where, when using the Docker provider, the guest (i.e. the hello-world container in your example) isn't always in the "running" state before Vagrant tries to communicate with it. I created a small bash script yesterday to reproduce this behavior.

This probably could be better off as a gist, but....

Environment Conditions

Given this Vagrantfile

# Vagrantfile

Vagrant.configure("2") do |config|

    config.vm.provider "docker" do |d|
      d.image = "hello-world"
    end
end

And running this

for i in $(seq 1 10)
do
  vagrant destroy --force; vagrant up
done

Will reproduce the error you see:

==> default: Deleting the container...

Bringing machine 'default' up with 'docker' provider...
==> default: Creating the container...
    default:   Name: issue_180_default_1548094272
    default:  Image: hello-world
    default: Volume: ./:/vagrant
    default:  
    default: Container created: e6b92d5e3689e7f9
==> default: Starting container...
Guest-specific operations were attempted on a machine that is not
ready for guest communication. This should not happen and a bug
should be reported.

The Workaround (for now)

Vagrant.configure("2") do |config|

    if Vagrant.has_plugin?("vagrant-proxyconf") && ENV['HTTP_PROXY'].nil? == false
      config.proxy.http      = "#{ENV['HTTP_PROXY']}"
      config.proxy.https     = "#{ENV['HTTPS_PROXY']}"
      config.proxy.no_proxy  = "#{ENV['NO_PROXY']}"
      config.apt_proxy.http  = "http://#{ENV['HTTP_PROXY']}"
      config.apt_proxy.https = "http://#{ENV['HTTPS_PROXY']}"
    else
      # If we get here either the vagrant-proxyconf is not installed or the current USER environment 
      # does not have the HTTP_PROXY environment variables set.
      # so we disable all vagrant-proxyconf plugins.
      config.proxy.enabled = false
    end

    config.vm.provider "docker" do |d|
      d.image = "hello-world"
    end
end

When using that loop (above), after 10 runs I don't see the "cannot communicate with guest" error. This is not the permanent fix, just a temporary one; it "masks" the problem rather than fixing it.

==> default: Deleting the container...

Bringing machine 'default' up with 'docker' provider...
==> default: Creating the container...
    default:   Name: issue_180_default_1548094529
    default:  Image: hello-world
    default: Volume: ./:/vagrant
    default:  
    default: Container created: 686093a93814b190
==> default: Starting container...
==> default: Skipping configuration of apt_proxy
==> default: Skipping configuration of chef_proxy
==> default: Skipping configuration of docker_proxy
==> default: Skipping configuration of env_proxy
==> default: Skipping configuration of git_proxy
==> default: Skipping configuration of npm_proxy
==> default: Skipping configuration of pear_proxy
==> default: Skipping configuration of svn_proxy
==> default: Skipping configuration of yum_proxy
==> default: Provisioners will not be run since container doesn't support SSH.

codylane commented 5 years ago

FYI - The pull request I just linked to this issue should address the first case:

The vagrant-proxyconf plugin is installed AND no config.proxy.* settings are present in the Vagrantfile:

Vagrant.configure("2") do |config|

    config.vm.provider "docker" do |d|
      d.image = "hello-world"
    end
end

The pull request addresses this problem by disabling the vagrant-proxyconf plugin.

The second feature of this pull request is additional logic that checks whether a proxy should be configured.

Given either of the following environment variables being set

export HTTP_PROXY=http://$my_proxy_host:$my_proxy_port

or

export HTTPS_PROXY=https://$my_proxy_host:$my_proxy_port

AND

Given this Vagrantfile

Vagrant.configure("2") do |config|

    if Vagrant.has_plugin?("vagrant-proxyconf")
      config.proxy.http      = "#{ENV['HTTP_PROXY']}"
      config.proxy.https     = "#{ENV['HTTPS_PROXY']}"
      config.proxy.no_proxy  = "#{ENV['NO_PROXY']}"
    end

    config.vm.provider "docker" do |d|
      d.image = "hello-world"
    end
end

The vagrant-proxyconf plugin would be enabled for all capable providers on the guest. However, if you didn't set HTTP_PROXY or HTTPS_PROXY in your environment before running vagrant up or vagrant provision, there is new logic that prevents vagrant-proxyconf from attempting to configure your guest.

The new logic is an attempt to auto-configure the vagrant-proxyconf providers based on your current environment proxy variables.
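
As an illustration of what such auto-detection could look like (a sketch only; proxy_from_env is a hypothetical helper, not the actual code from the pull request):

def proxy_from_env
  # Collect the conventional proxy variables, accepting both upper- and
  # lower-case spellings, and drop any that are unset or empty.
  {
    'http'     => ENV['HTTP_PROXY']  || ENV['http_proxy'],
    'https'    => ENV['HTTPS_PROXY'] || ENV['https_proxy'],
    'no_proxy' => ENV['NO_PROXY']    || ENV['no_proxy']
  }.reject { |_key, value| value.nil? || value.empty? }
end

# With nothing set, the plugin would skip configuring the guest entirely:
skip_configuration = proxy_from_env.empty?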

codylane commented 5 years ago

2.0.1 has been released. Please re-open if this is still an issue.