vmware-archive / vmware-vmware_lib

VMware-common
Other
12 stars 27 forks source link

SystemStackError exception from transport.rb in puppet agent mode. #59

Closed crayfishx closed 8 years ago

crayfishx commented 8 years ago

When using vc_datacenter resources from the vmware_vcenter module I see the following behaviour from Puppet when run in agent/master mode using a manifest containing one

Error: Could not run Puppet configuration client: stack level too deep
Error: Could not run: no implicit conversion of Puppet::Util::Log into Integer

The same code compiles fine with puppet apply.

I've narrowed this down to the following class override in puppet_x/puppetlabs/transport.rb from vmware_lib

https://github.com/crayfishx/vmware-vmware_lib/commit/cbce7baab1ddc1d7635dd85d18590783a47b8f26#diff-4875e35939d031ba6fdcad5f69ef824bR8

This seems to cause puppet agent to get stuck in somewhat of a loop, calling it over a 10,000 times before bombing out with a SystemStackError (as opposed to the same puppet manifest using puppet apply which calls it just 3 times)

If I comment this out the catalog compiles fine under agent mode.

This is diving in quite deep into Puppet internals so I'm not sure what @nanliu was trying to achieve with this commit.

maniacmurphy commented 8 years ago

@crayfishx, what version of puppet are you running? Do you gave a sample manifest that throws this error?

ggeldenhuis commented 8 years ago

From slack channel:Testing against puppet 3.8.7, but I also get the same on 4.2.2 Puppet apply works fine for both declarations - but in agent/master mode I see the problem described in the ticket.

nanliu commented 8 years ago

You won't see immediate failures without this patch. This is a hook to ensures you close the remote connection, otherwise puppet leaves dangling connections. For example:

https://github.com/vmware/vmware-vmware_lib/blob/master/lib/puppet_x/puppetlabs/transport/ssh.rb#L47-L50

This was done in transaction to ensure puppet will close the connection even if resources fails. In newer versions of puppet, there might be hooks to permit this without monkey patching transaction.

nanliu commented 8 years ago

I believe the new mechanism is called post_resource_eval: https://github.com/puppetlabs/puppet/blob/master/lib/puppet/transaction.rb#L132

In this case you really only need it per transport, but it's hooked into the provider layer. Maybe you can write it in the parent provider and ensure it's only done once.

One more thing, puppet master/agent still does not have environment isolation for type/providers. So you may eventually converge on puppet apply instead of agent once you have multiple versions of code. Maybe you have some tricks up your sleeves getting around: https://tickets.puppetlabs.com/browse/SERVER-94.

@crayfishx, ping me on IRC if you have further questions. HTH.

crayfishx commented 8 years ago

@nanliu - do you have an example of how to simulate the problem with hanging connections without this patch? - I'm run puppet agent in daemon mode using the vsphere transport from vmware-vcenter module and no matter how I try to break it connections are not left open.... Does this only affect ssh connections?

Right now im having to comment this whole patch out in order for the module to work under puppet agent, but can't simulate the conditions that it was originally intended to fix.....

nanliu commented 8 years ago

@crayfishx, haven't worked on the code recently, but this should be the way to reproduce:

  1. the transport must implement .close(), otherwise this is inconsequential.
  2. the puppet agent/apply has to fail a vcenter resource (if the puppet run is successful, the normal transport closed will be invoked).
  3. run the puppet agent/apply until enough dangling connections stays open.

vsphere appears to implement close method. I guess the easiest way to verify if it's a problem for vsphere transport is to keep opening new rbvmomi connections without closing it and see if it ever gets rejected.

crayfishx commented 8 years ago

Fixed in #60