Open styeung opened 9 years ago
That's probably the stdout/stderr of the failure being larger than the size NATS allows per message. At some point in the past we added some code to catch that on the agent side and only send the last 100 lines of the message, but it's possible that the output had some really long lines (like megabytes in size). I thought we had a guard around that, but maybe not. I'm also not sure when bosh-lite's stemcell was last updated with the latest agent, but I assume it has this feature, since it was added several months ago.
It could also be some other message response being too long...
@cppforlife got any other ideas?
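To illustrate why a line-count cap alone may not be enough, here is a minimal sketch that caps output by both lines and bytes. `truncate_output` and the 65536-byte cap are hypothetical names and values chosen for illustration; this is not the agent's actual code.

```shell
#!/bin/bash
# Hypothetical helper (not from bosh-agent): keep at most the last
# $1 lines AND at most the last $2 bytes of whatever arrives on stdin.
truncate_output() {
  local max_lines=$1 max_bytes=$2
  tail -n "$max_lines" | tail -c "$max_bytes"
}

# One very long line passes through a line-only cap at full size...
head -c 5000000 /dev/zero | tr '\0' 'x' | tail -n 100 | wc -c

# ...while the additional byte cap bounds it.
head -c 5000000 /dev/zero | tr '\0' 'x' | truncate_output 100 65536 | wc -c
```

Note that a byte cap like this can cut a multi-byte character in half, so a real implementation would probably need to trim on a character boundary.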
On Tue, Feb 24, 2015 at 1:55 PM, Sai To Yeung notifications@github.com wrote:
Here https://gist.github.com/styeung/4e8b4d17057e4817e8df is our output from bosh task 3 --debug
With that release, on this particular box, we are able to reproduce this error.
Our next steps are to remove .blobs and .bosh/cache and try again.
You said bosh cli plugin? I'm a dummy.
On Feb 25, 2015, at 8:46 AM, JT Archie notifications@github.com wrote:
With that release, on this particular box, we are able to reproduce this error.
Our next steps are to remove .blobs and .bosh/cache and try again.
This is definitely stdout/stderr going over the limit due to how we use tar (verbose mode) in the Agent. The real problem here is that it fails to untar. This could be due either to an invalid package cache or to the compilation stage failing to tar up the package successfully. Since this is bosh-lite, the best way to go about it is to blow away that deployment and cf-release from the Director.
We'll adjust bosh-agent eventually to not log everything from the tar command.
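As a rough sketch of that idea (not the agent's implementation), the verbose listing can go to a temporary log file, with only its tail reported if extraction fails. `extract_quietly` and the 20-line tail are hypothetical choices for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper (not from bosh-agent): extract an archive verbosely,
# but send the listing to a log file and print only its last lines on failure.
extract_quietly() {
  local archive=$1 dest=$2 log
  log=$(mktemp)
  if tar -xzvf "$archive" -C "$dest" >"$log" 2>&1; then
    rm -f "$log"
    return 0
  fi
  echo "tar failed; last lines of output:" >&2
  tail -n 20 "$log" >&2
  rm -f "$log"
  return 1
}
```

Called as `extract_quietly package.tgz /some/dir`, a successful extraction prints nothing, so a package with many files no longer inflates the response.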
We are able to reproduce this error again. The strange part is that we can reproduce it on our CI machine but not on our dev machine, where the deployment of Bosh Lite and CF worked perfectly.
We were able to reproduce this bug by running this errand:
#!/bin/bash
# Print "Hello!" 2 * 1024 * 1024 times to produce a very large stdout.
for (( i = 0; i < 1024 * 1024 * 2; i++ )); do
  echo "Hello!"
done
This was on a bosh-init deployed vSphere director, so not sure this is necessarily a bosh-lite problem.
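For a sense of scale, the errand's output size can be worked out as below. The 1 MiB figure is an assumption based on the NATS server's default `max_payload`; the director's NATS may be configured differently.

```shell
#!/bin/bash
# Bytes written by one `echo "Hello!"` (six characters plus a newline).
line_bytes=$(echo "Hello!" | wc -c)
iterations=$((1024 * 1024 * 2))
total_bytes=$((line_bytes * iterations))

# ASSUMPTION: 1 MiB is the NATS default max_payload; not confirmed for this director.
assumed_limit=$((1024 * 1024))

echo "errand output: $total_bytes bytes"                   # 14680064
echo "assumed limit: $assumed_limit bytes"                 # 1048576
echo "ratio: $((total_bytes / assumed_limit))x the limit"  # 14x
```

So the errand writes about 14 MiB in short lines, which means a cap on line length alone would not keep a full, untruncated response under such a limit.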
It looks like the nats handler does not publish any of the message if it gets an error from the PerformHandler: https://github.com/cloudfoundry/bosh-agent/blob/fcb52b4f1aeae2c0c48e76c374b6f80354cbece5/mbus/nats_handler.go#L161-L164
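One possible direction, sketched here in shell rather than the agent's Go, is to always produce something publishable: the payload if it fits, otherwise a short error. `response_or_fallback`, the size limit argument, and the JSON shape are all hypothetical and only illustrate the idea.

```shell
#!/bin/bash
# Hypothetical sketch (not the agent's code): decide what to publish for a
# task result. Prints the payload if it fits within max_bytes, otherwise a
# short error message, so the caller always has something to send.
response_or_fallback() {
  local payload=$1 max_bytes=$2 size
  # $(( ... )) normalizes any padding some wc implementations add.
  size=$(( $(printf '%s' "$payload" | wc -c) ))
  if [ "$size" -le "$max_bytes" ]; then
    printf '%s\n' "$payload"
  else
    # Illustrative JSON shape only.
    printf '{"exception":{"message":"response too large: %d bytes (limit %d)"}}\n' \
      "$size" "$max_bytes"
  fi
}

response_or_fallback 'ok' 10
response_or_fallback 'aaaaaaaaaaaaaaaaaaaa' 10
```

With a fallback like this, the requester would at least see why the task result is missing.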
Hi,
We tried deploying a trusty branch of CF-Release (https://github.com/cloudfoundry/cf-release/tree/trusty64-rootfs) and got the following error:
The full error log can be found here.
We've been able to successfully deploy before, and this is the first time we've seen this error message. What's causing this?
Thanks,
Sai To