apigee / henchman

Orchestration and Automation tool
BSD 2-Clause "Simplified" License

Henchman sometimes fails with "unexpected end of JSON input" Error #95

Open baskaran-md opened 8 years ago

baskaran-md commented 8 years ago
02:06:39 echo '{"cmd":"yum install -y edge-router --enablerepo=epel --nogpgcheck","loglevel":"debug"}' | /bin/bash -c 'sudo -H -u root ${HOME}/.henchman/shell'
02:06:53 unexpected end of JSON input

No stack trace or error log found!

jlin21 commented 8 years ago

Is this being run with the latest release?

jlin21 commented 8 years ago

Actually, we're still fixing the latest release; something is up with the copy module.

jlin21 commented 8 years ago

should be ready to go now

baskaran-md commented 8 years ago
00:04:17.471 
00:04:29.129 time="2015-12-03T02:27:50Z" level=error msg="Error running task 'Install Apigee RPM'" error="While in exec_module :: While unmarshalling task results :: unexpected end of JSON input" host=10.17.6.32 plan="Install Router" task="Install Apigee RPM" 
jlin21 commented 8 years ago

The issue at hand is that some of the data returned by ssh.Exec is being dropped. If the output should be {"status": "changed", "msg": "hello world"}, it sometimes comes back as {"status": "changed", "msg": "hello. The truncated output triggers the JSON input error because it is not valid JSON. The temp fix is to append "} to the end of the output. This means outputs will be incomplete for the time being, but the user will be notified when this occurs.
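The truncation and the temp fix described above can be sketched roughly as follows. This is a minimal Python illustration, not henchman's actual code: the function name `parse_task_result` is hypothetical, and it assumes the output was cut off inside the final string value, as in the example above.

```python
import json


def parse_task_result(raw):
    """Parse module output, patching a truncated JSON string if needed.

    Returns (result, truncated). `truncated` is True when the closing
    quote and brace had to be appended, so the caller can warn the user.
    """
    try:
        return json.loads(raw), False
    except ValueError:
        # Assumption: the data was dropped inside the last string value,
        # so closing the string and the object is enough to parse it.
        # If the cut happened elsewhere, this second call still raises.
        return json.loads(raw + '"}'), True


complete = '{"status": "changed", "msg": "hello world"}'
cut_off = '{"status": "changed", "msg": "hello'

print(parse_task_result(complete))
print(parse_task_result(cut_off))
```

The patched result parses, but its "msg" value is still incomplete, which is why the flag matters: it lets the caller tell the user that part of the output is missing.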

jlin21 commented 8 years ago

Notes:

This could be an underlying issue with ssh.go

baskaran-md commented 8 years ago

Any help on this, @madhurranjan / @sudharsh? This keeps happening, and I couldn't proceed with the other tasks in the playbook unless I set "ignore_errors: true" for that task.

jlin21 commented 8 years ago

@baskaran-md Does the temp fix prevent henchman from throwing the "unexpected end of JSON input" error? It should. If it still happens, please let me know. We're still trying to find the root cause of this issue.

sudharsh commented 8 years ago

I have removed the explicit appending of '}' when this occurs, since more log output is needed.

jlin21 commented 8 years ago

@sudharsh did you get a chance to test this yet?

sudharsh commented 8 years ago

Here's what we know so far about reproducing this (it happens 1 in 3-5 times):

1.) The target node has to be amazon-linux. No problems so far on others, including centos 7.x, centos 6.5 and rhel 7.2.
2.) The Python version is 2.7.10.
3.) There should be at least one call to the Popen() constructor. It doesn't matter whether this invocation is used or is in the critical path. The moment a call to Popen() is made, things become unreliable.
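Since the failure is intermittent, a small harness that runs the module repeatedly and counts unparseable outputs can help measure how often it occurs. The sketch below is an assumption-laden illustration: `count_bad_outputs` and `STAND_IN` are hypothetical names, and the stand-in command runs locally rather than through sudo and SSH on an amazon-linux node, so it will not show the bug by itself.

```python
import json
import subprocess
import sys


def count_bad_outputs(argv, payload, runs=20):
    """Run argv `runs` times, feeding `payload` as JSON on stdin.

    Returns the number of runs whose stdout was not valid JSON.
    """
    bad = 0
    data = json.dumps(payload).encode()
    for _ in range(runs):
        proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE
        )
        out, _err = proc.communicate(data)
        try:
            json.loads(out.decode())
        except ValueError:
            # Truncated or empty output lands here.
            bad += 1
    return bad


# Hypothetical stand-in for the real shell module: reads the task as
# JSON from stdin and prints a JSON result to stdout.
STAND_IN = [
    sys.executable,
    "-c",
    "import json, sys; task = json.load(sys.stdin); "
    "print(json.dumps({'status': 'changed', 'msg': task['cmd']}))",
]

task = {"cmd": "echo hi", "loglevel": "debug"}
print(count_bad_outputs(STAND_IN, task, runs=5), "bad outputs out of 5")
```

To use it against a real target, `argv` would be replaced with the command that invokes the module on that node.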

sudharsh commented 8 years ago

fat-fingered 'Close and comment'

sudharsh commented 8 years ago

Tried downgrading 2.7.10 to 2.7.5 on amazon linux. No dice.

sudharsh commented 8 years ago

Here's another clue, https://bugs.python.org/issue19612

jlin21 commented 8 years ago

The issue was resolved when the shell module was ported to Go.

jlin21 commented 8 years ago

The Go shell module will break when the output reaches 1.4 mil in length.