brooklyncentral / brooklyn

This project has moved and is now part of the ASF
https://github.com/apache/incubator-brooklyn

curl failing with "couldn't connect to host" on install entity #882

Open aledsage opened 11 years ago

aledsage commented 11 years ago

When installing the MarkLogic entity, aws-ec2:us-east-1 has been misbehaving this afternoon.

It has been giving the error: curl: (7) couldn't connect to host. The same command has previously worked reliably, it succeeded on a subset of the VMs being started concurrently, and it worked when I ssh'ed in to run it manually.

The command is:

curl -L -o  ~/MarkLogic--7.0-20130513.x86_64.rpm -L -O --user theusername:thepassword http://www.marklogic.com/download/MarkLogic--7.0-20130513.x86_64.rpm &

We need to write our entities to be more resilient to this kind of transient error.

aledsage commented 11 years ago

Suggest we do something like this:

fileName=MarkLogic--7.0-20130513.x86_64.rpm
triesRemaining=3
while [ $triesRemaining -gt 0 ]; do
  # --retry handles transient errors within one attempt; --continue-at - resumes a partial download
  curl -L --retry 4 --continue-at - -o ~/$fileName --user theusername:thepassword http://www.marklogic.com/download/$fileName
  result=$?
  if [ $result -eq 0 ]; then
    triesRemaining=0
  else
    triesRemaining=$(( triesRemaining - 1 ))
    echo "Error downloading $fileName ($triesRemaining attempts remaining)"
    sleep 10
  fi
done

Note the --retry 4 for transient errors, the --continue-at - so that if the while loop tries again it picks up where it left off, and the while loop that will try the entire command 3 times.

For a general Brooklyn solution, perhaps we want to retry (and sleep 10) only on specific error codes such as 7 (couldn't connect to host).
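A minimal sketch of retrying only on chosen exit codes might look like the following. The function name retry_on_codes and its argument order are hypothetical, not part of Brooklyn. As far as I can tell from the curl documentation, --retry on its own does not treat every connection failure as transient, so an outer check on exit code 7 may still be needed.

```shell
# Hypothetical helper (not part of Brooklyn): rerun a command only when it
# fails with one of the listed exit codes.
# Usage: retry_on_codes <max_attempts> <sleep_seconds> "<codes>" <command...>
#   <codes> is a space-separated list, e.g. "7 28"
retry_on_codes() {
  local max_attempts="$1"
  local delay="$2"
  local codes="$3"
  shift 3
  local attempt=1
  local result
  while true; do
    "$@"
    result=$?
    # Success: nothing more to do.
    if [ "$result" -eq 0 ]; then
      return 0
    fi
    # A code that is not in the list is returned immediately, without retrying.
    case " $codes " in
      *" $result "*) ;;
      *) return "$result" ;;
    esac
    # Retryable code, but the attempts are used up: return the last exit code.
    if [ "$attempt" -ge "$max_attempts" ]; then
      return "$result"
    fi
    echo "Exit code $result on attempt $attempt of $max_attempts; retrying in ${delay}s" >&2
    attempt=$(( attempt + 1 ))
    sleep "$delay"
  done
}
```

Under these assumptions the download above could be wrapped as: retry_on_codes 3 10 "7" curl -L --continue-at - -o ~/$fileName --user theusername:thepassword http://www.marklogic.com/download/$fileName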

ahgittin commented 11 years ago

something like that makes sense

alternatively wdyt about doing it from brooklyn java? using the new tasks stuff we could have a repeater task factory:

String curl = "...";
repeater(ssh(curl)).until(Predicates.equalTo(0))
    .every(Duration.ONE_SECOND).timeout(Duration.FIVE_MINUTES)
    .queue();

(where queue does the submission logic)
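For cases where the command sits in the middle of a larger script, the same until/every/timeout semantics could be approximated in shell. This is only a sketch: repeat_until_success is a hypothetical name, and it is not the Brooklyn repeater API proposed above.

```shell
# Hypothetical helper: rerun a command every <interval> seconds until it
# exits 0 or <timeout> seconds have elapsed since the first attempt.
# Usage: repeat_until_success <interval_seconds> <timeout_seconds> <command...>
repeat_until_success() {
  local interval="$1"
  local timeout="$2"
  shift 2
  local deadline
  local result
  deadline=$(( $(date +%s) + timeout ))
  while true; do
    "$@"
    result=$?
    if [ "$result" -eq 0 ]; then
      return 0
    fi
    # Stop once the deadline has passed, reporting the last exit code.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return "$result"
    fi
    sleep "$interval"
  done
}
```

The timeout is only checked between attempts, so a single long-running command is not interrupted; that is a deliberate simplification of this sketch.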

aledsage commented 11 years ago

In the MarkLogic case, the curl command is in the middle of a bigger script, so the Java approach doesn't really apply.

But where it's a single command, then yes, that makes sense. And I prefer writing loops/repeaters in Java rather than in bash...

It's important to have the script version as well. It would be nice to build up a library of script methods that we upload to a given machine and can call, so we can do more in scripts. But that's another topic.

ahgittin commented 11 years ago

leveraging @rerun would be cool: https://github.com/rerun/rerun/wiki/Tutorial
also see Donnie Berkholz's blog: http://dberkholz.com/2011/04/07/bash-shell-scripting-libraries/