documentcloud / cloud-crowd

Parallel Processing for the Rest of Us
https://github.com/documentcloud/cloud-crowd/wiki
MIT License

RestClient::ServerBrokeConnection error when submitting any job #15

Closed antunderwood closed 14 years ago

antunderwood commented 14 years ago

I recently updated from CloudCrowd 0.2.7 to 0.4.1.

I am now running into a problem when submitting jobs. Initially I thought it was my code, but even the example from the wiki (below) fails with this error:

    RestClient::ServerBrokeConnection: RestClient::ServerBrokeConnection
      from /usr/local/lib/ruby/gems/1.8/gems/rest-client-1.5.0/lib/restclient/request.rb:137:in `transmit'
      from /usr/local/lib/ruby/gems/1.8/gems/rest-client-1.5.0/lib/restclient/request.rb:55:in `execute'
      from /usr/local/lib/ruby/gems/1.8/gems/rest-client-1.5.0/lib/restclient/request.rb:30:in `execute'
      from /usr/local/lib/ruby/gems/1.8/gems/rest-client-1.5.0/lib/restclient.rb:72:in `post'
      from (irb):22

It did the same with version 1.4.2 of the rest-client gem.

Do you have any suggestions as to what the problem could be?

Thanks Anthony

    require 'rubygems'
    require 'rest_client'
    require 'json'

    # Submit the wiki's example job to the central server.
    RestClient.post('http://localhost:9173/jobs', { :job => {
      'action'  => 'structural_analysis',
      'inputs'  => [
        'http://www.gutenberg.org/a_midsummers_nights_dream.txt',
        'http://www.gutenberg.org/romeo_and_juliet.txt',
        'http://www.gutenberg.org/titus_andronicus.txt',
      ],
      'options' => { 'limit' => 20, 'variance' => 0.75 }
    }.to_json })

jashkenas commented 14 years ago

Hi Anthony.

I'm not sure what the precise problem is, but it looks like it's happening on the other end of the connection from that stack trace. If you could look at both the "server.log" and the "node.log", and see what the specific exception was, that would be very helpful.

Note that there is no actual "structural_analysis" action installed with CloudCrowd -- that's just a made-up example.
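For the log check suggested above, a minimal sketch along these lines may save some scrolling. The helper name `recent_errors`, the match pattern, and the `log/server.log` and `log/node.log` paths relative to the configuration directory are assumptions for illustration, not part of CloudCrowd:

```ruby
# Hypothetical helper (not part of CloudCrowd): return the last `limit`
# lines of a log file that look like an error. The pattern is a guess at
# what is worth reading, not a format CloudCrowd guarantees.
def recent_errors(path, limit = 20, pattern = /error|exception|failed/i)
  return [] unless File.exist?(path)
  File.readlines(path).grep(pattern).last(limit).map { |line| line.chomp }
end

# Assumed log locations; adjust to wherever your logs are written.
%w[log/server.log log/node.log].each do |path|
  puts "== #{path}"
  puts recent_errors(path)
end
```

Run it from the directory that contains `log/`. A missing file simply yields no lines.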

jashkenas commented 14 years ago

Anthony replied via message:

log/node.log reads

    Writing PID to /usr/local/cloud-crowd/tmp/pids/node.pid
    Thin web server (v1.2.4 codename Flaming Astroboy)
    Maximum connections set to 1024
    Listening on 0.0.0.0:9063, CTRL+C to stop
    Failed to connect to the central server (http://158.119.147.51:9173). Exiting!

Not quite sure what is going on; it was OK at version 0.2.7.

Thanks for any assistance. Anthony

It looks like your node never started because it could not connect to the central server. Make sure that the server is started first, and that the node can connect to it and starts cleanly, and you should be ready to go.

Remember that you can always visit the Operations Center to check that your nodes are online...
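As a rough way to confirm that the central server is answering before starting a node or posting a job, something like the following could be used. It is only a sketch: `server_reachable?` is a hypothetical helper, and it checks nothing more than whether something speaks HTTP at the URL, not that it is a healthy CloudCrowd server.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of CloudCrowd): true if anything answers
# an HTTP HEAD request at the URL's host and port, false if the
# connection is refused, times out, or is dropped.
def server_reachable?(url, timeout = 2)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.open_timeout = timeout
  http.read_timeout = timeout
  http.start { |conn| conn.head('/') }
  true
rescue SystemCallError, SocketError, Timeout::Error, EOFError, Net::HTTPBadResponse
  false
end

if __FILE__ == $PROGRAM_NAME
  # Address taken from the job submission example earlier in this thread.
  puts server_reachable?('http://localhost:9173')
end
```

If this prints `false` from the machine running the node, the node will most likely fail to register as well.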

antunderwood commented 14 years ago

It turns out that this was because I still had some processes from the old version of CloudCrowd running. Once I had quit these and restarted the server and nodes, the problem went away.
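For anyone hitting the same thing after an upgrade, here is a small sketch for spotting a leftover process from its PID file. The helper names are hypothetical. The only path known from this thread is the `tmp/pids/node.pid` shown in the node log, and processes left over from an older install may have written their PID files elsewhere.

```ruby
# Hypothetical helpers (not part of CloudCrowd).

# True if a process with this PID exists. Signal 0 performs the
# existence and permission check without delivering a signal.
def pid_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false   # no such process
rescue Errno::EPERM
  true    # process exists but belongs to another user
end

# Returns the PID recorded in pid_file if that process is still running,
# otherwise nil (missing file, unreadable PID, or dead process).
def running_pid(pid_file)
  return nil unless File.exist?(pid_file)
  pid = File.read(pid_file).to_i
  (pid > 0 && pid_alive?(pid)) ? pid : nil
end

if __FILE__ == $PROGRAM_NAME
  # Path copied from the node log quoted earlier in this thread.
  pid = running_pid('/usr/local/cloud-crowd/tmp/pids/node.pid')
  puts(pid ? "node still running as PID #{pid}" : 'no running node found')
end
```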

I did run into another problem, though. In my old actions, after the split I converted the outputs to JSON, as per http://gist.github.com/185010. This is no longer needed (as I now see in commit http://github.com/documentcloud/cloud-crowd/commit/d135b5dc5439b294a9f562a07918b9fd7fd4c9b5). Could the example action in the gist be updated, since it is referred to in the wiki?

Thanks, Anthony

jashkenas commented 14 years ago

Updated the gist. Thanks for the note.