Closed antonst closed 8 years ago
btw, the second attempt succeeded
Slightly off topic, but is it possible to get rid of this message:
2016-05-13 16:03:09,738: radical.pilot : MainProcess : Thread-1 : ERROR : Couldn't call manager callback (no pilot instance)
When it pops up in the middle of the printed messages, it is quite confusing for the user:
================================================================================
EnsembleMD (0.4-RC0)
================================================================================
Starting Allocation2016-05-13 16:03:09,738: radical.pilot : MainProcess : Thread-1 : ERROR : Couldn't call manager callback (no pilot instance)
ok
Verifying pattern ok
Starting pattern execution ok
--------------------------------------------------------------------------------
Executing 1 instances of 1 stages on 1 allocated core(s) on 'xsede.stampede'
Job waiting on queue...
Job is now running !
Waiting for stage_1 to complete. done
--------------------------------------------------------------------------------
Pattern execution successfully finished
Starting Deallocation..
Resource allocation cancelled. done
This looks like it's coming from the Pilot layer (or possibly SAGA):
Copying agent configuration file 'file://localhost/tmp/rp_agent_cfg_dirooWNz5/agent_0.cfg' to sandbox (file://localhost/home/antontre/radical.pilot.sandbox/rp.session.workflow.iu.xsede.org.antontre.016934.0000-pilot.0000/).
failed to run bootstrap: (127)(/bin/sh: /home/antontre/.saga/adaptors/shell_job/wrapper.sh: No such file or directory
I'll ping Andre regarding this once he is online.
That message shouldn't actually appear if you are setting verbosity to REPORT. Wait, that's an RP message. Do you have any value set for RADICAL_PILOT_VERBOSE?
no, I don't, RADICAL_PILOT_VERBOSE is set to None
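For reference, a minimal sketch of how one might check this from Python. It assumes that "set to None" means the variable is simply absent from the environment; the helper name describe_verbosity is hypothetical and not part of RP.

```python
import os


def describe_verbosity(name="RADICAL_PILOT_VERBOSE", env=None):
    """Report whether an environment variable is set (hypothetical helper)."""
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    value = env.get(name)  # .get() returns None when the variable is unset
    if value is None:
        return f"{name} is unset"
    return f"{name}={value}"


print(describe_verbosity())
```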
Re the wrapper problem: yeah, that has appeared a couple of times already. It seems a SAGA change is not as backward compatible as we thought. What helps is to run rm -rf ~/.saga on the target machine.
Error log: the reporter lets all messages with ERROR level through. The only fix would thus be to lower the log level of that message. I'll do that if we happen to push a new release; otherwise that's something we'll have to live with...
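To illustrate the point about log levels, here is a rough sketch using only Python's standard logging module. It is not RP's actual reporter code: the logger name demo.pilot and the handler setup are assumptions made for the example. It shows that a handler with an ERROR threshold emits the message when it is logged as an error, and drops the same text once it is logged at WARNING.

```python
import io
import logging

# Hypothetical stand-in for a reporter-style output channel (not RP's code).
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setLevel(logging.ERROR)  # only ERROR and above get through
handler.setFormatter(logging.Formatter("%(name)s : %(levelname)s : %(message)s"))

log = logging.getLogger("demo.pilot")  # hypothetical logger name
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the demo isolated from the root logger
log.addHandler(handler)

msg = "Couldn't call manager callback (no pilot instance)"
log.error(msg)    # meets the ERROR threshold, so it is emitted
log.warning(msg)  # same text at a lower level is filtered out

output = stream.getvalue()
print(output, end="")
```

Only the first call produces a line, which is why demoting the message on the RP side would keep it out of the user-facing output.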
Thanks Andre. I'm closing the ticket since it's a known issue. Please let me know if there isn't a ticket for this in saga/rp, and I'll create one.
I tried to run get_started.py locally on a workflow machine by:
and got this error: