Closed antonst closed 9 years ago
Hi Taisung, I will address your issues one by one.
1 You simply need to specify your project in the config file, like so:
"input.PILOT": { "resource": "stampede.tacc.utexas.edu", "username" : "antontre", "project" : "TG-MCB140109", "runtime" : "20", "cleanup" : "False", "cores" : "16" },
This is also covered in the config.info file: https://github.com/radical-cybertools/RepEx/blob/master/src/radical/repex/config/config.info#L37 but I agree this should be explained explicitly in the readme.
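As a quick sanity check before submitting, something like the following could catch a missing "project" entry locally instead of at sbatch time. This is only a sketch and not part of RepEx: the helper name missing_pilot_keys and the list of keys treated as required are assumptions made for illustration, and the example config simply wraps the "input.PILOT" block shown above in braces.

```python
import json

# Assumption for illustration: these are the keys we want to see in
# "input.PILOT" before submitting. RepEx itself may require a different set.
REQUIRED_PILOT_KEYS = ("resource", "username", "project", "runtime", "cores")


def missing_pilot_keys(config_text):
    """Return the required "input.PILOT" keys that are absent or empty."""
    config = json.loads(config_text)
    pilot = config.get("input.PILOT", {})
    return [key for key in REQUIRED_PILOT_KEYS if not pilot.get(key)]


# The "input.PILOT" block from the reply above, wrapped in a JSON object.
EXAMPLE_CONFIG = """
{
    "input.PILOT": {
        "resource": "stampede.tacc.utexas.edu",
        "username": "antontre",
        "project": "TG-MCB140109",
        "runtime": "20",
        "cleanup": "False",
        "cores": "16"
    }
}
"""

if __name__ == "__main__":
    missing = missing_pilot_keys(EXAMPLE_CONFIG)
    if missing:
        print("Missing input.PILOT keys: " + ", ".join(missing))
    else:
        print("input.PILOT looks complete")
```

If "project" is left out, the helper reports it, which corresponds to the case where sbatch refuses the job because the account has multiple projects to charge to.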
2 You are right, there is no script for scheme 1. I agree this is a bit confusing; the main reason is that scheme 1 differs from scheme 2 only in the number of CPU cores used. Again, the documentation should address this confusion. (Fixed!)
3 Fixed!
4 Agreed! This is what I am currently working on. The user's working directory should not be in src. I will follow up on this later.
Thanks, Antons
1 The critical one that prevents me from going further is that I cannot find a place to put my XSEDE account info. I have two accounts and I need to specify which one to use, otherwise the job submission fails:
ga_sshtaisung%h_%p.tg458185.ctrl tg458185@stampede.tacc.utexas.edu
2014:07:17 12:16:53 radical.pilot.PilotLauncherWorker-1: [ERROR ] Pilot launching failed: Couldn't get job id from submitted job! sbatch output:
--> Verifying valid submit host (login1)...OK
--> Verifying valid jobname...OK
--> Enforcing max jobs per user...OK
--> Verifying availability of your home dir (/home1/00661/tg458185)...OK
--> Verifying availability of your work dir (/work/00661/tg458185)...OK
--> Verifying availability of your scratch dir (/scratch/00661/tg458185)...OK
--> Verifying valid ssh keys...OK
--> Verifying access to desired queue (normal)...OK
--> Verifying job request is within current queue limits...OK
--> Checking available allocation
ERROR: You have multiple projects to charge to - please specify
which project you would like to charge against via the following
directive:
Note that your available projects are:
FAILED
(/usr/people/taisung/myenv/lib/python2.7/site-packages/saga_python-0.14-py2.7.egg/saga/adaptors/slurm/slurm_job.py +588 (_job_run) : " sbatch output:\n%s" % out))
2 The scheme numbering seems confusing: there are no Scheme #1 scripts in src/radical/repex. In https://github.com/radical-cybertools/RepEx, the examples for running Scheme #1 and Scheme #2 seem exactly the same.
3 In https://github.com/radical-cybertools/RepEx, there are some mistakes in how Amber is invoked: "namd_input.json" is used in some of the Amber examples.
4 I cannot copy everything in src/radical/repex to another location and run it. Maybe the right procedure needs to be documented? In my opinion, src/radical/repex should be a template; users should create their own working/data directory rather than working in src/radical/repex.
Taisung