dgordon562 opened this issue 7 years ago
> `fc_run.py cfg >&fc_run.out`

`>&` has a meaning in Bash. I assume you meant:

`fc_run.py cfg > fc_run.out`
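For anyone unsure of the distinction: in Bash, `cmd >& file` redirects *both* stdout and stderr to the file, while `cmd > file` redirects stdout only. A self-contained sketch (the `emit` helper and filenames are made up for illustration):

```shell
# emit: a hypothetical helper that writes one line to stdout and one to stderr.
emit() { echo out; echo err >&2; }

# '>' redirects stdout only; stderr is discarded separately here just to keep quiet.
emit > stdout_only.txt 2>/dev/null

# '>&' redirects BOTH stdout and stderr to the file (csh-style, accepted by bash).
emit >& both.txt
```

So `fc_run.py cfg >&fc_run.out` captures the log messages (which go to stderr) as well as stdout, whereas `> fc_run.out` alone would miss them.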
> `grep ... 'Num still unsatisfied'`
No, you cannot rely on that, since it relates to system vagaries.
Do you have a better idea?
thanks, Chris.
Does Falcon still look for the done flags? If so, which .py file has the code that looks for the done flags? If not, how does falcon detect when a qsub'd job has completed?
Thanks! David
While you're at it, could you point me to the file where the qsub is actually done: the equivalent of the old `_qsub_script` in `run.py`?
> Does Falcon still look for the done flags?

There are several process-watcher backends now. They all work differently.

- `pwatcher_type=blocking`: in that case, submitted jobs are done when the blocking calls return. Very simple, but re-acquiring a running job would never be possible. (That feature has never been implemented anyway.) If you're interested, I can show you how to use it.
- `fs_based` still works, but filesystems are inherently finicky. That relies on some done files (in the `mypwatcher` dir by default).
- `network_based` is similar to `fs_based`, but it relies on network socket communication to learn what is done. It also sends logs over the socket.

The dependency graph may also use some "done" files, but those are completely separate from job submissions. I plan (very soon) to make all tasks create a done file for the dependency graph. (I first had to switch to a newer, simpler workflow engine, and I had to change all of Jason's scripts to use it, including in FALCON-unzip. That's 15-20 scripts, so it took some time.)
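For reference, the watcher is selected in the fc_run config. A minimal sketch of the relevant fragment; only `pwatcher_type=blocking` comes from the discussion above, while the `[General]` section name and `job_type` key follow typical fc_run cfg files, with a placeholder value you should adjust to your scheduler:

```ini
[General]
; assumption: typical fc_run cfg layout; adjust to your setup
pwatcher_type = blocking
job_type = sge
```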
So things are getting much simpler, but progress takes time.
> While you're at it, could you point me to the file where the qsub is actually done? the equivalent of the old `_qsub_script` in `run.py`
You should see the qsub lines in the log if your code is completely up-to-date. We do not currently dump a bash script which contains the qsub line. But we do usually dump bash scripts on the remote hosts (e.g. `0-rawreads/prepare_rdb.sh`). The file `task.json` is the key. It is loaded by a python module/program called `pypeflow/do_task.py`. With `pwatcher_type=blocking`, I think that is run by `task.sh`, which is run by `run.sh`, which is run by `run-P....sh`, which is run by qsub.
Hi, Chris,
How about this idea:

1. Run falcon like this: `fc_run.py cfg >&fc_run.out`
2. Grep for these lines: `[INFO]Num still unsatisfied: 23`
3. To get the total number of daligner jobs, count the daligner lines in `0-rawreads/run_jobs.sh`.
4. The difference gives the number of daligner jobs completed.
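The counting idea above can be sketched in a couple of shell commands. This uses toy stand-in files so the sketch runs anywhere; in a real run you would point the greps at `0-rawreads/run_jobs.sh` and your actual `fc_run.out`, and the `^daligner` anchor may need loosening depending on how run_jobs.sh is formatted:

```shell
# Toy stand-in files for illustration only; substitute the real
# 0-rawreads/run_jobs.sh and fc_run.out from your assembly directory.
printf 'daligner -a\ndaligner -b\ndaligner -c\n' > run_jobs.sh
printf '[INFO]Num still unsatisfied: 2\n[INFO]Num still unsatisfied: 1\n' > fc_run.out

# Total daligner jobs = number of daligner lines in run_jobs.sh.
total=$(grep -c '^daligner' run_jobs.sh)

# Remaining = the most recent 'Num still unsatisfied' count in the log.
remaining=$(grep 'Num still unsatisfied' fc_run.out | tail -n 1 | awk '{print $NF}')

echo "daligner jobs completed: $((total - remaining))"   # -> daligner jobs completed: 2
```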
Will this work even if falcon has been restarted several times and fc_run.out is just the most recent copy of stdout/stderr of fc_run.py?
Do you have a better idea (that doesn't involve reading lots of files)?
Thanks! David