hillerlab / make_lastz_chains

Portable solution to generate genome alignment chains using lastz
MIT License

PBS jobs were submitted, but failed quickly #7

Closed Alex-Lin-kiz closed 1 year ago

Alex-Lin-kiz commented 2 years ago

Dear Prof @MichaelHiller

Thank you for developing such great software!

I'm having some problems with make_lastz_chains. Everything seems to work fine at first:

************ HgStepManager: executing step 'lastz' Wed Aug 31 22:44:00 2022.
doLastzClusterRun ....
testCleanState current step: doLastzClusterRun, /gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap-macFas/TEMP_run.lastz, /gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap-macFas/TEMP_run.lastz/lastz.done     previous step: doPartition, /gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap-macFas/TEMP_run.lastz/partition.done
# chmod a+x /gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap-macFas/TEMP_run.lastz/doClusterRun.sh
# /gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap-macFas/TEMP_run.lastz/doClusterRun.sh
+ cd /gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap-macFas/TEMP_run.lastz
+ gensub2 homSap.lst macFas.lst gsub jobList
+ parallel_executor.py lastz_homSapmacFas jobList -q day --memoryMb 10000 -e pbs --co None -p mem3T --eq 10
N E X T F L O W  ~  version 21.10.6
Launching `/gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap-macFas/TEMP_run.lastz/lastz_homSapmacFas/script.nf` [peaceful_booth] - revision: 01f1105dc4

But when the pipeline submits the jobs, they keep failing with errors:

executor >  pbs (12)
[bb/ba7c51] process > execute_jobs (13) [  0%] 4 of 2446, failed: 4, retries: 4
[70/224df0] NOTE: Process `execute_jobs (26)` terminated with an error exit status (127) -- Execution is retried (1)
[a4/c418e8] NOTE: Process `execute_jobs (1)` terminated with an error exit status (127) -- Execution is retried (1)
[06/393f6f] NOTE: Process `execute_jobs (2)` terminated with an error exit status (127) -- Execution is retried (1)
[19/e38b9b] NOTE: Process `execute_jobs (8)` terminated with an error exit status (127) -- Execution is retried (1)

executor >  pbs (13)
[d1/24b0e5] process > execute_jobs (3)  [  0%] 10 of 2452, failed: 10, retries: 10
[70/224df0] NOTE: Process `execute_jobs (26)` terminated with an error exit status (127) -- Execution is retried (1)
[a4/c418e8] NOTE: Process `execute_jobs (1)` terminated with an error exit status (127) -- Execution is retried (1)
[06/393f6f] NOTE: Process `execute_jobs (2)` terminated with an error exit status (127) -- Execution is retried (1)
[19/e38b9b] NOTE: Process `execute_jobs (8)` terminated with an error exit status (127) -- Execution is retried (1)
[ae/c82b60] NOTE: Process `execute_jobs (10)` terminated with an error exit status (127) -- Execution is retried (1)
[a8/693d83] NOTE: Process `execute_jobs (4)` terminated with an error exit status (127) -- Execution is retried (1)
[88/aff74e] NOTE: Process `execute_jobs (15)` terminated with an error exit status (127) -- Execution is retried (1)
[4c/e9351d] NOTE: Process `execute_jobs (11)` terminated with an error exit status (127) -- Execution is retried (1)
[40/8918e1] NOTE: Process `execute_jobs (16)` terminated with an error exit status (127) -- Execution is retried (1)
[0e/991e02] NOTE: Process `execute_jobs (20)` terminated with an error exit status (127) -- Execution is retried (1)

Could you help me?

Thank you very much

Xiaolin

MichaelHiller commented 2 years ago

Bogdan's workflow should resubmit jobs that crash. Do these jobs crash repeatedly until the whole pipeline stops? Please check whether these jobs run out of memory or runtime.
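A side note on the log above: exit status 127 is what a POSIX shell returns when it cannot find the command it was asked to run. One possibility, not confirmed anywhere in this thread, is that a tool the jobs need is not on PATH in the environment the PBS jobs start with. A minimal sketch of how to check this is below. The tool names are the ones that appear in the log, and `definitely_not_a_real_command_xyz` is a made-up name used only to demonstrate the status code.

```shell
#!/bin/sh
# Demonstrate the meaning of exit status 127: run a command that does not
# exist (a made-up name, for illustration only) in a child shell.
sh -c 'definitely_not_a_real_command_xyz' 2>/dev/null
status=$?
echo "exit status of a missing command: $status"

# Check whether the tools named in the log are visible on PATH in this shell.
# Run this on a compute node (e.g. inside a small test job), since the
# environment there may differ from the login node.
for tool in lastz gensub2 parallel_executor.py; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: NOT found on PATH"
    fi
done
```

If a tool is reported as missing only inside a submitted job, the job environment is a likely place to look; if everything is found, the cause is probably elsewhere (memory, runtime, or the job script itself).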

kirilenkobm commented 2 years ago

Please try the following:
1) cd /gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap-macFas/
2) cat master_script.sh and copy all commands that start with export ... (there must be 4 of them)
3) execute all export commands
4) cd TEMP_run.lastz
5) head jobList
6) run a couple of randomly selected lastz commands, adding the -v flag
7) copy and paste the entire output here

For some reason the lastz jobs crash, so let's try to figure out why.
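The steps above can be sketched as a small script. This is only an illustration under a few assumptions: the project directory is the one from the log in this thread, `load_exports` is a hypothetical helper name, appending `-v` at the end of a job line is assumed to be the right position for the flag, and `shuf` comes from GNU coreutils. The script only prints the selected commands so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Project directory taken from the log in this thread; override with
# PROJECT_DIR for your own run.
project_dir="${PROJECT_DIR:-/gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap-macFas}"

# Steps 2-3: collect the lines of master_script.sh that start with "export "
# and execute them in the current shell. (Hypothetical helper, not part of
# make_lastz_chains.)
load_exports() {
    exports_file=$(mktemp)
    grep '^export ' "$1" > "$exports_file"
    . "$exports_file"
    rm -f "$exports_file"
}

if [ -f "$project_dir/master_script.sh" ]; then
    load_exports "$project_dir/master_script.sh"

    # Steps 4-5: move to the lastz run directory and look at the job list.
    cd "$project_dir/TEMP_run.lastz" || exit 1
    head jobList

    # Step 6: pick two random job lines and print them with -v appended
    # (assumed flag position). Copy, run, and post the full output.
    shuf -n 2 jobList | while IFS= read -r job; do
        echo "$job -v"
    done
else
    echo "master_script.sh not found under $project_dir"
fi
```

Sourcing the export lines, rather than pasting them one by one, keeps the environment identical to what the master script sets up, which matters if the crash depends on PATH or similar variables.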

Alex-Lin-kiz commented 2 years ago

Hello,

Thanks for your reply. I did as you said and got this result:

Temp directory is needed: False: None
Arg /gpfs/home/liunyw/howler_monkey/make_lastz_chain/macFas.2bit:1:0-175000000 doesnt end with .lst
Arg /gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap.2bit:1:0-50010000 doesnt end with .lst
Target: /gpfs/home/liunyw/howler_monkey/make_lastz_chain/macFas.2bit:1:0-175000000 | Query: /gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap.2bit:1:0-50010000
Target specs: ('/gpfs/home/liunyw/howler_monkey/make_lastz_chain/macFas.2bit', '1', 0, 175000000) | Query specs: ('/gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap.2bit', '1', 0, 50010000)
BLASTZ options: H=2000 Y=9400 L=3000 K=2400
Lastz command: lastz "/gpfs/home/liunyw/howler_monkey/make_lastz_chain/macFas.2bit/1[1,175000000][multiple]" "/gpfs/home/liunyw/howler_monkey/make_lastz_chain/homSap.2bit/1[1,50010000][multiple]" H=2000 Y=9400 L=3000 K=2400 --traceback=800.0M --format=axt+

kirilenkobm commented 1 year ago

If the issue persists in 2.0.0, please reopen it.