Open mictadlo opened 4 months ago
Likely not, but my understanding is that we use Nextflow to schedule the jobs. So if Nextflow can communicate with PBSpro, it may work.
According to the Nextflow documentation, PBSpro is supported. However, I could not get it running with the following command:
> ./make_chains.py target query test_data/test_reference.fa test_data/test_query.fa --pd test_out -f --chaining_memory 16 --cluster_executor pbspro --cluster_queue test
# Make Lastz Chains #
Version 2.0.8
Commit: 187e313afc10382fe44c96e47f27c4466d63e114
Branch: main
* found run_lastz.py at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/standalone_scripts/run_lastz.py
* found run_lastz_intermediate_layer.py at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/standalone_scripts/run_lastz_intermediate_layer.py
* found chain_gap_filler.py at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/standalone_scripts/chain_gap_filler.py
* found faToTwoBit at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/faToTwoBit
* found twoBitToFa at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/twoBitToFa
* found pslSortAcc at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/pslSortAcc
* found axtChain at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/axtChain
* found axtToPsl at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/axtToPsl
* found chainAntiRepeat at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainAntiRepeat
* found chainMergeSort at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainMergeSort
* found chainCleaner at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainCleaner
* found chainSort at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainSort
* found chainScore at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainScore
* found chainNet at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainNet
* found chainFilter at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/HL_kent_binaries/chainFilter
* found lastz at /work/waterhouse_team/miniconda2/envs/makeLastzChains/bin/lastz
* found nextflow at /home/lorencm/bin/nextflow
All necessary executables found.
Making chains for test_data/test_reference.fa and test_data/test_query.fa files, saving results to /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out
Pipeline started at 2024-05-10 08:46:21.499906
* Setting up genome sequences for target
genomeID: target
input sequence file: test_data/test_reference.fa
is 2bit: False
planned genome dir location: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/target.2bit
Initial fasta file test_data/test_reference.fa saved to /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/target.2bit
For target (target) sequence file: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/target.2bit; chrom sizes saved to: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/target.chrom.sizes
* Setting up genome sequences for query
genomeID: query
input sequence file: test_data/test_query.fa
is 2bit: False
planned genome dir location: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/query.2bit
Initial fasta file test_data/test_query.fa saved to /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/query.2bit
For query (query) sequence file: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/query.2bit; chrom sizes saved to: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/query.chrom.sizes
### Partition Step ###
# Partitioning for target
Saving partitions and creating 1 buckets for lastz output
In particular, 0 partitions for bigger chromosomes
And 1 buckets for smaller scaffolds
Saving target partitions to: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/target_partitions.txt
# Partitioning for query
Saving partitions and creating 1 buckets for lastz output
In particular, 0 partitions for bigger chromosomes
And 1 buckets for smaller scaffolds
Saving query partitions to: /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/query_partitions.txt
Num. target partitions: 0
Num. query partitions: 0
Num. lastz jobs: 0
### Lastz Alignment Step ###
LASTZ: making jobs
LASTZ: saved 1 jobs to /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/lastz_joblist.txt
Parallel manager: pushing job /home/lorencm/bin/nextflow /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/parallelization/execute_joblist.nf --joblist /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/lastz_joblist.txt -c /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/lastz_config.nf
N E X T F L O W ~ version 23.10.1
Launching `/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/parallelization/execute_joblist.nf` [gigantic_lorenz] DSL2 - revision: 0483b29723
[12/2b01f3] process > execute_jobs (1) [100%] 4 of 4, failed: 4, retries: 3
[1c/6ff42d] NOTE: Error submitting process 'execute_jobs (1)' for execution -- Execution is retried (1)
[46/bab438] NOTE: Error submitting process 'execute_jobs (1)' for execution -- Execution is retried (2)
[4b/caee10] NOTE: Error submitting process 'execute_jobs (1)' for execution -- Execution is retried (3)
ERROR ~ Error executing process > 'execute_jobs (1)'
Caused by:
Failed to submit process to grid scheduler for execution
Command executed:
qsub -N nf-execute_jobs .command.run
Command exit status:
159
Command output:
qsub: Unauthorized Request
Work dir:
/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/work/12/2b01f39c7ef951786a32513d22ccc9
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
-- Check '.nextflow.log' file for details
### Error! The nextflow process lastz crashed!
Please look at the logs in the /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run
An error occurred while executing lastz: Jobs for lastz at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/lastz_joblist.txt died
Traceback (most recent call last):
File "/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/modules/step_manager.py", line 70, in execute_steps
step_result = step_to_function[step](params, project_paths, step_executables)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/modules/pipeline_steps.py", line 52, in lastz_step
do_lastz(params, project_paths, executables)
File "/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/steps_implementations/lastz_step.py", line 99, in do_lastz
execute_nextflow_step(
File "/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/parallelization/nextflow_wrapper.py", line 157, in execute_nextflow_step
nextflow_manager.check_failed()
File "/mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/parallelization/nextflow_wrapper.py", line 109, in check_failed
raise NextflowProcessError(f"Jobs for {self.label} at {self.joblist_path} died")
modules.error_classes.NextflowProcessError: Jobs for lastz at /mnt/hpccs01/work/waterhouse_team/apps/make_lastz_chains/test_out/temp_lastz_run/lastz_joblist.txt died
> less test_out/.nextflow.log
test_out/.nextflow.log: No such file or directory
What did I do wrong?
Best wishes,
Michal
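A note on the missing log: Nextflow writes .nextflow.log into the directory it is launched from, and the work dir in the report above (test_out/temp_lastz_run/work/...) suggests that directory was test_out/temp_lastz_run rather than test_out. The sketch below assumes that layout. Both helper names are hypothetical and not part of make_lastz_chains; the second one only pulls the failed submit command and its exit status out of a report formatted like the one pasted above.

```python
from pathlib import Path

# Section labels as they appear in the Nextflow error report pasted above.
LABELS = {
    "Command executed:": "command",
    "Command exit status:": "exit_status",
    "Command output:": "output",
    "Work dir:": "work_dir",
}


def find_nextflow_logs(project_dir):
    """Return all .nextflow.log* files below project_dir, newest first.

    Hypothetical helper: searches recursively because the log may sit in a
    subdirectory such as temp_lastz_run rather than in project_dir itself.
    """
    root = Path(project_dir)
    logs = [p for p in root.rglob(".nextflow.log*") if p.is_file()]
    return sorted(logs, key=lambda p: p.stat().st_mtime, reverse=True)


def parse_submit_failure(report):
    """Map each known label to the first non-empty line that follows it.

    Hypothetical helper: returns a dict with any of the keys
    'command', 'exit_status', 'output' and 'work_dir' that were found.
    """
    lines = [line.strip() for line in report.splitlines()]
    fields = {}
    for i, line in enumerate(lines):
        key = LABELS.get(line)
        if key is None:
            continue
        # Skip blank lines between the label and its value.
        value = next((text for text in lines[i + 1:] if text), None)
        if value is not None:
            fields[key] = value
    return fields
```

Calling find_nextflow_logs("test_out") should list the log if it exists anywhere under the project directory. Running the extracted command (here, qsub -N nf-execute_jobs .command.run) by hand from the reported work dir may show whether the scheduler also rejects the submission outside Nextflow.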
Sorry, I don't know. I have 0 experience with PBSpro.
Hi! Sorry for the hitchhiking.
I also had trouble running make_lastz_chains on an HPC that runs PBS, likely due to some internal configuration of the HPC. After trial and error, I ended up running make_lastz_chains (the original v.1.0.0) by submitting the entire job to a single computing node with multiple (N) cores in the HPC, with --executor local --executor_queuesize $N (--executor local can be omitted since that's the default).
In my case, a node with N=32 was good enough for the alignment of mammalian-size genomes (or any genomes <16Gb), and there are steps where RAM appears to matter more than the number of threads.
If you have some computing nodes with a reasonable number of cores, perhaps this approach would work?
Cheers, Dong-Ha
Thanks for the feedback. Of course running it on a single node may work. These days CPUs have 128 or 192 cores. It will take a few days to finish though.
Maybe @kirilenkobm has insights into PBSpro or knows how to fix the problem?
Hi @ohdongha, How much memory did you need for your mammalian-size genomes? I want to run it on a 3GB allotetraploid plant.
Best wishes,
Michal
I typically ask for 360 GB and 32 cores, to be on the safe side. In most cases, max_vmem does not exceed 200 GB. I think the key, which @MichaelHiller also always emphasizes, is to soft-mask the repeats as much as possible.
Cheers, Dong-Ha
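The single-node approach described above could be wrapped in a PBS job script along these lines. This is only a sketch under stated assumptions: the helper name, the job name and the 72-hour walltime are made up for illustration, the 32 cores and 360 GB come from the posts above, and the --executor local --executor_queuesize flags are the v1.0.0 ones quoted above (the command in the first post uses --cluster_executor instead, so other versions may need different flag names).

```python
def pbs_single_node_script(pipeline_cmd, ncpus=32, mem_gb=360,
                           walltime="72:00:00", job_name="make_chains"):
    """Build the text of a PBS job script that runs the whole pipeline
    on one node with the local executor.

    Hypothetical helper. pipeline_cmd is the make_chains.py call without
    executor flags; walltime and job_name are illustrative assumptions.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        # One chunk on a single node with the requested cores and memory.
        f"#PBS -l select=1:ncpus={ncpus}:mem={mem_gb}gb",
        f"#PBS -l walltime={walltime}",
        # PBS starts jobs in the home directory; return to the submit dir.
        'cd "$PBS_O_WORKDIR"',
        # v1.0.0 flags as quoted in the thread; adjust for other versions.
        f"{pipeline_cmd} --executor local --executor_queuesize {ncpus}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and submitting that file with qsub would then request one node for the whole run, so Nextflow itself never has to talk to the scheduler. The pipeline_cmd argument is deliberately left to the caller, since the positional arguments and project directory depend on the data set.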
Hi, our HPC uses PBSpro. Does make_lastz_chains support PBSpro?
Best wishes,
Michal