evotools / nf-LO

A Nextflow workflow to generate liftover files for any pair of genomes
https://nf-lo.readthedocs.io/
MIT License

Too large of a bash script for slurm in chainMerge #2

Closed: mchaisso closed this issue 2 years ago

mchaisso commented 2 years ago

Hi, I'm running a job on two mammalian genomes.

nextflow run ../bosTauTomBal/nf-LO --source ../bosTau6.fa --target /project/mchaisso_100/projects/Whales/mPhosSin1.pri/assembly.orig.fasta --max-cpus 16 --max_memory 64.GB -with-trace -with-report -resume -profile slurm,conda

At the step:

[54/e0d255] process > ALIGNER:chainMerge (chainmerge) [100%] 3 of 3, failed: 3, retries: 2
[-        ] process > ALIGNER:chainNet     -
[-        ] process > ALIGNER:netSynt      -
[-        ] process > ALIGNER:chainsubset  -
[-        ] process > ALIGNER:chain2maf    -
[-        ] process > ALIGNER:name_maf_seq -
[-        ] process > ALIGNER:mafstats     -
[9d/5d42ff] NOTE: Error submitting process 'ALIGNER:chainMerge (chainmerge)' for execution -- Execution is retried (1)
[55/e09c1b] NOTE: Error submitting process 'ALIGNER:chainMerge (chainmerge)' for execution -- Execution is retried (2)
[54/e0d255] NOTE: Error submitting process 'ALIGNER:chainMerge (chainmerge)' for execution -- Error is ignored

slurm is not able to submit the command:

sbatch $PWD/work/54/e0d255bdfcaa440fbfe84e1e09cee3/.command.run
sbatch: error: Batch job submission failed: Pathname of a file, directory or other parameter too long
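One way to check the cluster's configured limit (assuming scontrol access; max_script_size only appears if the admins have set it explicitly):

# Inspect SLURM's scheduler parameters; max_script_size, if present,
# caps the size of the batch script that sbatch will accept.
scontrol show config | grep -i SchedulerParameters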

The problem is that the file is too large (5.3M):

-rw-rw---- 1 mchaisso mchaisso_100 5.3M Dec 29 19:09 work/54/e0d255bdfcaa440fbfe84e1e09cee3/.command.run

I think the default slurm max script size is 4.5M, and I don't have permissions to increase it. The problem is the zillions of rm and ln lines in nxf_stage(). Ideally Nextflow could create a separate script that runs the rm and ln commands, and submit that instead.
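As a rough check that the staging block is what dominates the wrapper (the pattern assumes Nextflow's default symlink staging):

# Count the stage-in rm/ln lines in the wrapper and compare against
# its total line count.
grep -Ec '^[[:space:]]*(rm -f|ln -s) ' work/54/e0d255bdfcaa440fbfe84e1e09cee3/.command.run
wc -l work/54/e0d255bdfcaa440fbfe84e1e09cee3/.command.run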

mchaisso commented 2 years ago

I'm finding this is a Nextflow issue, not an nf-LO one. I'm looking for a workaround to post here.

mchaisso commented 2 years ago

For now, I'm just rerunning the last steps locally. That's automated enough for me.
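For anyone who wants to keep this inside the workflow, something like the following should also work (untested sketch; withName matches the process by its simple name):

// local_merge.config -- untested sketch: run only the chainMerge step
// on the local node, leaving every other process on SLURM.
process {
    withName: chainMerge {
        executor = 'local'
    }
}

Rerunning the original command with -c local_merge.config added should then resume from the cache and execute only the merge step locally.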

RenzoTale88 commented 2 years ago

@mchaisso thank you for using nf-LO. You might want to try increasing the size of each chunk for the alignment when dealing with large, mammalian-sized genomes. For example, using --srcSize 30000000 --tgtSize 10000000 --tgtOvlp 100000 creates fragments of 30Mb for the source genome and aligns them to subsequences of 10Mb with 100Kb overlaps between them. This should reduce the number of fragments to a few thousand, therefore reducing the number of files to combine and remove afterwards. If I may, since you're aligning two different species, you might also want to follow the guidelines here to define the right preset or customized configuration for your analyses. Applied to your command above, the chunking flags would look something like the sketch below.
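nextflow run ../bosTauTomBal/nf-LO \
    --source ../bosTau6.fa \
    --target /project/mchaisso_100/projects/Whales/mPhosSin1.pri/assembly.orig.fasta \
    --srcSize 30000000 --tgtSize 10000000 --tgtOvlp 100000 \
    --max-cpus 16 --max_memory 64.GB \
    -with-trace -with-report -resume -profile slurm,conda

As a rough sense of scale: for a ~2.7Gb source genome, 30Mb chunks mean on the order of 90 source fragments, so the number of chain files that chainMerge has to stage drops accordingly.

Hope this helps, Andrea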