PacificBiosciences / FALCON_unzip

Making diploid assembly becomes common practice for genomic study
BSD 3-Clause Clear License

How to set NPROC, MB, and njobs according to my computation environment? #166

Closed: llengcn closed this issue 3 years ago

llengcn commented 3 years ago

Dear Falcon developers,

I installed FALCON via conda and was able to run the E. coli sample data successfully. When I tried my own data, I was unsure how to set NPROC, MB, and njobs. I run FALCON in local mode on a fat node with 128 CPU cores, 1.5 TB of memory, and 50 TB of disk. I am assembling a ~500 Mb genome with ~175 GB of PacBio Sequel II data. Currently my fc_run.cfg is as follows:

[job.defaults]
job_type = local
use_tmpdir = ./
submit = bash -C ${CMD} >| ${STDOUT_FILE} 2>| ${STDERR_FILE}

[General]
pwatcher_type = blocking
input_type = raw
input_fofn = subreads.fasta.fofn
genome_size=500000000
seed_coverage = 40
length_cutoff = -1
length_cutoff_pr = 12000
falcon_greedy = False
falcon_sense_greedy=False
pa_daligner_option = -e0.76 -l1200 -k18 -h480 -w8 -s100
ovlp_daligner_option = -k24 -h480 -e.95 -l1800 -s100
pa_HPCdaligner_option = -v -B128 -M24
ovlp_HPCdaligner_option = -v -B128 -M24
pa_HPCTANmask_option = -k18 -h480 -w8 -e.8 -s100
pa_HPCREPmask_option = -k18 -h480 -w8 -e.8 -s100
pa_REPmask_code=1,20;10,15;50,10
pa_DBsplit_option = -x500 -s200
ovlp_DBsplit_option = -s400
falcon_sense_option = --output_multi --min_idt 0.70 --min_cov 4 --max_n_read 200 --n_core 100
overlap_filtering_setting = --max_diff 120 --max_cov 120 --min_cov 4 --n_core 100

[job.step.da]
NPROC=100
MB=32000
njobs=100

[job.step.la]
NPROC=100
MB=64000
njobs=100

[job.step.cns]
NPROC=100
MB=64000
njobs=100

[job.step.pda]
NPROC=100
MB=64000
njobs=100

[job.step.pla]
NPROC=100
MB=32000
njobs=100

[job.step.asm]
NPROC=100
MB=384000
njobs=1

So, how should I set NPROC, MB, and njobs to maximize speed on this machine? Or where could I find guidance on this?

Thank you very much in advance!

liang

pb-cdunn commented 3 years ago

Maybe the pypeflow wiki helps?
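
As a rough sanity check on the configuration posted above, the sketch below multiplies out what each `[job.step.*]` section would request on a single node. It rests on an assumption that should be confirmed against the pypeflow wiki: that NPROC and MB are per-job requests and njobs is the number of jobs allowed to run at once. The helper names (`check_step`, `max_njobs`) are hypothetical and not part of FALCON or pypeflow, and 1.5 TB is approximated as 1,500,000 MB.

```python
# Hypothetical helper, not part of FALCON or pypeflow.
# ASSUMPTION: NPROC and MB are per-job requests, and njobs is the number of
# jobs that may run concurrently. Verify this against the pypeflow wiki.

NODE_CORES = 128        # from the question: 128 CPU cores
NODE_MB = 1_500_000     # from the question: 1.5 TB, approximated in MB

# Values copied from the [job.step.*] sections of the posted fc_run.cfg.
STEPS = {
    "job.step.da":  {"NPROC": 100, "MB": 32000,  "njobs": 100},
    "job.step.la":  {"NPROC": 100, "MB": 64000,  "njobs": 100},
    "job.step.cns": {"NPROC": 100, "MB": 64000,  "njobs": 100},
    "job.step.pda": {"NPROC": 100, "MB": 64000,  "njobs": 100},
    "job.step.pla": {"NPROC": 100, "MB": 32000,  "njobs": 100},
    "job.step.asm": {"NPROC": 100, "MB": 384000, "njobs": 1},
}


def check_step(cfg, cores=NODE_CORES, mem_mb=NODE_MB):
    """Return the total cores/memory a step would request and whether it fits."""
    cores_needed = cfg["NPROC"] * cfg["njobs"]
    mb_needed = cfg["MB"] * cfg["njobs"]
    return {
        "cores_needed": cores_needed,
        "mb_needed": mb_needed,
        "fits": cores_needed <= cores and mb_needed <= mem_mb,
    }


def max_njobs(nproc, mb, cores=NODE_CORES, mem_mb=NODE_MB):
    """Largest njobs for which nproc*njobs and mb*njobs both fit on the node."""
    return min(cores // nproc, mem_mb // mb)


if __name__ == "__main__":
    for name, cfg in STEPS.items():
        result = check_step(cfg)
        print(name, result)
```

Under that assumption, every step except `job.step.asm` would request 100 × 100 = 10,000 cores on a 128-core node, so a smaller NPROC paired with an njobs value such that NPROC × njobs stays at or below 128 (and MB × njobs stays within memory) may be a more reasonable starting point. For example, `max_njobs(16, 64000)` gives 8 under these numbers. This is only a budget check, not a tuned recommendation.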