Closed shannonekj closed 3 years ago
Sorry for such a quick update, but it occurred to me that my cfg file may have a typo on the line that sets the blocking pwatcher. I have updated the fc_run.cfg
file to the following:
fc_run.cfg
[General]
input_type = raw
input_fofn = subreads.fa.fofn
#use_tmpdir = scratch
# length cutoff used for seed reads used for initial mapping (default length was 5000, -1 means determine from genome size and seed coverage)
genome_size = 900000000
seed_coverage = 40
length_cutoff = -1
# length cutoff used for seed reads used for pre-assembly
length_cutoff_pr = 10000
falcon_greedy = False
falcon_sense_greedy=False
# concurrency setting
default_concurrent_jobs = 288
pa_concurrent_jobs = 288
cns_concurrent_jobs = 288
ovlp_concurrent_jobs = 288
# overlapping options for Daligner
pa_HPCdaligner_option = -v -B128 -e0.75 -M24 -l1200 -k14 -h256 -w8 -s100 -t16
ovlp_HPCdaligner_option = -v -B128 -M24 -k24 -h600 -e.95 -l1800 -s100
pa_daligner_option = -e0.75 -l1200 -k14 -h256 -w8 -s100
ovlp_daligner_option = -k24 -h600 -e.95 -l1800 -s100
pa_HPCTANmask_option = -k18 -h480 -w8 -e.8 -s100
pa_HPCREPmask_option = -k18 -h480 -w8 -e.8 -s100
#pa_REPmask_code=1,20;10,15;50,10
pa_DBsplit_option = -x500 -s400
ovlp_DBsplit_option = -s400
# error correction consensus option
falcon_sense_option = --output-multi --min-idt 0.70 --min-cov 4 --max-n-read 200 --n-core 24
# overlap filtering options
overlap_filtering_setting = --max-diff 120 --max-cov 120 --min-cov 2 --n-core 12
# slurm options (says sge but not for rEaLz)
#sge_option_da = -pe smp 5 -q bigmem
#sge_option_la = -pe smp 20 -q bigmem
#sge_option_pda = -pe smp 6 -q bigmem
#sge_option_pla = -pe smp 16 -q bigmem
#sge_option_fc = -pe smp 24 -q bigmem
#sge_option_cns = -pe smp 8 -q bigmem
[job.defaults]
job_type = slurm
pwatcher_type = blocking
JOB_QUEUE = default
MB = 40000
NPROC = 12
njobs = 100
submit = srun --wait=0 -p high \
-J ${JOB_NAME} \
-o ${JOB_STDOUT} \
-e ${JOB_STDERR} \
--mem-per-cpu=${MB}M \
--cpus-per-task=${NPROC} \
--time=4-0 \
--ntasks 1 \
--exclusive \
${JOB_SCRIPT}
"${CMD}"
[job.step.da]
NPROC=4
MB=32000
njobs=300
[job.step.la]
NPROC=8
MB=64000
njobs=200
[job.step.cns]
NPROC=8
MB=64000
njobs=200
[job.step.pda]
NPROC=8
MB=64000
njobs=200
[job.step.pla]
NPROC=4
MB=32000
njobs=300
[job.step.asm]
NPROC=24
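As an aside on the `submit` setting above: the `${...}` placeholders are filled in per job before the command is run. A minimal plain-shell sketch of that substitution (illustration only; the variable names and values mirror `[job.defaults]`, but the expansion mechanism here is ordinary shell `eval`, not pypeflow's actual code):

```shell
# Per-job values as they would come from [job.defaults] (JOB_NAME is a
# hypothetical task name for illustration)
export JOB_NAME=rep-combine NPROC=12 MB=40000

# A shortened version of the submit template, with placeholders unexpanded
template='srun -J ${JOB_NAME} --cpus-per-task=${NPROC} --mem-per-cpu=${MB}M'

# Expand the placeholders (eval re-parses the string so ${...} is substituted)
expanded=$(eval echo "$template")
echo "$expanded"   # srun -J rep-combine --cpus-per-task=12 --mem-per-cpu=40000M
```

With `pwatcher_type = blocking`, each expanded command is run synchronously by a watcher thread, which is why `srun --wait=0` (rather than `sbatch`) fits here.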
And now I get the following err file:
+ cd /group/millermrgrp2/shannon/projects/assembly_genome_Hypomesus-transpacificus/03-assemblies/sandbox
+ fc_run.py fc_run.cfg
falcon-kit 1.8.1 (pip thinks "falcon-kit 1.8.1")
pypeflow 2.3.0
[INFO]Setup logging from file "None".
[INFO]$ lfs setstripe -c 12 /group/millermrgrp2/shannon/projects/assembly_genome_Hypomesus-transpacificus/03-assemblies/sandbox >
[INFO]Apparently '/group/millermrgrp2/shannon/projects/assembly_genome_Hypomesus-transpacificus/03-assemblies/sandbox' is not in lustre filesystem, which is fine.
[INFO]fc_run started with configuration fc_run.cfg
[WARNING]You have several old-style options. These should be provided in the `[job.defaults]` or `[job.step.*]` sections, and possibly renamed. See https://github.com/PacificBiosciences/FALCON/wiki/Configuration
['cns_concurrent_jobs', 'default_concurrent_jobs']
[WARNING]Unexpected keys in input config: {'pa_concurrent_jobs', 'default_concurrent_jobs', 'cns_concurrent_jobs', 'ovlp_concurrent_jobs', 'falcon_greedy'}
[INFO]cfg=
{
"General": {
"LA4Falcon_preload": false,
"avoid_text_file_busy": true,
"bestn": 12,
"cns_concurrent_jobs": "288",
"dazcon": false,
"default_concurrent_jobs": "288",
"falcon_greedy": "False",
"falcon_sense_greedy": false,
"falcon_sense_option": "--output-multi --min-idt 0.70 --min-cov 4 --max-n-read 200 --n-core 24",
"falcon_sense_skip_contained": false,
"fc_ovlp_to_graph_option": " --min-len 10000",
"genome_size": "900000000",
"input_fofn": "subreads.fa.fofn",
"input_type": "raw",
"length_cutoff": "-1",
"length_cutoff_pr": "10000",
"overlap_filtering_setting": "--max-diff 120 --max-cov 120 --min-cov 2 --n-core 12",
"ovlp_DBdust_option": "",
"ovlp_DBsplit_option": "-s400",
"ovlp_HPCdaligner_option": "-v -B128 -M24 -k24 -h600 -e.95 -l1800 -s100",
"ovlp_concurrent_jobs": "288",
"ovlp_daligner_option": "-k24 -h600 -e.95 -l1800 -s100",
"pa_DBdust_option": "",
"pa_DBsplit_option": "-x500 -s400",
"pa_HPCREPmask_option": "-k18 -h480 -w8 -e.8 -s100",
"pa_HPCTANmask_option": "-k18 -h480 -w8 -e.8 -s100",
"pa_HPCdaligner_option": "-v -B128 -e0.75 -M24 -l1200 -k14 -h256 -w8 -s100 -t16",
"pa_REPmask_code": "0,300/0,300/0,300",
"pa_concurrent_jobs": "288",
"pa_daligner_option": "-e0.75 -l1200 -k14 -h256 -w8 -s100",
"pa_dazcon_option": "-j 4 -x -l 500",
"pa_fasta_filter_option": "streamed-internal-median",
"pa_subsample_coverage": 0,
"pa_subsample_random_seed": 12345,
"pa_subsample_strategy": "random",
"seed_coverage": "40",
"skip_checks": false,
"target": "assembly"
},
"job.defaults": {
"JOB_QUEUE": "default",
"MB": "40000",
"NPROC": "12",
"job_type": "slurm",
"njobs": "100",
"pwatcher_type": "blocking",
"submit": "srun --wait=0 -p high \\\n-J ${JOB_NAME} \\\n-o ${JOB_STDOUT} \\\n-e ${JOB_STDERR} \\\n--mem-per-cpu=${MB}M \\\n--cpus-per-task=${NPROC} \\\n--time=4-0 \\\n--ntasks 1 \\\n--exclusive \\\n${JOB_SCRIPT}\n\"${CMD}\"",
"use_tmpdir": false
},
"job.step.asm": {
"NPROC": "24"
},
"job.step.cns": {
"MB": "64000",
"NPROC": "8",
"njobs": "200"
},
"job.step.da": {
"MB": "32000",
"NPROC": "4",
"njobs": "300"
},
"job.step.dust": {},
"job.step.la": {
"MB": "64000",
"NPROC": "8",
"njobs": "200"
},
"job.step.pda": {
"MB": "64000",
"NPROC": "8",
"njobs": "200"
},
"job.step.pla": {
"MB": "32000",
"NPROC": "4",
"njobs": "300"
}
}
[INFO]In simple_pwatcher_bridge, pwatcher_impl=<module 'pwatcher.blocking' from '/home/sejoslin/miniconda3/envs/asm_pacbio/lib/python3.7/site-packages/pwatcher/blocking.py'>
[INFO]job_type='slurm', (default)job_defaults={'job_type': 'slurm', 'pwatcher_type': 'blocking', 'JOB_QUEUE': 'default', 'MB': '40000', 'NPROC': '12', 'njobs': '100', 'submit': 'srun --wait=0 -p high \\\n-J ${JOB_NAME} \\\n-o ${JOB_STDOUT} \\\n-e ${JOB_STDERR} \\\n--mem-per-cpu=${MB}M \\\n--cpus-per-task=${NPROC} \\\n--time=4-0 \\\n--ntasks 1 \\\n--exclusive \\\n${JOB_SCRIPT}\n"${CMD}"', 'use_tmpdir': False}, use_tmpdir=False, squash=False, job_name_style=0
[INFO]Setting max_jobs to 100; was None
[INFO]Num unsatisfied: 0, graph: 2
[INFO]Setting max_jobs to 300; was 100
[INFO]Num unsatisfied: 0, graph: 81
[INFO]Setting max_jobs to 100; was 300
[INFO]Parsed pa_REPmask_code (repa,repb,repc): [(0, 300), (0, 300), (0, 300)]
[INFO]Num unsatisfied: 0, graph: 83
[INFO]Setting max_jobs to 300; was 100
[INFO]Num unsatisfied: 0, graph: 86
[INFO]Setting max_jobs to 100; was 300
[INFO]Num unsatisfied: 0, graph: 88
[INFO]Setting max_jobs to 200; was 100
[INFO]Num unsatisfied: 0, graph: 399
[INFO]Setting max_jobs to 100; was 200
[INFO]Num unsatisfied: 0, graph: 401
[INFO]Setting max_jobs to 300; was 100
[INFO]Num unsatisfied: 0, graph: 712
[INFO]Setting max_jobs to 100; was 300
[INFO]Num unsatisfied: 2, graph: 714
[INFO]About to submit: Node(0-rawreads/repa/rep-combine)
[INFO]Popen: '/bin/bash -C /home/sejoslin/miniconda3/envs/asm_pacbio/lib/python3.7/site-packages/pwatcher/mains/job_start.sh >| /group/millermrgrp2/shannon/projects/assembly_genome_Hypomesus-transpacificus/03-assemblies/sandbox/0-rawreads/repa/rep-combine/run-Pf571d3d49d8d4a.bash.stdout 2>| /group/millermrgrp2/shannon/projects/assembly_genome_Hypomesus-transpacificus/03-assemblies/sandbox/0-rawreads/repa/rep-combine/run-Pf571d3d49d8d4a.bash.stderr'
[INFO](slept for another 0.0s -- another 1 loop iterations)
[INFO](slept for another 0.30000000000000004s -- another 2 loop iterations)
[INFO](slept for another 1.2000000000000002s -- another 3 loop iterations)
[ERROR]Task Node(0-rawreads/repa/rep-combine) failed with exit-code=1
[ERROR]Some tasks are recently_done but not satisfied: {Node(0-rawreads/repa/rep-combine)}
[ERROR]ready: set()
submitted: set()
[ERROR]Noop. We cannot kill blocked threads. Hopefully, everything will die on SIGTERM.
Traceback (most recent call last):
File "/home/sejoslin/miniconda3/envs/asm_pacbio/bin/fc_run.py", line 11, in <module>
load_entry_point('falcon-kit==1.8.1', 'console_scripts', 'fc_run.py')()
File "/home/sejoslin/miniconda3/envs/asm_pacbio/lib/python3.7/site-packages/falcon_kit/mains/run1.py", line 706, in main
main1(argv[0], args.config, args.logger)
File "/home/sejoslin/miniconda3/envs/asm_pacbio/lib/python3.7/site-packages/falcon_kit/mains/run1.py", line 73, in main1
input_fofn_fn=input_fofn_fn,
File "/home/sejoslin/miniconda3/envs/asm_pacbio/lib/python3.7/site-packages/falcon_kit/mains/run1.py", line 269, in run
letter, group_size, coverage_limit)
File "/home/sejoslin/miniconda3/envs/asm_pacbio/lib/python3.7/site-packages/falcon_kit/mains/run1.py", line 627, in add_rep_tasks
daligner_split_script=pype_tasks.TASK_DB_REP_DALIGNER_SPLIT_SCRIPT,
File "/home/sejoslin/miniconda3/envs/asm_pacbio/lib/python3.7/site-packages/falcon_kit/mains/run1.py", line 524, in add_daligner_and_merge_tasks
dist=Dist(NPROC=4, MB=4000, job_dict=daligner_job_config),
File "/home/sejoslin/miniconda3/envs/asm_pacbio/lib/python3.7/site-packages/falcon_kit/pype.py", line 106, in gen_parallel_tasks
wf.refreshTargets()
File "/home/sejoslin/miniconda3/envs/asm_pacbio/lib/python3.7/site-packages/pypeflow/simple_pwatcher_bridge.py", line 278, in refreshTargets
self._refreshTargets(updateFreq, exitOnFailure)
File "/home/sejoslin/miniconda3/envs/asm_pacbio/lib/python3.7/site-packages/pypeflow/simple_pwatcher_bridge.py", line 362, in _refreshTargets
raise Exception(msg)
Exception: Some tasks are recently_done but not satisfied: {Node(0-rawreads/repa/rep-combine)}
and here is my all.log
all.log
You'd have to look at stderr under 0-rawreads/repa/rep-combine
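Following that advice: pypeflow writes each task's stderr next to its run script, as the Popen line in the err file shows. A sketch of the inspection (the `mkdir`/`printf` lines below only fabricate the layout so the snippet is self-contained; on a real run directory, run only the `cat`):

```shell
# Fabricate the task directory layout for illustration (skip these two lines
# on a real run; the run-P* hash in the real filename varies per run)
mkdir -p 0-rawreads/repa/rep-combine
printf 'example error text\n' > 0-rawreads/repa/rep-combine/run-Pdemo.bash.stderr

# The actual inspection: read the failing task's stderr
cat 0-rawreads/repa/rep-combine/run-*.bash.stderr
```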
Hi all,
I am running FALCON on a cluster with slurm and have run into a few errors when trying to submit the fc_run.sh script and let the assembly go. I will briefly describe the first, as it may be relevant to why I am getting the errors now... but obviously I am uncertain. I believe it is the same error as #707.
The first exception that caused my job to fail was that the mypwatcher/wrappers/*.bash scripts were not executable, so slurm failed to enqueue the jobs. My solution was to manually make all .bash files executable with
chmod a+x mypwatcher/wrappers/*.bash
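A self-contained illustration of that permissions fix (the wrapper script here is a fabricated stand-in for the ones pwatcher writes):

```shell
# Create a stand-in wrapper script like those under mypwatcher/wrappers/
# (fabricated contents, for demonstration only)
mkdir -p mypwatcher/wrappers
printf '#!/bin/bash\necho wrapper ran\n' > mypwatcher/wrappers/demo.bash

chmod a+x mypwatcher/wrappers/*.bash   # add the execute bit for all users
./mypwatcher/wrappers/demo.bash        # now directly executable; prints "wrapper ran"
```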
I kept doing this every time a job failed, and things seemingly proceeded okay for the first 10-15 resubmissions (the output files all had text and things moved along!). Then I got tired of having to chmod every file, so I looked up a way to automatically make any file in the mypwatcher/wrappers/ directory executable and ran
chmod -R 775 mypwatcher/wrappers/
(I'm not sure whether this affects anything I'm currently seeing, because the wrappers directory has the same permissions [drwxrwsr-x] as the other directories in mypwatcher/, but it is a command I ran and may be relevant.) A few minutes after I ran that command I received a different error in my all.log and err files.
NOTE: I found a typo in my cfg file (I switched pwatch_type = blocking to pwatcher_type = blocking) and have attached the updated cfg, err, and all.log below, as my scripts are still failing. I left the originals up for reference purposes; hope that is okay.
all.log
fc_run.j20743460.err
fc_run.cfg
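On the typo mentioned in the NOTE: as far as I can tell, the key pypeflow actually reads is `pwatcher_type`, and a misspelled key such as `pwatch_type` is simply ignored, which leaves the default `fs_based` watcher in effect. A sketch of the corrected fragment (values taken from the config above):

```
[job.defaults]
pwatcher_type = blocking
job_type = slurm
```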
I see in the err file that, despite my having pwatch_type = blocking in my cfg file, it still calls on fs_based (see line 61 of the err file). Could this account for any of my issues? If so, how might I correct it?
Thank you so much for your assistance!
Shannon