[Closed] fengli-eGen closed this issue 2 years ago.
Hello,
datander was killed, which generally indicates a memory issue. If you can allocate more memory to the process, I would advise doing so.
uow.sh: line 1: 76103 Killed datander -v -P. raw_reads.233 raw_reads.234 raw_reads.235 raw_reads.236
Also 170X CLR is probably much more coverage than necessary. Downsampling to <100X might also help with your issue.
See the DAMASKER repo for more information on datander: https://github.com/thegenemyers/DAMASKER
Thank you for your quick reply! I'll increase my memory. Do you have any recommendations for downsampling to <100X? I'm wondering if I can downsample to keep the longer reads.
If your reads are in fasta format, I recommend seqtk to subsample: https://www.biostars.org/p/110107/#110248
If they are still in BAM format, you can use samtools with the -s parameter to subsample: http://www.htslib.org/doc/samtools-view.html
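If neither seqtk nor samtools is convenient, random FASTA subsampling can also be sketched in a few lines of Python. This is only an illustration (the function name, fraction, and seed are my own choices, not part of any tool mentioned above); seqtk and samtools are faster and better tested.

```python
import random

def subsample_fasta(in_path, out_path, fraction, seed=42):
    """Randomly keep roughly `fraction` of the records in a FASTA file.

    Each record (header line plus its sequence lines) is kept or dropped
    as a unit. Illustrative sketch only.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    keep = False
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            if line.startswith(">"):       # a header starts a new record
                keep = rng.random() < fraction
            if keep:
                fout.write(line)
```

Because each record is decided independently, the retained fraction is approximate, not exact, which is usually fine for coverage-level downsampling.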
You can't subsample and filter reads at the same time as far as I'm aware. You'll need to get a list of read lengths and calculate for yourself the percentage to retain in order to hit a certain coverage / read-length threshold.
Hi! Since I was trying hgap4 to run falcon, I found that hgap4 in smrtlink provides a downsampling option. As a test, I downsampled heavily so that only 1/6 of the raw data was used (please see the command below: --task-option downsample_factor=6). However, even with this low amount of reads, I still got the same multiprocessing-related error. So I think this might not be caused by the large amount of data I used initially?
Here is the cmd I ran:
$SMRT_ROOT/smrtcmds/bin/pbcromwell run pb_hgap4 \
-e /data2/raw_reads/subreadset.xml \
--task-option hgap4_aggressive_asm=False \
--task-option hgap4_genome_length=2000000000 \
--task-option hgap4_seed_coverage=30 \
--task-option hgap4_seed_length_cutoff=-1 \
--task-option hgap4_falcon_advanced="" \
--task-option consensus_algorithm="arrow" \
--task-option dataset_filters="" \
--task-option downsample_factor=6 \
--task-option mapping_min_concordance=70.0 \
--task-option mapping_min_length=50 \
--task-option mapping_biosample_name="" \
--task-option mapping_pbmm2_overrides="" \
--task-option consolidate_aligned_bam=False \
--config /data2/falcon_test/local.cromwell.conf \
--nproc 64
Error message in downsample1/cromwell_out/cromwell-executions/pb_hgap4/6ef4ad19-0ce6-4b40-b5f7-85759ca5e73a/call-falcon/falcon/a6452512-46a6-4c9d-9318-84d8ea229e58/call-task__0_rawreads__tan_apply/shard-16/execution/stderr
+ python3 -m falcon_kit.mains.cromwell_run_uows_tar --nproc=4 --nproc-per-uow=4 --uows-tar-fn=/data2/falcon_test/downsample1/cromwell_out/cromwell-executions/pb_hgap4/6ef4ad19-0ce6-4b40-b5f7-85759ca5e73a/call-falcon/falcon/a6452512-46a6-4c9d-9318-84d8ea229e58/call-task__0_rawreads__tan_apply/shard-16/inputs/-461551164/some-units-of-work.16.tar --tool=datander
falcon-kit 1.8.1 (pip thinks "falcon-kit 1.8.1+git.449fe5cb421c39a39795b4889d6ba47d459dfc9d")
pypeflow 2.3.0+git.03eda6364441793b24845ef5b8d1ef8c58ce1cf4
INFO:root:For multiprocessing, parallel njobs=1 (cpu_count=64, nproc=4, nproc_per_uow=4)
INFO:root:$('tar --strip-components=1 -xvf /data2/falcon_test/downsample1/cromwell_out/cromwell-executions/pb_hgap4/6ef4ad19-0ce6-4b40-b5f7-85759ca5e73a/call-falcon/falcon/a6452512-46a6-4c9d-9318-84d8ea229e58/call-task__0_rawreads__tan_apply/shard-16/inputs/-461551164/some-units-of-work.16.tar')
INFO:root:Started a worker in 55894 from parent 55879
INFO:root:running 1 units-of-work, 1 at a time...
[55894]starting run_uow('./uow-0016')
[55894]maxrss: 22012
INFO:root:CD: './uow-0016' <- '/data2/falcon_test/downsample1/cromwell_out/cromwell-executions/pb_hgap4/6ef4ad19-0ce6-4b40-b5f7-85759ca5e73a/call-falcon/falcon/a6452512-46a6-4c9d-9318-84d8ea229e58/call-task__0_rawreads__tan_apply/shard-16/execution'
INFO:root:$('bash -vex uow.sh')
datander -v -P. raw_reads.65 raw_reads.66 raw_reads.67 raw_reads.68
+ datander -v -P. raw_reads.65 raw_reads.66 raw_reads.67 raw_reads.68
uow.sh: line 1: 55900 Killed datander -v -P. raw_reads.65 raw_reads.66 raw_reads.67 raw_reads.68
WARNING:root:Call 'bash -vex uow.sh' returned 35072.
INFO:root:CD: './uow-0016' -> '/data2/falcon_test/downsample1/cromwell_out/cromwell-executions/pb_hgap4/6ef4ad19-0ce6-4b40-b5f7-85759ca5e73a/call-falcon/falcon/a6452512-46a6-4c9d-9318-84d8ea229e58/call-task__0_rawreads__tan_apply/shard-16/execution'
ERROR:root:failed multiprocessing
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/data2/src/smrtlink/install/smrtlink-release_10.2.0.133434/bundles/smrttools/install/smrttools-release_10.2.0.133434/private/thirdparty/python3/python3_3.9.6/site-packages/falcon_kit/util/io.py", line 68, in run_func
    ret = func(*args)
  File "/data2/src/smrtlink/install/smrtlink-release_10.2.0.133434/bundles/smrttools/install/smrttools-release_10.2.0.133434/private/thirdparty/python3/python3_3.9.6/site-packages/falcon_kit/mains/cromwell_run_uows_tar.py", line 18, in run_uow
    io.syscall(cmd)
  File "/data2/src/smrtlink/install/smrtlink-release_10.2.0.133434/bundles/smrttools/install/smrttools-release_10.2.0.133434/private/thirdparty/python3/python3_3.9.6/site-packages/pypeflow/io.py", line 27, in syscall
    raise Exception(msg)
Exception: Call 'bash -vex uow.sh' returned 35072.
I wonder if the error may be related to the configuration file, since it is local rather than SGE on AWS. Here is the configuration I used (file: /data2/falcon_test/local.cromwell.conf), attached here as local.cromwell.conf.txt.
Thank you so much for your advice!
Hi, I'm running falcon through smrtlink pbcromwell hgap4 on AWS, on an instance with 64 vCPUs and 128 GB RAM.
The genome I'm analyzing is ~2.8 Gb, and the PacBio CLR reads cover it ~170X. SMRTlink v10.2 was installed and used.
It got stuck at the Falcon daligner step when it was trying to run multiprocessing. It generated ~100 shard-XX folders at the call-task__0_rawreads__tan_apply step, for example: /data2/falcon_test/cromwell_out/cromwell-executions/pb_hgap4/c85def77-cd12-450e-9062-3238e03d1c6c/call-falcon/falcon/e47b1c37-3cd2-4f4e-abe6-71a4140ca1f4/call-task__0_rawreads__tan_apply/shard-XX. They have similar errors in stderr, and I attach an example of the stderr, stdout, and script under shard-21/execution.
When I was troubleshooting, I tried running this command manually for just this one shard-21, and it completed without the error. Command used for this troubleshooting:
I searched around but was not able to figure this out, so I wonder if you have any ideas. I checked whether multiprocessing was installed successfully in the smrtlink_v10.2 python3 (I tested it by running "from multiprocessing import Pool" in python3), and it was indeed importable. But I'm not sure why the program couldn't run it.
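For what it's worth, an import-only check can be made slightly stronger by actually dispatching work to a Pool. This is a generic smoke test of my own, not specific to the SMRT Link python3:

```python
from multiprocessing import Pool

def square(x):
    # Must be a module-level function so worker processes can pickle it.
    return x * x

if __name__ == "__main__":
    # Spins up two worker processes and maps over a small range.
    with Pool(processes=2) as pool:
        print(pool.map(square, range(5)))  # -> [0, 1, 4, 9, 16]
```

If this hangs or dies while the bare import works, the problem is in process creation or the environment, not in the module's installation.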
This is the command I kicked off for the assembly (since this step got stuck at falcon within the HGAP4 pipeline, I think it might be an error in falcon).
I wonder if you know what might go wrong and why multiprocessing is not working.
Thank you so much!