Closed: daaaaande closed this issue 11 months ago
@daaaaande I don't think the pysam version is an issue.
The error is happening here:
The error suggests an issue with the bam file - but I wonder if it's a memory issue?
One thing you can try is reducing `--batch_zmws` from 1000 to a smaller number in this line. We typically use 100.

Thanks for the swift reply! :) To address your suggestions:
Update:
I'm now trying even smaller chunks (100) and a ZMW batch size of 50.
Update II:
Thanks for the update @daaaaande - it might be helpful if you provide the command you are running.
@danielecook thanks for the hint; here is more information:
I1120 19:59:39.677627 47130953445824 quick_inference.py:693] Example summary: ran model=3797 (75.83%; 7.357s) skip=1210 (24.17%; 0.148s) total=5007.
I1120 19:59:39.783081 47130953445824 quick_inference.py:770] Processed a batch of 20 ZMWs in 8.323 seconds
I1120 19:59:39.787414 47130953445824 quick_inference.py:931] Processed 15700 ZMWs in 8150.246 seconds
Traceback (most recent call last):
File "/software/DeepConsensus/1.2.0/skylake/bin/deepconsensus", line 8, in <module>
sys.exit(run())
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/deepconsensus/cli.py", line 118, in run
app.run(main, flags_parser=parse_flags)
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/deepconsensus/cli.py", line 103, in main
app.run(quick_inference.main, argv=passed)
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 977, in main
outcome_counter = run()
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 912, in run
for zmw, subreads, dc_config, window_widths in input_file_generator:
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 480, in stream_bam
for input_data in proc_feeder():
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/deepconsensus/preprocess/pre_lib.py", line 1309, in proc_feeder
for read_set in subread_grouper:
File "/software/DeepConsensus/1.2.0/icelake/lib/python3.9/site-packages/deepconsensus/preprocess/pre_lib.py", line 73, in __next__
read = next(self.bam_reader)
File "pysam/libcalignmentfile.pyx", line 1876, in pysam.libcalignmentfile.AlignmentFile.__next__
OSError: error -3 while reading file
Error in rule run_deep:
jobid: 12
input: m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.bam, m54274Ue_230926_083040.subreads_4-of-120_ccs_lq.bam
output: m54274Ue_230926_083040.subreads_4-of-120_deepconsensus.fastq
conda-env: .snakemake/conda/94e59aa9aceab381a4abc6c7f412b7a1_
shell:
module load DeepConsensus &>m54274Ue_230926_083040.subreads_4-of-120_deepconsensus_logfile.log
/software/DeepConsensus/1.2.0/skylake/bin/deepconsensus run --subreads_to_ccs=m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.bam --ccs_bam=m54274Ue_230926_083040.subreads_4-of-120_ccs_lq.bam --checkpoint=checkpoint --output=m54274Ue_230926_083040.subreads_4-of-120_deepconsensus.fastq --batch_zmws 20 --batch_size=512 --cpus=16 &>m54274Ue_230926_083040.subreads_4-of-120_deepconsensus_logfile.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: 8691257.hpc-batch
Error executing rule run_deep on cluster (jobid: 12, external: 8691257.hpc-batch, jobscript: .snakemake/tmp.8cb82736/snakejob.run_deep.12.sh). For error details see the cluster log and the log files of the involved rule(s).
ls -lth 4-of-120
-rwxrwx--- 1 daric102 users 282K Nov 20 19:59 m54274Ue_230926_083040.subreads_4-of-120_deepconsensus_logfile.log
-rwxrwx--- 1 daric102 users 5.8G Nov 20 16:23 m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.bam
-rwxrwx--- 1 daric102 users 19M Nov 20 16:23 m54274Ue_230926_083040.subreads_4-of-120_actc_.log
-rwxrwx--- 1 daric102 users 284M Nov 20 16:19 m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.fasta
-rwxrwx--- 1 daric102 users 344M Nov 19 13:17 m54274Ue_230926_083040.subreads_4-of-120_ccs_lq.bam
-rwxrwx--- 1 daric102 users 1.5K Nov 19 13:17 m54274Ue_230926_083040.subreads_4-of-120_ccs_lq.ccs_report.txt
-rwxrwx--- 1 daric102 users 2.5M Nov 19 13:17 m54274Ue_230926_083040.subreads_4-of-120_ccs_lq.zmw_metrics.json.gz
-rwxrwx--- 1 daric102 users 219K Nov 19 13:17 m54274Ue_230926_083040.subreads_4-of-120_ccs_lq.bam.pbi
ls -lth 14-of-120
-rwxrwx--- 1 daric102 users 540M Nov 20 19:55 m54274Ue_230926_083040.subreads_14-of-120_deepconsensus.fastq
-rwxrwx--- 1 daric102 users 88 Nov 20 19:54 m54274Ue_230926_083040.subreads_14-of-120_deepconsensus.fastq.inference.json
-rwxrwx--- 1 daric102 users 111K Nov 20 19:54 m54274Ue_230926_083040.subreads_14-of-120_deepconsensus.fastq.runtime.csv
-rwxrwx--- 1 daric102 users 272K Nov 20 19:54 m54274Ue_230926_083040.subreads_14-of-120_deepconsensus_logfile.log
-rwxrwx--- 1 daric102 users 5.5G Nov 20 16:23 m54274Ue_230926_083040.subreads_14-of-120_ccs_lq_mapped_to_subreads.bam
-rwxrwx--- 1 daric102 users 19M Nov 20 16:22 m54274Ue_230926_083040.subreads_14-of-120_actc_.log
-rwxrwx--- 1 daric102 users 276M Nov 20 16:20 m54274Ue_230926_083040.subreads_14-of-120_ccs_lq_mapped_to_subreads.fasta
-rwxrwx--- 1 daric102 users 332M Nov 19 12:30 m54274Ue_230926_083040.subreads_14-of-120_ccs_lq.bam
-rwxrwx--- 1 daric102 users 1.5K Nov 19 12:30 m54274Ue_230926_083040.subreads_14-of-120_ccs_lq.ccs_report.txt
-rwxrwx--- 1 daric102 users 2.4M Nov 19 12:30 m54274Ue_230926_083040.subreads_14-of-120_ccs_lq.zmw_metrics.json.gz
-rwxrwx--- 1 daric102 users 210K Nov 19 12:30 m54274Ue_230926_083040.subreads_14-of-120_ccs_lq.bam.pbi
(test_2) [daric102@hilbert213 deepconsensussnake]$ conda activate .snakemake/conda/94e59aa9aceab381a4abc6c7f412b7a1
(.snakemake/conda/94e59aa9aceab381a4abc6c7f412b7a1_) [daric102@hilbert213 deepconsensus_snake]$ actc --version
actc 0.6.0
Using:
  actc     : 0.6.0 (commit v0.6.0)
  pbbam    : 2.4.99 (commit v2.4.0-23-g59248fe)
  pbcopper : 2.3.99 (commit v2.3.0-28-ga9b1ffa)
  boost    : 1.81
  htslib   : 1.17
  zlib     : 1.2.13
samtools stats m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.bam >m54274Ue_230926_083040.subreads_4-of-120_mapped_stats_.out
Failure while decoding file
module load DeepConsensus
CUDA Toolkit 11.4.3
|--------------------------------------------------------------~o
binaries : /software/CUDA/11.4.3/bin
libraries : /software/CUDA/11.4.3/lib64
includes : /software/CUDA/11.4.3/include
documentation : /software/CUDA/11.4.3/doc
Intel oneAPI-2022.2
|--------------------------------------------------------------~o
C compiler : icc
C++ compiler : icpc
FORTRAN77 compiler : ifort
FORTRAN90 compiler : ifort
FORTRAN95 compiler : ifort
runtime libraries : /software/intel/oneAPI-2022.2/compiler/lib/intel64
Intel Math Kernel Library (2022.2.1.0)
|--------------------------------------------------------------~o
libraries : /software/intel/oneAPI-2022.2/mkl/2022.1.0/lib/intel64
interfaces : /software/intel/oneAPI-2022.2/mkl/2022.1.0/interfaces
IntelMPI (2021.6.0)
|--------------------------------------------------------------~o
C compiler : mpiicc
C++ compiler : mpiicpc
FORTRAN77 compiler : mpiifort
FORTRAN90 compiler : mpiifort
FORTRAN95 compiler : mpiifort
runtime libraries : /software/intel/oneAPI-2022.2/mpi/2021.6.0/intel64/lib
Intel Python Version 3.9.10
|--------------------------------------------------------------~o
binaries : /software/intel/oneAPI-2022.2/intelpython/latest/bin
libs : /software/intel/oneAPI-2022.2/intelpython/latest/lib
includes : /software/intel/oneAPI-2022.2/intelpython/latest/include
DeepConsensus 1.2.0
|--------------------------------------------------------------~o
binaries : /software/DeepConsensus/1.2.0/skylake/bin
"nodes" : 1,
"time" : "8:59:59",
"mem" : "72GB",
"ncpus" : "16",
"ngpus" : "0",
"arch" : "skylake",
...
| 20231120 15:23:01.419 | DEBUG | CLR parser : m54274Ue_230926_083040/6751778/241410_263248
| 20231120 15:23:01.419 | DEBUG | CLR parser : m54274Ue_230926_083040/6751778/263277_269659
| 20231120 15:23:01.419 | FATAL | CCS reader : CCS ZMW 6751779 has multiple records. Ignoring ZMW!
...
Double-Strand Reads Single-Strand Reads
Inputs : 41714 (94.03%) 5292 (5.965%)
Passed : 15604 (37.41%) 4553 (86.04%)
Failed : 26110 (62.59%) 739 (13.96%)
Tandem repeats : 335 (1.283%) 128 (17.32%)
Exclusive failed counts
Below SNR threshold : 715 (2.738%) 0 (0.000%)
Median length filter : 0 (0.000%) 0 (0.000%)
Shortcut filters : 0 (0.000%) 0 (0.000%)
Lacking full passes : 24841 (95.14%) 638 (86.33%)
Coverage drops : 69 (0.264%) 14 (1.894%)
Insufficient draft cov : 101 (0.387%) 16 (2.165%)
Draft too different : 0 (0.000%) 0 (0.000%)
Draft generation error : 374 (1.432%) 68 (9.202%)
Draft above --max-length : 1 (0.004%) 0 (0.000%)
Draft below --min-length : 0 (0.000%) 0 (0.000%)
Reads failed polishing : 0 (0.000%) 0 (0.000%)
Empty coverage windows : 7 (0.027%) 1 (0.135%)
CCS did not converge : 1 (0.004%) 2 (0.271%)
CCS below minimum RQ : 1 (0.004%) 0 (0.000%)
Unknown error : 0 (0.000%) 0 (0.000%)
Additional passing metrics
Missing adapters : 408 (2.615%) 169 (3.712%)
This file was produced by samtools stats (1.18+htslib-1.18) and can be plotted using plot-bamstats
This file contains statistics for all reads.
The command line was: stats m54274Ue_230926_083040.subreads_4-of-120_ccs_lq.bam
CHK, Checksum [2]Read Names [3]Sequences [4]Qualities
CHK, CRC32 of reads which passed filtering followed by addition (32bit overflow)
CHK 31679cf3 96079b4e df23c499
Summary Numbers. Use `grep ^SN | cut -f 2-` to extract this part.
SN raw total sequences: 20157 # excluding supplementary and secondary reads
SN filtered sequences: 0
SN sequences: 20157
SN is sorted: 1
SN 1st fragments: 20157
SN last fragments: 0
SN reads mapped: 0
SN reads mapped and paired: 0 # paired-end technology bit set + both mates mapped
SN reads unmapped: 20157
SN reads properly paired: 0 # proper-pair bit set
SN reads paired: 0 # paired-end technology bit set
SN reads duplicated: 0 # PCR or optical duplicate bit set
SN reads MQ0: 0 # mapped and MQ=0
SN reads QC failed: 0
SN non-primary alignments: 0
SN supplementary alignments: 0
SN total length: 385773399 # ignores clipping
SN total first fragment length: 385773399 # ignores clipping
SN total last fragment length: 0 # ignores clipping
SN bases mapped: 0 # ignores clipping
SN bases mapped (cigar): 0 # more accurate
SN bases trimmed: 0
SN bases duplicated: 0
SN mismatches: 0 # from NM fields
SN error rate: 0.000000e+00 # mismatches / bases mapped (cigar)
SN average length: 19138
SN average first fragment length: 19138
SN average last fragment length: 0
SN maximum length: 46259
SN maximum first fragment length: 46259
SN maximum last fragment length: 0
samtools quickcheck m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.bam -vvvvv
verbosity set to 5
checking m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.bam
opened m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.bam
m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.bam is sequence data
m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.bam has 15706 targets in header.
m54274Ue_230926_083040.subreads_4-of-120_ccs_lq_mapped_to_subreads.bam has good EOF block.
If you want, I can share loads of logs; just let me know what you think would be most useful! Also, the fact that samtools stats cannot parse the ccs_lq_mapped_to_subreads BAM gives me the idea that this .bam file is corrupt / not according to spec / confusing DeepConsensus, but quickcheck does not complain. Any idea there?
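Regarding quickcheck not complaining: as far as I understand, `samtools quickcheck` only inspects the start of the file and the 28-byte BGZF EOF marker, so a corrupt compressed block in the middle of the BAM can still pass it. Here is a minimal, stdlib-only sketch (the `scan_bgzf` helper is illustrative, not part of pysam or DeepConsensus) that walks every BGZF block in a BAM and reports the byte offset of the first block that fails to decompress:

```python
import struct
import zlib

# A BAM file is a series of BGZF blocks: gzip members that carry a "BC"
# extra subfield whose BSIZE value is (total block size - 1).
def scan_bgzf(path):
    """Return the offset of the first bad BGZF block, or None if the file is clean."""
    with open(path, "rb") as fh:
        offset = 0
        while True:
            header = fh.read(12)
            if not header:
                return None  # clean end of file
            if len(header) < 12 or header[:4] != b"\x1f\x8b\x08\x04":
                return offset  # not a valid BGZF block header
            xlen = struct.unpack("<H", header[10:12])[0]
            extra = fh.read(xlen)
            bsize = None
            i = 0
            while i + 4 <= len(extra):  # walk the gzip extra subfields
                si1, si2 = extra[i], extra[i + 1]
                slen = struct.unpack("<H", extra[i + 2:i + 4])[0]
                if si1 == 0x42 and si2 == 0x43 and slen == 2:  # "BC" subfield
                    bsize = struct.unpack("<H", extra[i + 4:i + 6])[0]
                i += 4 + slen
            if bsize is None:
                return offset  # no BSIZE field: not BGZF
            cdata_len = bsize + 1 - 12 - xlen - 8  # minus header, extra, CRC32+ISIZE
            if cdata_len < 0:
                return offset
            cdata = fh.read(cdata_len)
            trailer = fh.read(8)
            if len(cdata) < cdata_len or len(trailer) < 8:
                return offset  # truncated block
            try:
                data = zlib.decompress(cdata, wbits=-15)  # raw deflate
            except zlib.error:
                return offset  # this is the spot htslib would choke on
            if zlib.crc32(data) != struct.unpack("<I", trailer[:4])[0]:
                return offset  # decompressed fine but checksum mismatch
            offset += bsize + 1
```

If this returns an offset instead of None on the failing chunk's BAM, the file really is corrupt mid-stream even though quickcheck passes.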
Mini-update: it seems I fixed the issue by downgrading actc to 0.2.0. I still need to test more, but as soon as another SMRT Cell also works with this, I will close this issue.
The actc downgrade seems to fix the issue. A second SMRT Cell also ran without hiccups.
Dear DeepConsensus team, I'm using DeepConsensus 1.2 with chunking of the input through a Snakemake pipeline (https://github.com/WestGermanGenomeCenter/deep_snake) on a local HPC cluster.
With chunking into 20 parts, I get the same error on one or more of the chunks while running deepconsensus:
...
I1107 14:08:00.658040 48006032286144 quick_inference.py:931] Processed 114000 ZMWs in 95962.634 seconds
Traceback (most recent call last):
File "/software/DeepConsensus/1.2.0/skylake/bin/deepconsensus", line 8, in <module>
sys.exit(run())
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/deepconsensus/cli.py", line 118, in run
app.run(main, flags_parser=parse_flags)
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/deepconsensus/cli.py", line 103, in main
app.run(quick_inference.main, argv=passed)
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 977, in main
outcome_counter = run()
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 912, in run
for zmw, subreads, dc_config, window_widths in input_file_generator:
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 480, in stream_bam
for input_data in proc_feeder():
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/deepconsensus/preprocess/pre_lib.py", line 1309, in proc_feeder
for read_set in subread_grouper:
File "/software/DeepConsensus/1.2.0/skylake/lib/python3.9/site-packages/deepconsensus/preprocess/pre_lib.py", line 73, in __next__
read = next(self.bam_reader)
File "pysam/libcalignmentfile.pyx", line 1876, in pysam.libcalignmentfile.AlignmentFile.__next__
OSError: error -3 while reading file
The .bam alignment file seems to be OK:
samtools quickcheck m54274Ue_230924_230713.subreads_15-of-20_ccs_lq_mapped_to_subreads.bam
ls -lth m54274Ue_230924_230713.subreads_15-of-20_ccs_lq_mapped_to_subreads.bam
-rwxrwx--- 1 user users 40G Nov 2 19:44 m54274Ue_230924_230713.subreads_15-of-20_ccs_lq_mapped_to_subreads.bam
I did not find good information on what exactly error -3 is, but I'm sure the compute node is fine, since this error occurs for the same chunk on multiple nodes. Disk quota is also not the problem. The same error also occurs on a local non-HPC server with (uname): 6.2.0-35-generic #35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Oct 6 10:23:26 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
Is this related to the pysam version? Here it's pysam 0.22.0 (linux-64/pysam-0.22.0-py310h41dec4a_0.tar.bz2) on the local machine and 0.19 on the HPC; both throw the same error, even on the same chunk.
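On what "error -3" likely is: pysam reads the BAM through htslib, whose BGZF layer inflates each compressed block with zlib, and -3 is zlib's `Z_DATA_ERROR` (the compressed stream itself is invalid). So the message most likely means a BGZF block in the file failed to decompress. A tiny illustrative snippet showing the same error code coming straight out of the standard-library zlib:

```python
import zlib

# Z_DATA_ERROR is -3 in zlib: the compressed data is invalid or corrupt.
# Feeding zlib bytes that are not a valid stream reproduces the code.
try:
    zlib.decompress(b"\x00" * 16)  # not a valid zlib stream
except zlib.error as exc:
    print(exc)  # zlib.error: Error -3 while decompressing data: ...
```

That would also be consistent with `samtools stats` failing with "Failure while decoding file" on the same BAM while `quickcheck` passes.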