Open lmanchon opened 6 years ago
Hi @lmanchon. Do you really have lowercase @sq? They should be @SQ.
Hi,
the lowercase sq is most likely an artefact introduced by GitHub. The error reported here is only triggered if the length values in the BAM text header (as seen in SAM) differ from those in the (redundant) binary version stored in the BAM file. How was the file created? It's broken on a very basic level. You may be able to convert it to SAM using a program that does not check for such anomalies. I'd try samtools view.
Best, German
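To make the consistency check described above concrete, here is a minimal Python sketch that reads the reference list from both places in a BAM header and reports where they disagree. It follows the header layout in the SAM/BAM specification (magic, text length, text, reference count, then name and length per reference). The function names `read_bam_references` and `header_mismatches` are made up for this illustration, and this is not the libmaus2 implementation, which may check things differently.

```python
import io
import struct
from itertools import zip_longest


def read_bam_references(stream):
    """Read (text_refs, binary_refs) from an already decompressed BAM stream.

    Each list holds (name, length) tuples in header order.
    """
    if stream.read(4) != b"BAM\x01":
        raise ValueError("stream does not start with the BAM magic bytes")

    # Plain-text SAM header, stored with an explicit length (may be NUL padded).
    (l_text,) = struct.unpack("<I", stream.read(4))
    text = stream.read(l_text).rstrip(b"\x00").decode("utf-8", "replace")

    # Redundant binary reference list: name (NUL terminated) and length.
    (n_ref,) = struct.unpack("<I", stream.read(4))
    binary_refs = []
    for _ in range(n_ref):
        (l_name,) = struct.unpack("<I", stream.read(4))
        name = stream.read(l_name).rstrip(b"\x00").decode("utf-8", "replace")
        (l_ref,) = struct.unpack("<I", stream.read(4))
        binary_refs.append((name, l_ref))

    # The same information as it appears in the @SQ lines of the text header.
    text_refs = []
    for line in text.split("\n"):
        if not line.startswith("@SQ"):
            continue
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        length = int(fields["LN"]) if "LN" in fields else None
        text_refs.append((fields.get("SN"), length))

    return text_refs, binary_refs


def header_mismatches(text_refs, binary_refs):
    """Return (index, text_entry, binary_entry) for every position that differs."""
    mismatches = []
    for index, (t, b) in enumerate(zip_longest(text_refs, binary_refs)):
        if t != b:
            mismatches.append((index, t, b))
    return mismatches
```

BAM files are BGZF compressed, and BGZF blocks are valid gzip members, so opening the file with `gzip.open(path, "rb")` and passing the resulting object to `read_bam_references` should work for the header. I have not tried it on the file from this issue.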
--Hi,
same error on the SAM file:
Program: samtools (Tools for alignments in the SAM format) Version: 1.3.1 (using htslib 1.3.1)
samtools view -h -o D6_DMSO_sorted.sam D6_DMSO_sorted.bam
BIOBAMBAM/bin/bamtofastq filename=D6_DMSO_sorted.sam inputformat=sam gz=1 F=D6_DMSO_R1.fastq.gz F2=D6_DMSO_R2.fastq.gz O=orphan_D6_DMSO_1.fastq.gz O2=orphan_D6_DMSO_2.fastq.gz
BAM header is not consistent (binary and text do not match) for @SQ SN:8 LN:11660
LIBMAUS2/lib/libmaus2.so.2(libmaus2::util::StackTrace::StackTrace()+0x54)[0x7f3fa478a414]
BIOBAMBAM/bin/bamtofastq(libmaus2::exception::LibMausException::LibMausException[0x437f00]
Hi,
could you post the complete SAM header?
Thanks
Hi,
you have sequence 8 appearing twice in your SAM header. biobambam2 should provide a more sensible error message for such cases, but it's definitely a broken input file.
> diff -c <(awk < sam_header.txt '/^@SQ/{print $2}' | sort) <(awk < sam_header.txt '/^@SQ/{print $2}' | sort -u)
*** /dev/fd/63 2017-12-18 11:37:07.377791198 +0100
--- /dev/fd/62 2017-12-18 11:37:07.377791198 +0100
***************
*** 3,9 ****
SN:15
SN:20
SN:8
- SN:8
SN:A01Shi21
SN:A09Calca
SN:A0DBoliv
--- 3,8 ----
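The same duplicate check as the awk/diff pipeline above can be sketched in Python, for anyone who prefers a list of the offending names over a diff. The function name `duplicate_sequence_names` is made up for this example, and it assumes tab-separated @SQ fields as in the SAM specification.

```python
from collections import Counter


def duplicate_sequence_names(header_text):
    """Return the sorted SN values that occur on more than one @SQ line."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are TAG:VALUE pairs; pick out the sequence name.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)
```

Called with the contents of a header file such as the sam_header.txt used above, it should return an empty list for a well-formed header and the repeated names otherwise.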
Yes, but why are other tools such as bedtools bam2fastq, bam2FastQ (BamUtil) or bam2fastq (https://github.com/jts/bam2fastq) able to process this broken file?
@lmanchon Please report this duplicated line to the author of the tool which produced the broken BAM file.
--Hi,
strange error using bamtofastq:
BIOBAMBAM/bin/bamtofastq filename=D1_sorted.bam inputformat=bam gz=1 F=D1_R1.fastq.gz F2=D1_R2.fastq.gz O=orphan_D1_1.fastq.gz O2=orphan_D1_2.fastq.gz
BAM header is not consistent (binary and text do not match) for @SQ SN:8 LN:11660
LIBMAUS2/lib/libmaus2.so.2(libmaus2::util::StackTrace::StackTrace()+0x54)[0x7f71ec496414]
BIOBAMBAM/bin/bamtofastq(libmaus2::exception::LibMausException::LibMausException()+0x20)[0x437f00]
/BIOBAMBAM/bin/bamtofastq(libmaus2::bambam::BamHeader::initSetup()+0xbbb)[0x49a55b]
BIOBAMBAM/bin/bamtofastq(void libmaus2::bambam::BamHeader::init(libmaus2::lz::BgzfInflateStream&)+0x299)[0x49ec19]
BIOBAMBAM/bin/bamtofastq(libmaus2::bambam::BamDecoderWrapper::BamDecoderWrapper(std::unique_ptr<libmaus2::aio::InputStream, std::default_delete >&, bool)+0x341)[0x49f901]
BIOBAMBAM/bin/bamtofastq(libmaus2::bambam::BamAlignmentDecoderFactory::construct(std::istream&, std::string const&, std::string const&, unsigned long, std::string const&, bool, std::ostream, std::string const&)+0xd2d)[0x4a07bd]
BIOBAMBAM/bin/bamtofastq(libmaus2::bambam::BamMultiAlignmentDecoderFactory::construct(libmaus2::util::ArgInfo const&, bool, std::ostream, std::istream&, bool, bool)+0x381)[0x4a1fa1]
BIOBAMBAM/bin/bamtofastq(bamtofastqCollating(libmaus2::util::ArgInfo const&)+0x5de)[0x43344e]
BIOBAMBAM/bin/bamtofastq(bamtofastq(libmaus2::util::ArgInfo const&)+0x3c1)[0x4344c1]
BIOBAMBAM/bin/bamtofastq(main+0x1a02)[0x42cbc2]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f71ea906b45]
BIOBAMBAM/bin/bamtofastq()[0x42e79f]
Do I need to reformat my BAM input file? If so, how?
thank you --