Closed JPascualAnaya closed 7 years ago
Hi there,
I repeated all using the sam file, and changing all whitespaces in the headers of the assembly fasta to underscores, and I still get the same issue (even though I got all the reads mapped...)
$ head baxh5_vs_TtraCanu1.default.blasr.sam.cov
## blobtools v0.9.19.3
## Total Reads = 17401374
## Mapped Reads = 17401374
## Unmapped Reads = 0
## Source(s) : /home/champi/Documents/TerebrataliaGenome/assemblies/canu/blobologyFiltering/baxh5_vs_TtraCanu1.default.blasr.sam
# contig_id read_cov base_cov
tig00000000_len=13620_reads=12_covStat=17.50_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00000001_len=19185_reads=11_covStat=33.92_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00000002_len=21833_reads=37_covStat=35.63_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00000003_len=34409_reads=65_covStat=62.81_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
$ tail baxh5_vs_TtraCanu1.default.blasr.sam.cov
tig00011780_len=15717_reads=44_covStat=21.52_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00011783_len=7536_reads=6_covStat=14.11_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00011786_len=2405_reads=14_covStat=-3.87_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00011789_len=4318_reads=5_covStat=7.03_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00011792_len=9055_reads=7_covStat=20.15_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00011802_len=20122_reads=41_covStat=30.93_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00011805_len=4087_reads=8_covStat=4.58_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00011808_len=37484_reads=88_covStat=51.07_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00011811_len=23810_reads=43_covStat=40.01_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
tig00011812_len=17599_reads=21_covStat=43.39_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no 0 0.0
Hi JPascualAnaya ,
Blobtools needs the IDs of sequences (headers in FASTA, qseqids in BLAST report, contigIDs in BAM) to be the same. That's why it complains.
cheers,
dom
Hi Dom,
As fas as I can tell, after changing the whitespaces for underscores in the assembly and rerunning all, the names are exactly the same.
$ grep ">" Ttra_canu1.contigs.1line_nospaceID.fasta | head
>tig00000000_len=13620_reads=12_covStat=17.50_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no
>tig00000001_len=19185_reads=11_covStat=33.92_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no
>tig00000002_len=21833_reads=37_covStat=35.63_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no
>tig00000003_len=34409_reads=65_covStat=62.81_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no
>tig00000004_len=22174_reads=41_covStat=25.69_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no
>tig00000006_len=25564_reads=29_covStat=52.66_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no
>tig00000008_len=25598_reads=42_covStat=40.10_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no
>tig00000010_len=63521_reads=127_covStat=122.90_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no
>tig00000011_len=24718_reads=25_covStat=45.70_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no
>tig00000012_len=31414_reads=39_covStat=70.95_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no
Hi JPascualAnaya,
are you still having that issue?
cheers,
dom
Hi Dom,
I ended up to extracting the reads and covStat info myself from the headers, but never got map2cov to work.
Juan
Hi there,
I am using blobtools with a PacBio-only assembly and I'm getting problems to get the coverage file from an alignment done with blasr (PB reads against assembly) in bam. map2cov makes a WARNING for each of the contigs in the assembly, complaining that they do not belong to such assembly, and finally no coverage info is written to the cov file. Any idea what the problem might be and how to solve this?
Software: bloobtools v0.9.19.3 blasr v5.3.8c16f52 samtools 1.3.1
Running on a Centos 6.8
First, I assembled a preliminary assembly with Canu, using PB subreads in fastq, then aligned the PB reads to this preliminary assembly using blasr, output in bam format. Commands:
This is the STDOUT I get:
And this is the output file: