unmapped vs mapped in covplot

DRL / blobtools

Modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets

GNU General Public License v3.0


unmapped vs mapped in covplot #117

Open shoyosato opened 3 years ago

shoyosato commented 3 years ago

I produced a covplot for my data and it shows 51.57% unmapped reads. This number does not match the mapped-read counts in my .cov or .json file. Are these supposed to be congruent? I was expecting the no-hits blobs to be larger in the blobplot, so I could assess where my organism's coverage and GC content fall. I am wondering whether this 51.57% should be part of the no-hits blobs. Thank you!

Here is my covplot:

[Image: epibro_blob blobDB json bestsum phylum p8 span 100 blobplot read_cov bam0]

Here are the first four lines from my .cov file:

## 1.1.1
## Total Reads = 22276300
## Mapped Reads = 22265834
## Unmapped Reads = 10466
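
As a quick sanity check, the percentages implied by these header lines can be worked out directly. This is only a minimal sketch: it assumes the header keeps the `## key = value` layout shown above, and the helper name `parse_cov_header` is made up for illustration (it is not part of blobtools).

```python
import re

# Header lines copied from the .cov file quoted above.
cov_header = """\
## 1.1.1
## Total Reads = 22276300
## Mapped Reads = 22265834
## Unmapped Reads = 10466
"""


def parse_cov_header(text):
    """Return a dict such as {'Total Reads': 22276300, ...}.

    Hypothetical helper: only lines of the form '## key = integer' are kept,
    so the version line ('## 1.1.1') is ignored.
    """
    counts = {}
    for line in text.splitlines():
        match = re.match(r"##\s*(.+?)\s*=\s*(\d+)\s*$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_cov_header(cov_header)
total = counts["Total Reads"]
mapped_pct = 100 * counts["Mapped Reads"] / total
unmapped_pct = 100 * counts["Unmapped Reads"] / total
print(f"mapped: {mapped_pct:.2f}%  unmapped: {unmapped_pct:.2f}%")
```

By these header values, unmapped reads are well under 1% of the total, which is why the 51.57% shown on the covplot looks inconsistent.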

and some info from the end of the .json:

"reads_total": 22276300, "reads_mapped": 22265834, "reads_unmapped": 0,

dtusso2020 commented 2 years ago

Hello shoyosato

Did you solve your problem? The same happens to me.

shoyosato commented 2 years ago

Hey Diana!

Ya, it seemed to be an issue with long reads for me. I originally mapped the long reads back to the long-read assembly to calculate coverage. I reran with short-read data and the mapped bar bumped up to 98.6%. Sorry that the fix didn't really solve the root of the problem.

mrmrwinter commented 2 years ago

I'm also having this issue, also when using long reads. Is there a way to change the mapping module to minimap2, vulcan, or something similar?

In the meantime I will try fragmenting the long reads and seeing if they map better

DRL commented 2 years ago

Blobtools only parses the BAM file ... see here for an approximate description of how it works (it is actually done via pysam now, but the filters should be the same as for the samtools output).

For those people seeing weird things, check how your mapper made the alignments. Most likely there are multiple alignments or weird SAM flags for a given long read, which then inflate the numbers.
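
One way to run that check is to tally alignment records by SAM flag and compare the number of records with the number of distinct read names. The sketch below is an illustration only, not how blobtools itself counts reads: it reads plain SAM text (for example the output of `samtools view -h`), assumes single-end long reads so that one read name means one read, and the function name `tally_alignments` is hypothetical. The flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) are taken from the SAM specification.

```python
from collections import Counter

# SAM flag bits as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def tally_alignments(sam_lines):
    """Count SAM records per category, plus distinct and mapped read names.

    Hypothetical helper; assumes single-end reads (one name == one read).
    Returns (records_by_category, n_distinct_reads, n_reads_with_alignment).
    """
    records = Counter()
    all_reads, mapped_reads = set(), set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # skip blank/header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        all_reads.add(name)
        if flag & UNMAPPED:
            records["unmapped"] += 1
            continue
        mapped_reads.add(name)
        if flag & SECONDARY:
            records["secondary"] += 1
        elif flag & SUPPLEMENTARY:
            records["supplementary"] += 1
        else:
            records["primary"] += 1
    return records, len(all_reads), len(mapped_reads)


# Tiny synthetic example: read1 has primary, supplementary and secondary
# records; read2 is unmapped; read3 has a single reverse-strand alignment.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tctg1\t1\t60\t*\t*\t0\t0\t*\t*",
    "read1\t2048\tctg2\t50\t60\t*\t*\t0\t0\t*\t*",
    "read1\t256\tctg3\t9\t0\t*\t*\t0\t0\t*\t*",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "read3\t16\tctg1\t200\t60\t*\t*\t0\t0\t*\t*",
]
records, n_reads, n_mapped = tally_alignments(example)
print(dict(records), n_reads, n_mapped)
```

If the number of records is much larger than the number of distinct reads, secondary or supplementary alignments are a plausible source of inflated or inconsistent totals. To apply this to real data, the lines could be read from standard input, e.g. by piping `samtools view -h` output into a script that calls `tally_alignments(sys.stdin)`.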

Sabrin2020 commented 2 years ago

How can I make the same plots with blobtools2, please?

magrgic commented 8 months ago

Hello, I have a similar issue when using long reads; did anyone manage to get around it?