telatin / bamtocov

🏔 coverage extraction from BAM/CRAM files, supporting targets 📊

https://telatin.github.io/bamtocov/

MIT License

It would be very useful to add a "--rpm" option so that users can calculate reads per million mapped reads (RPM) #7

Open kerenzhou062 opened 2 years ago

telatin commented 2 years ago

Is this related to #6, and specifically, are you interested in read counts (counting the number of reads mapped) in both issues rather than nucleotide coverage?

kerenzhou062 commented 2 years ago

Hi telatin,

These are two independent issues. For both, I would prefer read counts, that is, how many reads cover each base. For example, if 3,000,000 reads were sequenced in total and chr1:123-123 is covered by 300 reads, the RPM for this base is 100 (300 / 3,000,000 × 1,000,000).

Best, Keren

telatin commented 2 years ago

Ok, consider that bamtocov computes nucleotide coverage, while bamtocounts focuses on read counts on a target, and it has some options for normalization already built-in.

kerenzhou062 commented 2 years ago

Oh, I must have misunderstood the difference between read counts and nucleotide coverage. For both #6 and #7, what I am interested in is nucleotide coverage.

Taking the same example as above: if 3,000,000 reads were sequenced in total and chr1:123-123 is covered by 300 reads, then the nucleotide coverage for this base (chr1:123-123) is 300 and its RPM is 100.
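To make the arithmetic in this example explicit, here is a minimal Python sketch of the normalization being requested. It is not bamtocov code, and the "--rpm" option does not exist yet; the function name `coverage_to_rpm` and its parameters are hypothetical. It assumes the denominator is the total number of mapped reads, as in the issue title.

```python
def coverage_to_rpm(depth: int, total_mapped_reads: int) -> float:
    """Scale a per-base depth to reads per million mapped reads (RPM).

    Hypothetical helper for illustration only; not part of bamtocov.
    """
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    # Multiply first so the example values divide exactly.
    return depth * 1_000_000 / total_mapped_reads


# The example from this thread: 300 reads cover chr1:123-123
# out of 3,000,000 reads in total.
rpm = coverage_to_rpm(300, 3_000_000)
print(rpm)
```

With the numbers above, the depth stays 300 and the normalized value is 100, which matches the expected RPM.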

Best,