fiberseq / fibertools-rs

Tools for fiberseq data written in rust.
https://fiberseq.github.io/fibertools/fibertools.html

Add a QC command that calculates basic statistics on a ft bam #59

Closed: mrvollger closed this issue 2 months ago

mrvollger commented 2 months ago

The idea is to create text histograms that can be used as inputs to plotting programs.

The current plan is for the output to be a text file that looks like:

statistic       value   count
fiber_lengths   1000    58
fiber_lengths   1001    100
...
read_quality    0.98    1001
...
m6a_ratio       0.06    1002
...
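
For anyone wanting to feed this table into a plotting program, here is a minimal Python sketch of reading it back in. It assumes the columns are whitespace-delimited and that the header row is literally `statistic value count`. The function name `read_qc_histograms` is made up for illustration and is not part of fibertools-rs.

```python
from collections import defaultdict


def read_qc_histograms(lines):
    """Group rows of a `statistic value count` table by statistic name.

    Assumption: columns are separated by whitespace (tabs or spaces).
    Values are kept as strings because their type differs per statistic.
    """
    histograms = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank lines, the header row, and rows without three columns.
        if len(fields) != 3 or fields[0] == "statistic":
            continue
        statistic, value, count = fields
        histograms[statistic].append((value, int(count)))
    return dict(histograms)


# Rows copied from the example output above (elided rows left out).
example = """\
statistic       value   count
fiber_lengths   1000     58
fiber_lengths   1001     100
read_quality   0.98   1001
m6a_ratio    0.06    1002
"""

hists = read_qc_histograms(example.splitlines())
for name, rows in hists.items():
    print(name, rows)
```

Each entry in `hists` is a list of `(value, count)` pairs, which is the shape most histogram or bar plotting functions expect.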

I'm adding a new stat that records the number of m6as in MSPs of specific sizes, whether each MSP is a FIRE element, and the number of times that combination was seen.

statistic       value   count
...
m6a_per_msp_size        1,1,false       4451629
m6a_per_msp_size        2,2,false       123829
m6a_per_msp_size        2,3,false       347430
m6a_per_msp_size        2,4,false       306585
m6a_per_msp_size        2,5,false       189577
m6a_per_msp_size        2,6,false       182797
m6a_per_msp_size        2,7,false       169225
...

This output is added when the option --m6a-per-msp is used.
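
A rough Python sketch for splitting the comma-separated `value` field of `m6a_per_msp_size` rows, in case it helps with downstream plotting. The order of the two integers is not stated above, so the sketch returns them unlabeled. The `true` spelling for FIRE elements is also an assumption, since only `false` appears in the example. `parse_m6a_per_msp_value` is a hypothetical helper, not something in fibertools-rs.

```python
def parse_m6a_per_msp_value(value):
    """Split a value such as '2,3,false' into (int, int, bool).

    Assumptions: exactly three comma-separated fields, and the FIRE flag
    is spelled 'true' or 'false' (only 'false' is shown in the example).
    """
    first, second, fire = value.split(",")
    if fire not in ("true", "false"):
        raise ValueError(f"unexpected FIRE flag: {fire!r}")
    return int(first), int(second), fire == "true"


# Rows copied from the example output above: (value, count).
rows = [
    ("1,1,false", 4451629),
    ("2,2,false", 123829),
    ("2,3,false", 347430),
]

parsed = [(parse_m6a_per_msp_value(value), count) for value, count in rows]
for (first, second, is_fire), count in parsed:
    print(first, second, is_fire, count)
```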