zavolanlab / htsinfer

Infer metadata for your downstream analysis straight from your RNA-seq data
Apache License 2.0

Expand inferred read length stats #132

Closed balajtimate closed 10 months ago

balajtimate commented 10 months ago

Is your feature request related to a problem? Please describe.

In many cases, the read length reported in SRA does not match the inferred read length results (e.g. SRR22397141, reported: 147 bp, inferred min: 35, max: 151). In PE samples, the reported length is sometimes the sum of the maximum inferred lengths of the forward and reverse reads, but not always (e.g. SRR19776630, reported: 585 (292 + 293), inferred min: 35, max: 301 for both reads). More generally, it would be useful to get more information about the distribution of read lengths in the sample. For example, a median close to the minimum could raise red flags about the quality of the sequencing.

Describe the solution you'd like

Add the mean, median and mode of the read lengths to the reported library stats.
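A minimal sketch of what the requested statistics could look like, using only the Python standard library. This is not HTSinfer's implementation: the function name `read_length_stats` and the keys of the returned dictionary are hypothetical, and the input is assumed to be a plain iterable of read lengths that has already been collected from the sample.

```python
from collections import Counter
from statistics import mean, median
from typing import Dict, Iterable, List, Union


def read_length_stats(
    lengths: Iterable[int],
) -> Dict[str, Union[int, float, List[int]]]:
    """Summarize read lengths (hypothetical helper, not part of HTSinfer)."""
    values = list(lengths)
    if not values:
        raise ValueError("no read lengths provided")

    # Count how often each length occurs; keep every length that ties
    # for the highest count, so multimodal samples are not hidden.
    counts = Counter(values)
    highest = max(counts.values())
    modes = sorted(length for length, n in counts.items() if n == highest)

    return {
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 2),
        "median": median(values),
        "mode": modes,
    }


if __name__ == "__main__":
    # Toy input: mostly short reads with a few long ones.
    print(read_length_stats([35, 35, 35, 100, 151]))
```

Returning the mode as a list is a design choice in this sketch: a sample with two equally common read lengths would report both rather than an arbitrary one.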