MontgomeryLab / tinyRNA

tinyRNA provides an all-in-one solution for precision analysis of sRNA-seq data. At the core of tinyRNA is a highly flexible counting utility, tiny-count, that allows for hierarchical assignment of reads to features based on positional information, extent of feature overlap, 5’ nucleotide, length, and strandedness.
GNU General Public License v3.0

Counter: additional interval selectors have been added #167

Closed by AlexTate 2 years ago

AlexTate commented 2 years ago

Additional interval selectors have been added for `Exact`, `5' anchored`, and `3' anchored`.

The names of the anchored selectors may lead to some confusion. Positions in GFF3 and SAM files are defined using the same axis and origin regardless of strand, so the start position does not imply the 5' position. Therefore, strand is not considered when determining a match. An alignment matches a 5' anchored selector when its start position matches the feature's start position. The same is true for the end position when using a 3' anchored selector. Turns out I was right: there was certainly room for confusion here (my own). See recent comments below.
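
To make the strand-agnostic behavior described above concrete, here is a minimal sketch. The `Interval` tuple and both function names are hypothetical and are not taken from tiny-count; coordinates are assumed to lie on the shared axis with start less than end.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Hypothetical stand-in for a feature or an alignment (not a tiny-count type)."""
    start: int
    end: int
    strand: str  # '+' or '-'; deliberately ignored by the checks below


def matches_5p_anchored(feature: Interval, alignment: Interval) -> bool:
    # Strand-agnostic: only the start coordinates are compared
    return alignment.start == feature.start


def matches_3p_anchored(feature: Interval, alignment: Interval) -> bool:
    # Strand-agnostic: only the end coordinates are compared
    return alignment.end == feature.end


feature = Interval(100, 200, '-')
alignment = Interval(100, 122, '-')

# On the minus strand, coordinate 100 is the 3' terminus of both intervals,
# yet it is the "5' anchored" check that passes. This is the source of confusion.
print(matches_5p_anchored(feature, alignment))  # True
print(matches_3p_anchored(feature, alignment))  # False
```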

Closes #164

AlexTate commented 2 years ago

Interval selectors are now objects that are stored in each identity match tuple of each feature record in the StepVector. Like the other selectors, candidate alignments are evaluated by calling the selector's `__contains__()` method. This leverages the bytecode advantage of using the `in` operator (COMPARE_OP) rather than an expensive function call (LOAD+CALL_METHOD or LOAD+CALL_FUNCTION).
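
A rough sketch of the pattern, for illustration only. The `ExactInterval` class, the shape of the match tuple, and the dictionary keys are all hypothetical; the real tiny-count types may differ.

```python
import dis


class ExactInterval:
    """Hypothetical selector: matches only an alignment with identical bounds."""

    def __init__(self, start: int, end: int):
        self.start, self.end = start, end

    def __contains__(self, alignment: dict) -> bool:
        # Invoked implicitly by `alignment in selector`
        return (alignment["start"], alignment["end"]) == (self.start, self.end)


# Hypothetical match tuple: other selectors are elided, the interval selector is last
match_tuple = ("rule_0", ExactInterval(100, 122))
candidate = {"start": 100, "end": 122}

print(candidate in match_tuple[-1])  # True

# Inspect the bytecode for a membership test; opcode names vary by CPython version
dis.dis("candidate in selector")
```

The membership test compiles to a single comparison-style opcode, whereas `selector.contains(candidate)` would need an attribute load followed by a call.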

taimontgomery commented 2 years ago

Need to revise it so that 5' and 3' anchor points account for the strand the small RNA is derived from as well as the strand the feature is derived from.

AlexTate commented 2 years ago

Strand semantics have been corrected for the 3' and 5' anchored selectors.
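
For readers following along, here is one possible reading of strand-aware anchoring, sketched under stated assumptions. The thread does not spell out how the feature strand and alignment strand are combined, so the comparison rule below is an assumption, and `StrandedInterval`, `five_prime`, `three_prime`, and `five_prime_anchored` are hypothetical names. The only part that is not an assumption is that the 5' terminus of a minus-strand interval is its higher coordinate.

```python
from typing import NamedTuple


class StrandedInterval(NamedTuple):
    """Hypothetical stand-in for a feature or an alignment."""
    start: int
    end: int
    strand: str  # '+' or '-'


def five_prime(iv: StrandedInterval) -> int:
    # On the minus strand the 5' terminus is the higher coordinate
    return iv.start if iv.strand == '+' else iv.end


def three_prime(iv: StrandedInterval) -> int:
    # Mirror image of five_prime()
    return iv.end if iv.strand == '+' else iv.start


def five_prime_anchored(feature: StrandedInterval, alignment: StrandedInterval) -> bool:
    # Assumption: each interval's 5' terminus is resolved on its own strand,
    # then the two coordinates are compared
    return five_prime(alignment) == five_prime(feature)


feature = StrandedInterval(100, 200, '-')

# Shares the feature's 5' terminus (coordinate 200 on the minus strand)
print(five_prime_anchored(feature, StrandedInterval(178, 200, '-')))  # True

# Shares only the start coordinate, which is the 3' terminus on the minus strand
print(five_prime_anchored(feature, StrandedInterval(100, 122, '-')))  # False
```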

taimontgomery commented 2 years ago

very nice!