Closed AlexTate closed 1 year ago
Since this PR introduces changes that are backward incompatible, I would like to make a release for the project in its current state before this one is merged.
With this new, much improved approach to classification, won't the class and rule plots always be the same? If so, can we get rid of the rule plots? Perhaps we could also rename counts_by_rule.csv to counts_by_classification.csv and change the Rule String column to Classification.
No, class and rule plots will differ if any rules share a `Classify as...` value. Rule plots can be used in this case to see how much each rule contributed to the pooled classes. For this reason I think the proposed changes to output files would be incorrect.
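To make the distinction concrete, here is a minimal sketch of how per-rule counts pool into per-class counts when two rules share a `Classify as...` value. The rule names, class names, counts, and the `pool_by_class` helper are all made up for illustration; this is not tinyRNA's actual implementation.

```python
from collections import defaultdict

# Hypothetical mapping of each rule to its "Classify as..." value.
# Rules 1 and 2 share a class, so their counts are pooled in class plots.
rule_classes = {
    "rule 1": "siRNA",
    "rule 2": "siRNA",
    "rule 3": "miRNA",
}

# Hypothetical per-rule counts (what a rule plot would show).
counts_by_rule = {"rule 1": 120.0, "rule 2": 30.0, "rule 3": 75.0}


def pool_by_class(counts, classes):
    """Sum per-rule counts into per-class counts."""
    pooled = defaultdict(float)
    for rule, count in counts.items():
        pooled[classes[rule]] += count
    return dict(pooled)


# What a class plot would show: fewer categories than the rule plot.
counts_by_class = pool_by_class(counts_by_rule, rule_classes)
print(counts_by_class)  # {'siRNA': 150.0, 'miRNA': 75.0}
```

The per-rule table still shows that "rule 1" contributed most of the pooled siRNA count, which is the information that would be lost if the rule outputs were dropped.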
I see. In that case, perhaps we can add a counts_by_classification.csv table at some point.
Tested successfully with ram1 data.
The `Tag` column has been renamed to `Classify as...` and will be used to apply a user-defined class to features that match the rule. The `Class=` attribute is no longer used to determine a feature's class. Tagged counting semantics still apply.

The counts table produced by tiny-count therefore now has a multiindex of (Feature ID, Classifier). Backward compatibility is not offered for counts tables produced by an earlier version of tinyRNA. The Features Sheet is checked for the presence of a `Tag` column at pipeline/tiny-count startup and, if present, an error is produced along with steps to fix it.

These changes opened the door for some very satisfying improvements to the code quality in plotter.py. Two additional parameters have been added to the pipeline/tiny-plot:
`Classify as...` value. This is used in class_charts and scatter_dge_class.

Closes #240
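For readers curious what the startup check on the Features Sheet might look like, here is a rough sketch using only the standard library. The function name, the example column headers, and the wording of the error message are assumptions for illustration and are not taken from the tinyRNA code.

```python
import csv
import io


def check_for_legacy_tag_column(features_csv_text):
    """Raise if the Features Sheet header still contains a 'Tag' column.

    Hypothetical helper: the real check in tinyRNA may be implemented differently.
    """
    reader = csv.reader(io.StringIO(features_csv_text))
    header = [name.strip() for name in next(reader, [])]

    if "tag" in (name.lower() for name in header):
        # Assumed remediation text, based on the rename described in this PR.
        raise ValueError(
            "The Features Sheet contains a 'Tag' column, which is no longer "
            "supported. Rename the column to 'Classify as...' and review its "
            "values, since they now determine each feature's class."
        )
    return header


# Illustrative headers only; a real Features Sheet has more columns.
old_sheet = "Select for...,with value...,Tag\nClass,miRNA,\n"
new_sheet = "Select for...,with value...,Classify as...\nClass,miRNA,miRNA\n"

print(check_for_legacy_tag_column(new_sheet))

try:
    check_for_legacy_tag_column(old_sheet)
except ValueError as err:
    print(f"Error: {err}")
```

Failing at startup, rather than partway through counting, means an outdated sheet is reported before any work is done.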