bartongroup / RATS

Relative Abundance of Transcripts: An R package for the detection of Differential Transcript isoform Usage.
MIT License

transcript ID cross-check #65

Closed: fruce-ki closed this 5 years ago

fruce-ki commented 5 years ago

See #64, regarding the IDs in the provided look-up table not corresponding to the provided data.

fish4rodents() seems to shoe-horn the input data onto the annotation even when the IDs don't match: it inserts the annotation's IDs as NA rows and also keeps every data row whose ID is not in the annotation. As a result, the ID check at the beginning of calling DTU fails to detect the discrepancy and does not abort with an error, as it should.

fruce-ki commented 5 years ago

Rather than changing fish4rodents() to add tests, I created a stronger cross-check that is called after munging the data into shape and order. This covers all the possible input sources.

The check explicitly compares the provided or computed counts table of each condition against the provided annotation, and requires that every target_id in the count data is present in the annotation and vice versa. If even one ID is unmatched, execution aborts. If reckless is enabled, the error is downgraded and execution continues, but a warning is still shown.
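For anyone who wants to sanity-check their inputs before running anything, the two-way comparison described above can be sketched roughly as follows. This is only an illustration in base R, not the code that went into RATs: `check_ids()` is a hypothetical helper, and the `target_id` / `parent_id` column layout and the toy tables are assumptions made for the example.

```r
# Hypothetical helper illustrating a two-way ID cross-check.
# NOT the actual RATs implementation; names and layout are assumptions.
check_ids <- function(counts, annot, reckless = FALSE) {
  count_ids <- as.character(counts$target_id)
  annot_ids <- as.character(annot$target_id)

  # IDs present on one side but missing from the other.
  only_in_counts <- setdiff(count_ids, annot_ids)
  only_in_annot  <- setdiff(annot_ids, count_ids)
  n_bad <- length(only_in_counts) + length(only_in_annot)

  if (n_bad > 0) {
    msg <- sprintf(
      "Transcript ID mismatch: %d ID(s) only in the counts, %d only in the annotation.",
      length(only_in_counts), length(only_in_annot))
    if (reckless) {
      warning(msg, call. = FALSE)  # downgrade to a warning and carry on
    } else {
      stop(msg, call. = FALSE)     # even one unmatched ID aborts
    }
  }
  invisible(n_bad == 0)
}

# Toy inputs (made up for this example).
annot    <- data.frame(target_id = c("t1", "t2", "t3"),
                       parent_id = c("g1", "g1", "g2"))
counts_a <- data.frame(target_id = c("t1", "t2", "t3"), s1 = c(10, 5, 7))
counts_b <- data.frame(target_id = c("t1", "t2", "tX"), s1 = c(3, 8, 2))

check_ids(counts_a, annot)                     # IDs match in both directions
# check_ids(counts_b, annot)                   # would abort with an error
# check_ids(counts_b, annot, reckless = TRUE)  # would warn and continue
```

In this sketch the check would be run once per condition, after the counts have been put into their final shape, so that it does not matter which input route produced the table.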