IHEC / ihec-assay-standards

This repo is for code and documentation associated with the ihec-assay-standards working group
Apache License 2.0

Standardization for RNA-Seq tracks #1

Closed: sitag closed 4 years ago

sitag commented 9 years ago

Currently, various centres are generating different types of tracks for RNA experiments:
- Coverage
- RPKM
- …

This reduces the value of being able to visually interpret data at the IHEC data grid. There are valid disagreements about how best to normalize RNA-Seq data. However, at the central resource, where data from multiple centres can be browsed, the tracks should be standardized. Otherwise, the data is hard to interpret.

To address this, note that the number of exonic reads can be estimated from coverage tracks at single base pair resolution. Because the tracks are at single base pair resolution, scaling each track by this estimate is sufficient to get normalized tracks. No access to raw data is required.
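A minimal sketch of this idea, not a proposed implementation. It assumes a single chromosome held as a list of per-base coverage values, exons given as 0-based half-open intervals from whatever annotation is agreed on, and a known, uniform read length. The function names and the per-million scaling target are hypothetical choices made for illustration only.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end); an assumed convention


def merge_intervals(intervals: Sequence[Interval]) -> List[Interval]:
    """Merge overlapping exons so that no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def estimate_exonic_reads(coverage: Sequence[float],
                          exons: Sequence[Interval],
                          read_length: int) -> float:
    """Estimate exonic reads as (sum of coverage over exonic bases) / read length.

    Assumes every read contributes read_length covered bases; spliced or
    partially exonic reads make this an approximation.
    """
    exonic_bases = sum(sum(coverage[s:e]) for s, e in merge_intervals(exons))
    return exonic_bases / read_length


def normalize_track(coverage: Sequence[float],
                    exons: Sequence[Interval],
                    read_length: int,
                    per: float = 1_000_000) -> List[float]:
    """Scale a coverage track to a fixed number of estimated exonic reads."""
    n_reads = estimate_exonic_reads(coverage, exons, read_length)
    if n_reads == 0:
        raise ValueError("no exonic coverage; cannot normalize")
    scale = per / n_reads
    return [c * scale for c in coverage]


# Toy example: two exons (one listed twice with overlap), read length 4.
toy_coverage = [0, 2, 2, 2, 2, 0, 0, 4, 4, 0]
toy_exons = [(1, 5), (3, 5), (7, 9)]
toy_normalized = normalize_track(toy_coverage, toy_exons, read_length=4, per=8)
```

Because the scale factor is derived from the track itself plus the annotation, the integrating centre could apply it without requesting raw reads. The result still depends on the chosen annotation and read-length assumption.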

There will still be underlying technical artifacts arising from different alignments etc. However, the data is now at least somewhat comparable.

Strawman Proposal: Standardized tool for normalizing minimally processed coverage tracks

A standardized tool can be developed to do this. It will require agreement on what gene annotation to use, and how to estimate exonic reads.

The idea is that each centre generates coverage tracks at single bp resolution, and the centre integrating all data sets normalizes each coverage track prior to visualization.