Syksy / curatedPCaData

Bioconductor R-package: Curated Prostate Cancer Data
Creative Commons Attribution 4.0 International

Double-checking ranks of newly normalized data #25

Closed by Syksy 3 years ago

Syksy commented 3 years ago

Datasets that have been processed from raw data should be double-checked by cross-referencing them against values provided elsewhere, e.g. by cBioPortal. For example, compare the RMA-normalized Taylor et al. samples with the same samples from the MSKCC dataset in cBioPortal, or, within GEO, compare the matrices produced by our RMA normalization against the pre-normalized data. The values should correlate; if they do not, there has been a systematic error either in annotating sample names or in the methodology.
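A minimal sketch of what such a check could look like, written in plain Python for illustration rather than with the package's own R code. It computes a per-sample Spearman (rank) correlation between a re-processed matrix and a reference matrix over their shared genes, and flags samples that fall below a cutoff. Everything here is an assumption made for the example: the function names (`ranks`, `spearman`, `check_samples`), the nested-dict layout, the 0.9 cutoff, and the toy expression values, none of which come from the actual datasets.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation, i.e. Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def check_samples(ours, reference, threshold=0.9):
    """Compare matrices given as {sample: {gene: value}}.

    Only samples and genes present in both inputs are compared.
    Returns {sample: (rho, passed)}; the threshold is an arbitrary
    illustrative cutoff, not a value used by the package.
    """
    report = {}
    for sample in sorted(set(ours) & set(reference)):
        genes = sorted(set(ours[sample]) & set(reference[sample]))
        x = [ours[sample][g] for g in genes]
        y = [reference[sample][g] for g in genes]
        rho = spearman(x, y)
        report[sample] = (rho, rho >= threshold)
    return report


# Toy values, invented for illustration only. S2 is deliberately
# discordant to mimic a mislabeled sample.
ours = {
    "S1": {"AR": 9.1, "TP53": 7.4, "PTEN": 5.2, "MYC": 8.0},
    "S2": {"AR": 6.0, "TP53": 7.5, "PTEN": 8.8, "MYC": 5.1},
}
reference = {
    "S1": {"AR": 11.3, "TP53": 9.2, "PTEN": 6.0, "MYC": 10.1},
    "S2": {"AR": 8.0, "TP53": 6.9, "PTEN": 5.5, "MYC": 9.2},
}

report = check_samples(ours, reference)
for sample, (rho, passed) in report.items():
    print(sample, round(rho, 3), "ok" if passed else "CHECK")
```

In this toy input S1 keeps the same gene ordering in both sources and passes, while S2 has its ordering reversed and is flagged. Rank correlation is used because the two sources may sit on different scales after normalization; a sample that fails while its neighbours pass points towards a sample-name mix-up, whereas uniformly low values across all samples point towards a methodological problem.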

Syksy commented 3 years ago

Going into version 0.7, all current datasets have been re-processed and checked to confirm that their gene values still correlate well with those in the originally presented source (for example, a direct download from GEO).