Wrong sample counts in Data Sets overview

cBioPortal / cbioportal

cBioPortal for Cancer Genomics

https://cbioportal.org

GNU Affero General Public License v3.0


Wrong sample counts in Data Sets overview #543

Closed pieterlukasse closed 5 years ago

pieterlukasse commented 8 years ago

When a study contains multiple datasets of the same type (same genetic_profile.datatype), the sample counts in the Data Sets overview UI (see screenshot) may be incorrect.

pieterlukasse commented 8 years ago

overlaps with #244 and #249

pieterlukasse commented 8 years ago

Actually, dataset counts are problematic in general, since they currently rely on data administrators correctly setting the stable_id of case lists, as (not yet completely) documented here: https://github.com/cBioPortal/cbioportal/blob/rc/docs/File-Formats.md#case-lists
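To illustrate the dependency on stable_id naming, here is a minimal sketch, not the portal's actual code. It assumes the overview picks the case list for each column by matching a suffix on the case list stable_id; the suffix table, the study ID `study_x`, and the helper `count_by_suffix` are all hypothetical.

```python
# Hypothetical sketch: derive per-column sample counts from case-list stable_ids.
# The suffix-to-column mapping and all IDs below are illustrative assumptions.
SUFFIX_TO_COLUMN = {
    "_sequenced": "Sequenced",
    "_cna": "CNA",
    "_rna_seq_mrna": "RNA-Seq",
}


def count_by_suffix(study_id, case_lists):
    """Return {column: sample count}, using only the stable_id to pick a list."""
    counts = {column: 0 for column in SUFFIX_TO_COLUMN.values()}
    for stable_id, sample_ids in case_lists.items():
        for suffix, column in SUFFIX_TO_COLUMN.items():
            if stable_id == study_id + suffix:
                counts[column] = len(sample_ids)
    return counts


# Same data, but the second study's RNA-Seq case list has a non-standard name.
correct = {
    "study_x_sequenced": ["s1", "s2"],
    "study_x_rna_seq_mrna": ["s1", "s2", "s3"],
}
misnamed = {
    "study_x_sequenced": ["s1", "s2"],
    "study_x_rnaseq": ["s1", "s2", "s3"],
}

print(count_by_suffix("study_x", correct))   # {'Sequenced': 2, 'CNA': 0, 'RNA-Seq': 3}
print(count_by_suffix("study_x", misnamed))  # {'Sequenced': 2, 'CNA': 0, 'RNA-Seq': 0}
```

Under this assumption, a case list whose stable_id does not follow the expected naming silently shows up as 0 samples, even though the data is loaded.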

Part of the solution might be a real link (referential constraints) between the genetic_profile and sample_list tables. Another part of the solution can be found in the discussion in #494.
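A rough sketch of what such a referential constraint could look like, using an in-memory SQLite database. The schema is a simplified assumption, not the real cBioPortal schema: the `sample_list_member` table, the `sample_list_id` column, and all IDs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Simplified, hypothetical schema: each profile points at exactly one sample list.
conn.executescript("""
CREATE TABLE sample_list (
    list_id   INTEGER PRIMARY KEY,
    stable_id TEXT NOT NULL UNIQUE
);
CREATE TABLE sample_list_member (
    list_id   INTEGER NOT NULL REFERENCES sample_list(list_id),
    sample_id TEXT NOT NULL,
    PRIMARY KEY (list_id, sample_id)
);
CREATE TABLE genetic_profile (
    genetic_profile_id INTEGER PRIMARY KEY,
    stable_id          TEXT NOT NULL UNIQUE,
    datatype           TEXT NOT NULL,
    sample_list_id     INTEGER NOT NULL REFERENCES sample_list(list_id)
);
""")

conn.executemany("INSERT INTO sample_list VALUES (?, ?)",
                 [(1, "study_x_list_a"), (2, "study_x_list_b")])
conn.executemany("INSERT INTO sample_list_member VALUES (?, ?)",
                 [(1, "s1"), (1, "s2"), (2, "s2"), (2, "s3")])
# Two profiles with the same datatype in one study (illustrative values).
conn.executemany("INSERT INTO genetic_profile VALUES (?, ?, ?, ?)",
                 [(1, "study_x_profile_a", "Z-SCORE", 1),
                  (2, "study_x_profile_b", "Z-SCORE", 2)])


def samples_per_datatype(connection):
    """Count distinct samples per datatype by following the explicit link."""
    rows = connection.execute("""
        SELECT gp.datatype, COUNT(DISTINCT m.sample_id)
        FROM genetic_profile gp
        JOIN sample_list_member m ON m.list_id = gp.sample_list_id
        GROUP BY gp.datatype
    """)
    return dict(rows.fetchall())


print(samples_per_datatype(conn))  # {'Z-SCORE': 3}

# A profile that references a non-existent sample list is rejected outright.
try:
    conn.execute("INSERT INTO genetic_profile VALUES (3, 'study_x_profile_c', 'Z-SCORE', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

With an explicit link, the count no longer depends on naming conventions, and overlapping samples across two profiles of the same datatype are counted once.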

pieterlukasse commented 5 years ago

@paulcwvandijk is this still an issue in your recent experience?

paulcwvandijk commented 5 years ago

@pieterlukasse I haven't come across this problem yet. I will reopen this issue if such a situation arises.

pieterlukasse commented 5 years ago

@jjgao what about

Glioblastoma (TCGA, Nature 2008)
TCGA, Nature 2008 206 91 206 0

Why does it report 0 samples for RNA-Seq? From the heatmap it looks like it has data.

jjgao commented 5 years ago

That's microarray data, not RNA-Seq.