BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

Inconsistent vital_status on TCGA-PAAD #571

Closed fkgruber closed 1 year ago

fkgruber commented 1 year ago

I downloaded TCGA-PAAD with TCGAbiolinks but found some inconsistencies in the definition of vital status. Three different files contain vital status information, and they don't completely match: clinic_indexed.csv, clinic_followup.csv, and clinic_patient.csv.

[image]
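The comparison described above can be sketched as a count of patients per combination of the three vital status values. This is only an illustration: the data frame below is toy data, and the column names mirror the ones used later in this thread rather than anything confirmed about the CSV files.

```r
# Toy merged table (made-up rows); in practice each column would come
# from one of the three CSV files, joined on the patient barcode.
status <- data.frame(
  vital_status_indexed  = c("Dead", "Dead", "Alive", "Dead", "Alive"),
  vital_status_patient  = c("Alive", "Dead", "Alive", "Alive", "Alive"),
  vital_status_followup = c("Alive", "Dead", "Alive", "Dead", "Alive"),
  stringsAsFactors = FALSE
)

# Count patients for each observed combination of the three values.
counts <- aggregate(n ~ ., data = transform(status, n = 1L), FUN = sum)
print(counts)
```

Rows where the three columns disagree are the ones worth inspecting by hand.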

tiagochst commented 1 year ago

Hi,

Indexed (vital_status_indexed) should be the same as the latest follow-up information. If there is no follow-up, the initial vital status (vital_status_patient) is used.

The only concern there is the 3rd row (dead, alive, alive) with 4 samples.

A patient can have multiple follow-up visits. Does "Alive" match the latest one?
Otherwise, we would need to check with GDC. Can you provide the 4 sample IDs?
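The rule described in this comment (the indexed status follows the latest follow-up and falls back to the initial patient status) can be sketched in base R. This is a minimal illustration under stated assumptions, not the TCGAbiolinks or GDC implementation: the barcodes, the `followup_date` column, and all values are hypothetical.

```r
# Toy data; barcodes, column names, and values are hypothetical.
patient <- data.frame(
  barcode = c("P1", "P2", "P3"),
  vital_status = c("Alive", "Alive", "Dead"),
  stringsAsFactors = FALSE
)
followup <- data.frame(
  barcode = c("P1", "P1", "P2"),
  followup_date = as.Date(c("2012-01-10", "2013-06-02", "2012-03-15")),
  vital_status = c("Alive", "Dead", "Alive"),
  stringsAsFactors = FALSE
)

# Keep only the most recent follow-up row for each patient.
followup <- followup[order(followup$barcode, followup$followup_date), ]
latest <- followup[!duplicated(followup$barcode, fromLast = TRUE), ]

# Use the latest follow-up status when one exists, else the initial status.
idx <- match(patient$barcode, latest$barcode)
patient$vital_status_indexed <- ifelse(
  is.na(idx), patient$vital_status, latest$vital_status[idx]
)
print(patient)
```

Here P1 has two follow-ups and takes the later one, while P3 has none and keeps its initial status.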

fkgruber commented 1 year ago

Sure here they are:

| vital_status_indexed | vital_status_patient | vital_status_followup | N | Pats |
| --- | --- | --- | --- | --- |
| Dead | Alive | Alive | 4 | TCGA-FB-A545, TCGA-IB-7891, TCGA-FB-A5VM, TCGA-F2-7273 |

tiagochst commented 1 year ago

The last follow-up for TCGA-FB-A545 is indeed "Dead" in the file nationwidechildrens.org_clinical_follow_up_v4.4_paad.txt

https://portal.gdc.cancer.gov/cases/b0cb81ad-3c20-4d56-ab7d-f64c0caee1ce

[Screenshot 2023-04-14 at 4:52:23 PM]

tiagochst commented 1 year ago

Can you send me the code you used to get the file clinic_followup.csv?

fkgruber commented 1 year ago

Sure:

library(TCGAbiolinks)

# fileDir is a user-defined base directory and must be set beforehand.
clinic <- GDCquery(project = "TCGA-PAAD", data.category = "Clinical", file.type = "xml")
GDCdownload(clinic, directory = paste0(fileDir, "raw"))
clinic.followup <- GDCprepare_clinic(clinic, "follow_up", directory = paste0(fileDir, "raw"))

 
tiagochst commented 1 year ago

Thanks. The clinic.followup data frame has the death information for those 4 cases.
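One way to run this kind of check is to list every follow-up row for the patients in question. The helper below is a hypothetical sketch; the column names `bcr_patient_barcode`, `vital_status`, and `days_to_death` are assumptions about the follow-up data frame, and the toy rows are made up so that the example runs without downloading anything.

```r
# Hypothetical helper: list all follow-up rows for a set of patients.
# The default column names are assumptions, not confirmed by this thread.
followup_for_cases <- function(followup, barcodes,
                               id_col = "bcr_patient_barcode",
                               cols = c("vital_status", "days_to_death")) {
  keep <- followup[[id_col]] %in% barcodes
  followup[keep, c(id_col, cols), drop = FALSE]
}

# Toy stand-in for clinic.followup (made-up barcodes and values).
toy <- data.frame(
  bcr_patient_barcode = c("TCGA-XX-0001", "TCGA-XX-0001", "TCGA-XX-0002"),
  vital_status = c("Alive", "Dead", "Alive"),
  days_to_death = c(NA, 512, NA),
  stringsAsFactors = FALSE
)
hits <- followup_for_cases(toy, "TCGA-XX-0001")
print(hits)
```

With the real data, the same call would take clinic.followup and the four barcodes listed earlier in the thread, provided those columns exist in the data frame.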

[Screenshot 2023-04-17 at 10:54:07 AM]

fkgruber commented 1 year ago

Got it, thanks.