BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

About the p-value in "TCGAanalyze_survival" #117

Open xiexiaowei opened 7 years ago

xiexiaowei commented 7 years ago

Hi, thanks for your great work! It seems that "TCGAbiolinks" has been updated, or something is wrong with my system. I have three questions:

  1. Is the p-value from "TCGAanalyze_survival" based on the log-rank test?
  2. For the same dataset, I get different p-values from the ones I got last year. Which is correct? Should I discard last year's p-values and recompute them?
  3. I'm not very familiar with R. In the survival figure, how can I change the current background (with a grid) to a blank background? This is the R code for the survival figure:

    library(TCGAbiolinks)
    clin.test <- read.csv("test.csv", sep = ",", header = TRUE)
    TCGAanalyze_survival(clin.test, clusterCol = "group", risk.table = FALSE, conf.int = FALSE)

I'm looking forward to your reply! Thanks again very much!

tiagochst commented 7 years ago

Hello,

Sorry for the late reply.

  1. There was only one change to the code that could have affected how the p-value is calculated, and it was made 7 months ago. The latest version of the code uses the survminer package. The documentation of the function (?ggsurvplot) says:

By default survdiff is used to calculate regular log-rank test.

The original code also used the log-rank test.
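
For reference, a regular log-rank p-value can be computed directly with `survival::survdiff`, which makes it possible to cross-check the number shown on the plot. This is only a minimal sketch: it uses the `lung` example dataset shipped with the survival package as a stand-in for a TCGA clinical table, and `sex` as a stand-in for the grouping column (both are assumptions for illustration, not part of TCGAbiolinks).

```r
library(survival)

# Stand-in data: survival::lung is NOT TCGA data, it is only used for illustration.
# 'sex' plays the role of the grouping column (clusterCol in TCGAanalyze_survival).
fit <- survdiff(Surv(time, status) ~ sex, data = survival::lung)

# survdiff with the default rho = 0 is the regular log-rank test.
# The p-value comes from a chi-square distribution with (number of groups - 1) df.
p_value <- pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
print(p_value)
```

Running the same calculation on your own table (with your time, status and group columns) should give a p-value close to the one on the figure if the plot is indeed using the regular log-rank test.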

Unfortunately, I'm not an expert in survival analysis; the original code was the same as the one used in this paper. Just so I know, how different are the p-values? Also, the survminer maintainers might be able to give a better answer, especially if the versions you compared are less than 7 months apart.

Codes:

  1. I had to change the code, so you might need to update the package:
    devtools::install_github("BioinformaticsFMRP/TCGAbiolinks")

    You can add a theme from ggthemes as follows:

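    library(TCGAbiolinks)

    # Download the clinical data for the TCGA-LGG project
    clin <- GDCquery_clinic("TCGA-LGG", type = "clinical", save.csv = FALSE)

    # Survival plot grouped by gender, using a theme without a background grid
    TCGAanalyze_survival(clin,
                         clusterCol = "gender",
                         risk.table = FALSE,
                         conf.int = FALSE,
                         ggtheme = ggthemes::theme_few())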
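
If ggthemes is not installed, a theme from ggplot2 itself should give a similar blank background. The sketch below calls `survminer::ggsurvplot` directly, since that is the function mentioned above; the `lung` example dataset and the `sex` grouping column are stand-ins chosen for illustration, not TCGA data.

```r
library(survival)
library(survminer)

# Stand-in data: survival::lung is only used for illustration.
fit <- survfit(Surv(time, status) ~ sex, data = survival::lung)

# theme_classic() draws a white background without grid lines.
p <- ggsurvplot(
  fit,
  data = survival::lung,
  pval = TRUE,          # show the log-rank p-value on the plot
  risk.table = FALSE,
  conf.int = FALSE,
  ggtheme = ggplot2::theme_classic()
)
print(p)
```

Passing `ggtheme = ggplot2::theme_classic()` to `TCGAanalyze_survival` in place of `ggthemes::theme_few()` would presumably work the same way, given that the argument accepts a ggplot2 theme object.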

DarioS commented 7 years ago

When did you run the previous analysis last year? In mid-2016, the TCGA dataset was reprocessed with different algorithms and a newer genome build, hg38, and moved to the Genomic Data Commons. You probably used data that was mapped to a different reference genome version (hg19) and analysed with different algorithms (e.g. RSEM previously, HTSeq currently), so the survival results would be expected to differ.