BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

TCGAquery_clinic error #8

Closed needleworm closed 7 years ago

needleworm commented 8 years ago

Hi. For the past several days, TCGAquery_clinic has not been working; it throws the error message below:

clinical_brca_data <- TCGAquery_clinic("brca","clinical_patient")
Error in fread(paste0(root, url, "/", files[grep("MANIFEST", files)]), :
  Expected sep (',') but new line or EOF ends field 1 on line 33 when reading data: -->
In addition: Warning messages:
1: In fread(paste0(root, url, "/", files[grep("MANIFEST", files)]), :
  Unable to find 5 lines with expected number of columns (+ middle)
2: In fread(paste0(root, url, "/", files[grep("MANIFEST", files)]), :
  Unable to find 5 lines with expected number of columns (+ last)

What should I do to fix this problem?

tiagochst commented 8 years ago

This happens because of a maintenance problem on the server. Other packages that rely on the same server will fail in the same way. There is no easy way to fix our package until the server is back.

In the meantime, you can use the RTCGA package (I believe it has downloaded all the data and put it into Bioconductor) or the RTCGAToolbox package (which uses GDAC Firehose as its source) to get the old data.

rolfhaut commented 7 years ago

Hi,

Yeah, it is really a pity that the package is currently not working, especially as I just started to get a grasp of it (I'm actually very new to R). I contacted the NCBI site, but they weren't able to tell me when it will be back up. However, they told me that the TCGA data portal will go away in July and that the data will then be available through GDC. So I guess the code will have to be modified accordingly, and I hope that you guys keep up the good work of enabling inexperienced users like myself to perform analyses on TCGA data.

labrazil commented 7 years ago

Thank you for your interest in our package. Yes, we are aware of this change and have worked out a solution that will use GDC. Stay tuned; we hope to have something ready in the coming days.


tiagochst commented 7 years ago

Our team has updated TCGAbiolinks to work with the new GDC portal. We had to rewrite almost all of the code, but we hope this will improve the tool, which we remain committed to maintaining. Unfortunately, the changes are not yet in the Bioconductor repository, but we anticipate that the Bioconductor version will be updated within the coming weeks. For the moment, if you need the new package, you can install it from our GitHub repository with the following command:

devtools::install_github(repo = "BioinformaticsFMRP/TCGAbiolinks")

To get clinical data you have two options:

The first one gets the indexed clinical data, which is the same data you get from "Download clinical" in the GDC data portal. This function returns less information, but it finishes within a few minutes.

clin <- GDCquery_clinic("TCGA-BRCA", type = "clinical", save.csv = TRUE)
biospecimen <- GDCquery_clinic("TCGA-BRCA", type = "biospecimen", save.csv = TRUE)
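
Once `clin` is loaded, a quick look at its dimensions and a tabulation of one column can serve as a sanity check. The sketch below uses a tiny made-up data frame in place of the real result so that it runs without a download. The column names `submitter_id` and `vital_status` are assumptions about the indexed clinical table, so check `colnames(clin)` on the real data first.

```r
# Stand-in for the data frame returned by GDCquery_clinic(); values are made up
# and the column names are assumptions, not confirmed by this thread.
clin <- data.frame(
  submitter_id = c("TCGA-XX-0001", "TCGA-XX-0002", "TCGA-XX-0003"),
  vital_status = c("Alive", "Dead", "Alive"),
  stringsAsFactors = FALSE
)

# Number of patients (rows) and clinical fields (columns).
print(dim(clin))

# Count patients per vital status, including missing values if there are any.
status_counts <- table(clin$vital_status, useNA = "ifany")
print(status_counts)
```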

The second option is to parse the clinical XML files, which gives you all of the clinical information. However, downloading all of the XML files takes some time.

query <- GDCquery(project = "TCGA-BRCA", data.category = "Clinical")
GDCdownload(query)
clinical <- GDCprepare_clinic(query, clinical.info = "patient")
clinical.drug <- GDCprepare_clinic(query, clinical.info = "drug")
clinical.radiation <- GDCprepare_clinic(query, clinical.info = "radiation")
clinical.admin <- GDCprepare_clinic(query, clinical.info = "admin")
clinical.followup <- GDCprepare_clinic(query, clinical.info = "follow_up")
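
Each of these calls returns a separate table, so linking, say, drug records to patients requires a join on a shared patient identifier. The following is a minimal sketch with made-up stand-in data frames so that it runs without a download. It assumes both tables carry a `bcr_patient_barcode` column, and `gender` and `drug_name` are likewise illustrative column names, so check them against `colnames()` of the real tables.

```r
# Stand-ins for the outputs of GDCprepare_clinic(); values are made up and
# the column names are assumptions, not confirmed by this thread.
clinical <- data.frame(
  bcr_patient_barcode = c("TCGA-XX-0001", "TCGA-XX-0002"),
  gender = c("FEMALE", "FEMALE"),
  stringsAsFactors = FALSE
)
clinical.drug <- data.frame(
  bcr_patient_barcode = c("TCGA-XX-0001", "TCGA-XX-0001"),
  drug_name = c("drug A", "drug B"),
  stringsAsFactors = FALSE
)

# Left join: keep every patient, with one row per drug record where present.
# Patients without any drug record get NA in the drug columns.
patient_drug <- merge(clinical, clinical.drug,
                      by = "bcr_patient_barcode", all.x = TRUE)
print(patient_drug)
```

A left join (`all.x = TRUE`) is used here so that patients with no drug record are not silently dropped; a patient with several drug records appears on several rows.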