ropensci / unconf14

Repo to brainstorm ideas (unconference style) for the rOpenSci hackathon.

Nicer interface for TCGA firehose_get command #14

Closed by cadeParade 5 years ago

cadeParade commented 10 years ago

The Cancer Genome Atlas (TCGA) has tons of genetic data from tumor samples collected from patients, as well as clinical information about each sample. Getting these files is a pretty painful ordeal, and correlating the clinical data with the sample data takes a lot of time and R manipulation. A tool could definitely automate this in a nice way.
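To make the "correlating" step concrete, here is a minimal sketch in plain Python of attaching clinical records to sample records. It assumes the usual TCGA barcode layout, where the first three dash-separated fields identify the patient and the fourth identifies the sample. The function names, column names, and rows below are hypothetical and made up for illustration; they are not taken from firehose_get output or from any existing package. The same idea carries over to a `merge()` on a derived patient column in R.

```python
# Hypothetical helpers and made-up rows, for illustration only.

def patient_id(sample_barcode):
    """Return the patient-level prefix of a TCGA-style barcode.

    Assumes the first three dash-separated fields identify the patient.
    """
    return "-".join(sample_barcode.split("-")[:3])


def attach_clinical(samples, clinical):
    """Left-join sample records to clinical records on the patient barcode."""
    by_patient = {row["patient"]: row for row in clinical}
    merged = []
    for sample in samples:
        pid = patient_id(sample["barcode"])
        # Samples with no matching clinical record keep only their own fields.
        clinical_row = by_patient.get(pid, {})
        extra = {k: v for k, v in clinical_row.items() if k != "patient"}
        merged.append({**sample, "patient": pid, **extra})
    return merged


# Made-up example data: two samples from one patient, one from another.
samples = [
    {"barcode": "TCGA-AA-0001-01A", "expression": 2.5},
    {"barcode": "TCGA-AA-0001-11A", "expression": 1.1},
    {"barcode": "TCGA-BB-0002-01A", "expression": 3.7},
]
clinical = [{"patient": "TCGA-AA-0001", "vital_status": "alive"}]

merged = attach_clinical(samples, clinical)
```

A left join is used here so that samples without clinical data are kept rather than silently dropped; whether that is the right choice depends on the analysis.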

karthik commented 10 years ago

Hi @cadeParade, would you like to join us?

cadeParade commented 10 years ago

Sure!

karthik commented 10 years ago

Awesome! Please add yourself to our participant page and I'll send out more details once we get closer.

sahilseth commented 9 years ago

Was wondering if a tool was ever released for this issue?

sckott commented 9 years ago

@cadeParade I don't think we worked on this at the hackathon. Did you work on this at all?

sahilseth commented 9 years ago

I made a package to get and parse data from https://browser.cghub.ucsc.edu (using cgquery's XMLs), but it would be nice to get and parse firehose data.

sckott commented 9 years ago

@sahilseth @cadeParade I just found https://wiki.nci.nih.gov/display/TCGA/TCGA+DCC+Web+Service+User%27s+Guide#TCGADCCWebServiceUser'sGuide-Access and associated pages

But is that not what you mean by firehose? Is this more what you mean? https://confluence.broadinstitute.org/display/GDAC/Download

sckott commented 9 years ago

Here's the source of the firehose_get script: https://gist.github.com/sckott/57ab7ebcf43ff248b624

cadeParade commented 9 years ago

Hi all. Regrettably, I did not end up going to the hackathon and have not made the aforementioned package :( I did start a project to do something similar; here is the repo: https://github.com/cadeParade/cancer_genome_api_utils It downloads and compiles data from the cBioPortal API (cBioPortal contains much of TCGA), with wrappers that made more sense to me. That page has some documentation on how to use the functions in the files, but it is not released as a package. I made the program to ease my own work compiling this sort of data, but please use it if it is helpful for you!