greenelab / pancancer

Building classifiers using cancer transcriptomes across 33 different cancer-types
BSD 3-Clause "New" or "Revised" License
When will the data be available? #74

Closed bluesea2017 closed 6 years ago

bluesea2017 commented 6 years ago

Hello, we are very interested in the pancancer project, but we cannot get the data.

When will the data be available?

We look forward to the data being made public.

Thank you

gwaybio commented 6 years ago

Hi @bluesea2017

The TCGA PanCanAtlas data is already publicly available and can be downloaded from UCSC Xena.

We are currently working on adding the data to this repository as well.

bluesea2017 commented 6 years ago

Hi @gwaygenomics, we still cannot download these data files: pancan_mutation_freeze.tsv, mutation_burden_freeze.tsv, and sample_freeze.tsv. How can we download them?

gwaybio commented 6 years ago

Hi @bluesea2017

We've updated the repository in #75 to include all data that were used in our publications. pancancer_classifier.py can now be used. Note, however, that many additional scripts may not work, since we compressed the files to store them with git lfs (many scripts may expect a different file extension). We are continuing to work on updating them.
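For scripts that still expect the uncompressed file name, a small fallback reader is one possible workaround. The following is only a minimal sketch, not code from the repository: the helper names `open_table` and `read_rows` are hypothetical, and the `.gz` suffix is an assumption about how the files were compressed, so adjust it to whatever extension appears in your checkout.

```python
import csv
import gzip
from pathlib import Path


def open_table(path):
    """Open a TSV for reading, falling back to a gzip-compressed copy.

    Assumption: the compressed copy is named "<original name>.gz".
    """
    path = Path(path)
    # Prefer the exact path; otherwise look for the same name plus ".gz".
    candidate = path if path.exists() else path.with_name(path.name + ".gz")
    if not candidate.exists():
        raise FileNotFoundError(f"neither {path} nor {candidate} exists")
    if candidate.suffix == ".gz":
        # "rt" decompresses and decodes to text in one step.
        return gzip.open(candidate, mode="rt", encoding="utf-8", newline="")
    return open(candidate, mode="r", encoding="utf-8", newline="")


def read_rows(path):
    """Return all rows of a tab-separated file as lists of strings."""
    with open_table(path) as handle:
        return list(csv.reader(handle, delimiter="\t"))
```

With this in place, a call such as `read_rows("sample_freeze.tsv")` would read either the plain file or a compressed `sample_freeze.tsv.gz` sitting next to it, whichever exists.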

Thanks! Greg

juechenyang commented 6 years ago

Hi @gwaygenomics

I am still confused about how to get this data:

mutation_burden_freeze.tsv

Could you help me solve this problem? Thank you

gwaybio commented 6 years ago

Thanks for your interest in the project @juechenyang

These data can be accessed after running initialize.sh. Specifically, mutation_burden_freeze.tsv is generated by scripts/initialize/process_sample_freeze.py.

I think I could make this easier, however. See #80.
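To confirm that initialize.sh produced the files discussed in this thread, a quick check along these lines may help. It is a sketch under stated assumptions: the function `missing_freeze_files` is hypothetical, the directory passed to it (for example `data`) is a guess about where the files land, and the optional `.gz` suffix is likewise an assumption.

```python
from pathlib import Path

# File names taken from the comments in this thread.
FREEZE_FILES = (
    "pancan_mutation_freeze.tsv",
    "mutation_burden_freeze.tsv",
    "sample_freeze.tsv",
)


def missing_freeze_files(data_dir, suffixes=("", ".gz")):
    """Return the freeze files not found in data_dir under any suffix.

    Assumption: a file counts as present if it exists either under its
    plain name or with one of the given suffixes appended.
    """
    data_dir = Path(data_dir)
    return [
        name
        for name in FREEZE_FILES
        if not any((data_dir / (name + suffix)).exists() for suffix in suffixes)
    ]
```

An empty list from `missing_freeze_files("data")` would mean all three files were found; any names returned are the ones still to be generated or downloaded.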

juechenyang commented 6 years ago

@gwaygenomics Thanks so much for your help!