greenelab / pancancer

Building classifiers using cancer transcriptomes across 33 different cancer-types
BSD 3-Clause "New" or "Revised" License

missing file: mutation__gene_x_ccle_cellline.gct from ras_cell_line_predictions.ipynb #101

Open nvk747 opened 5 years ago

nvk747 commented 5 years ago

Hi,
I was working through ras_cell_line_predictions.ipynb and found that the following file is missing from the data: mutation__gene_x_ccle_cellline.gct. I checked the onco-gps-paper-analysis data folder [https://github.com/UCSD-CCAL/onco_gps_paper_analysis/tree/master/data] but could not find it there. I also looked for the file among the CCLE datasets [https://portals.broadinstitute.org/ccle/data]. Please let me know if there is another place to download it. Regards, vijay

Refer to step:

Load CCLE Mutation Data

ccle_mut_file_name = os.path.join('..', '..', 'onco-gps-paper-analysis', 'data', 'mutation__gene_x_ccle_cellline.gct')
ccle_all_mut_df = pd.read_table(ccle_mut_file_name, skiprows=2, index_col=0)
ccle_all_mut_df.shape
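For context on why the step above passes skiprows=2: a GCT file starts with a version line and a dimensions line before the tab-separated header row. The following is a minimal sketch of that loading pattern, not the notebook's actual code. The helper name `read_gct` and the tiny in-memory example (rows, cell-line names) are made up for illustration.

```python
import io
import os

import pandas as pd


def read_gct(path_or_buffer):
    """Read a GCT matrix into a DataFrame indexed by its first column.

    Hypothetical helper for illustration; not part of the pancancer repo.
    """
    # Fail early with a clear message when a path is given but the file is absent.
    if isinstance(path_or_buffer, str) and not os.path.exists(path_or_buffer):
        raise FileNotFoundError(f"GCT file not found: {path_or_buffer}")
    # Skip the version line ("#1.2") and the dimensions line, then parse
    # the remaining tab-separated table.
    return pd.read_csv(path_or_buffer, sep="\t", skiprows=2, index_col=0)


# Made-up GCT content: 2 rows, 2 cell lines, plus the Description column.
example_gct = io.StringIO(
    "#1.2\n"
    "2\t2\n"
    "Name\tDescription\tCELL_A\tCELL_B\n"
    "KRAS_MUT\tdesc\t1\t0\n"
    "NRAS_MUT\tdesc\t0\t1\n"
)
example_df = read_gct(example_gct)
print(example_df.shape)
```

Passing a real file path instead of the in-memory buffer triggers the existence check first, which makes a missing data file easier to diagnose than a parser error.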

gwaybio commented 4 years ago

Hi @nvk747, sorry for the extremely late reply (nearly 1 year!). I am just seeing this now. Were you able to find the data? Are you still interested?

nvk747 commented 4 years ago

Hi @gwaygenomics, thanks for responding, even after a long time. I have not found that file. Instead, I used an alternative file (CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct from the CCLE website), but I am not sure whether it is appropriate for this purpose. In any case, if you have the file mutation__gene_x_ccle_cellline.gct, please let me know.

Regards, vijay

gwaybio commented 4 years ago

Hi Vijay,

I realized the answer to your question can be found here: UCSD-CCAL/onco_gps_paper_analysis#7

From @KwatME:

mutation__gene_x_ccle_cellline.csv is inside the zip file that spro download gets. Running the same notebook, 1 Set up data.ipynb, should unzip it. I've attached the line that does the unzip below. Alternatively, if you want to download the zip file yourself and unzip it, here is the link.
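The unzip line and download link mentioned in the quote are not reproduced in this thread. As a rough sketch of the manual route (download the zip yourself, then pull out the one file), something like the following could work with Python's standard zipfile module. The helper name `extract_matching`, the archive name demo_data.zip, and the internal data/ folder are assumptions for illustration, not the actual layout of the onco-gps archive.

```python
import os
import tempfile
import zipfile


def extract_matching(zip_path, filename, dest_dir):
    """Extract the first archive member whose basename equals `filename`.

    Hypothetical helper; returns the path of the extracted file.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if os.path.basename(member) == filename:
                return archive.extract(member, path=dest_dir)
    raise FileNotFoundError(f"{filename} not found in {zip_path}")


# Build a small stand-in archive so the sketch runs on its own.
# The real archive name and folder layout are unknown here.
work_dir = tempfile.mkdtemp()
demo_zip = os.path.join(work_dir, "demo_data.zip")
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr(
        "data/mutation__gene_x_ccle_cellline.csv",
        "gene,CELL_A\nKRAS_MUT,1\n",
    )

extracted_path = extract_matching(
    demo_zip, "mutation__gene_x_ccle_cellline.csv", work_dir
)
print(extracted_path)
```

Matching on the basename avoids having to know the folder structure inside the archive in advance.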

Hope this helps!

KwatMDPhD commented 4 years ago

Thanks for the pin @gwaygenomics.

@nvk747, using CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct is appropriate. This file is not used for the main modeling, only for annotating the patterns found in the analysis.

Let me know if you have any other questions.