jharenza closed 2 years ago
Attached in the zip file are the plots from rerunning the oncoprint landscape module. To generate these plots, consensus_wgs_plus_cnvkit_wxs_autosomes.tsv.gz and consensus_wgs_plus_cnvkit_wxs_x_and_y.tsv.gz were used in place of consensus_seg_annotated_cn_autosomes.tsv.gz and consensus_seg_annotated_cn_x_and_y.tsv.gz. The percentage of alteration remained the same across all the plots. The module was run on EC2.
I am unable to run the tp53_nf1_score module. When I run it within the Docker environment, I get the following error message:
/usr/local/lib/python3.5/dist-packages/rpy2/rinterface/__init__.py:145: RRuntimeWarning: Error: package or namespace load failed for ‘stats’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/usr/local/lib/R/library/stats/libs/stats.so':
libRlapack.so: cannot open shared object file: No such file or directory
warnings.warn(x, RRuntimeWarning)
/usr/local/lib/python3.5/dist-packages/rpy2/rinterface/__init__.py:145: RRuntimeWarning: During startup -
warnings.warn(x, RRuntimeWarning)
/usr/local/lib/python3.5/dist-packages/rpy2/rinterface/__init__.py:145: RRuntimeWarning: Warning message:
warnings.warn(x, RRuntimeWarning)
/usr/local/lib/python3.5/dist-packages/rpy2/rinterface/__init__.py:145: RRuntimeWarning: package ‘stats’ in options("defaultPackages") was not found
warnings.warn(x, RRuntimeWarning)
/usr/local/lib/python3.5/dist-packages/rpy2/robjects/pandas2ri.py:190: FutureWarning: from_items is deprecated. Please use DataFrame.from_dict(dict(items), ...) instead. DataFrame.from_dict(OrderedDict(items)) may be used to preserve the key order.
res = PandasDataFrame.from_items(items)
Traceback (most recent call last):
File "/home/rstudio/OpenPedCan-analysis/analyses/tp53_nf1_score/01-apply-classifier.py", line 67, in <module>
exprs_df = pandas2ri.ri2py(exprs_rds)
File "/usr/lib/python3.5/functools.py", line 745, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
File "/usr/local/lib/python3.5/dist-packages/rpy2/robjects/pandas2ri.py", line 190, in ri2py_dataframe
res = PandasDataFrame.from_items(items)
File "/usr/local/lib/python3.5/dist-packages/pandas/core/frame.py", line 1883, in from_items
keys, values = zip(*items)
ValueError: not enough values to unpack (expected 2, got 0)
For step 1 of this ticket:
I subset the v11 file consensus_wgs_plus_cnvkit_wxs.tsv.gz to the required cohorts using the histologies.tsv file.
I further filtered it on pathology_diagnosis for Neuroblastoma and related terms:
"Neuroblastoma"
"Ganglioneuroblastoma, intermixed"
"Ganglioneuroblastoma"
"Ganglioneuroma, maturing subtype OR Ganglioneuroblastoma, well differentiated"
The data were then filtered for the gene MYCN.
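The filtering steps above can be sketched roughly as follows. This is only an illustration, not the code that was actually run: the column names (`Kids_First_Biospecimen_ID`, `cohort`, `pathology_diagnosis`, `biospecimen_id`, `gene_symbol`), the helper names, the cohort label `COHORT_A`, and the tiny in-line tables are all assumptions made for the example. It uses only the standard library so it runs anywhere.

```python
import csv
import io

# Diagnosis terms quoted in the comment above.
NBL_DIAGNOSES = {
    "Neuroblastoma",
    "Ganglioneuroblastoma, intermixed",
    "Ganglioneuroblastoma",
    "Ganglioneuroma, maturing subtype OR Ganglioneuroblastoma, well differentiated",
}


def read_tsv(text):
    """Parse TSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def filter_mycn_calls(cnv_rows, histology_rows, cohorts, gene="MYCN"):
    """Keep CNV rows for `gene` in samples from `cohorts` with an NBL-related diagnosis.

    All column names used here are assumed, not confirmed from the real files.
    """
    keep_ids = {
        row["Kids_First_Biospecimen_ID"]
        for row in histology_rows
        if row["cohort"] in cohorts and row["pathology_diagnosis"] in NBL_DIAGNOSES
    }
    return [
        row
        for row in cnv_rows
        if row["biospecimen_id"] in keep_ids and row["gene_symbol"] == gene
    ]


# Made-up rows for illustration only; not real data.
histologies = read_tsv(
    "Kids_First_Biospecimen_ID\tcohort\tpathology_diagnosis\n"
    "BS_1\tCOHORT_A\tNeuroblastoma\n"
    "BS_2\tCOHORT_A\tEpendymoma\n"
    "BS_3\tCOHORT_B\tGanglioneuroblastoma\n"
)
cnv = read_tsv(
    "biospecimen_id\tgene_symbol\tstatus\n"
    "BS_1\tMYCN\tamplification\n"
    "BS_1\tTP53\tloss\n"
    "BS_2\tMYCN\tgain\n"
    "BS_3\tMYCN\tamplification\n"
)

mycn_calls = filter_mycn_calls(cnv, histologies, cohorts={"COHORT_A"})
print(mycn_calls)
```

With the real files, the same function could be fed rows read via `gzip.open(path, "rt")` and `csv.DictReader`, assuming the columns match.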
The next part was to get the corresponding information from the PR. The files used were cnv_consensus.tsv and consensus_seg_annotated_cn_autosomes.tsv.gz. These PR files were joined with the histologies.tsv file to get the pathology_diagnosis information. The PR data files went through the same filtering process as the v11 files.
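One possible way to tabulate how the v11 and PR calls relate, once both have been filtered, is sketched below. It is not necessarily how the plots were produced: the function name `compare_calls`, the mapping of biospecimen ID to a status string, and the example values are all hypothetical.

```python
def compare_calls(v11_calls, pr_calls):
    """Compare per-biospecimen call status between two call sets.

    Each argument maps a biospecimen ID to a status string (assumed shape).
    """
    v11_ids, pr_ids = set(v11_calls), set(pr_calls)
    common = sorted(v11_ids & pr_ids)
    return {
        # Biospecimens in both sets whose call is identical.
        "same": [b for b in common if v11_calls[b] == pr_calls[b]],
        # Biospecimens in both sets whose call differs.
        "changed": [b for b in common if v11_calls[b] != pr_calls[b]],
        # Biospecimens with a call in only one of the two sets.
        "only_v11": sorted(v11_ids - pr_ids),
        "only_pr": sorted(pr_ids - v11_ids),
    }


# Made-up statuses for illustration only; not real results.
v11 = {"BS_1": "amplification", "BS_2": "gain", "BS_3": "loss"}
pr = {"BS_1": "amplification", "BS_2": "amplification", "BS_4": "gain"}

summary = compare_calls(v11, pr)
print(summary)
```

The `same` and `changed` lists cover the biospecimens common to both files, while `only_v11` and `only_pr` flag calls that are missing or new in one of them.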
The following plots were obtained for biospecimens that were common to the v11 and PR files.
This can be closed following exploration by @kelseykeith and @adilahiri showing that the updated consensus calls using GATK CNV instead of Manta SV do not change CN calls of oncogenes or molecular subtypes for NBL, HGG, LGG, and embryonal tumors.
What analysis module should be updated and why?
We would like to determine what the differences are between using MantaSV and GATK CNV in the CNV consensus module. @sickler-alex has created two stacked PRs:
What changes need to be made? Please provide enough detail for another participant to make the update.
Some ideas for exploring the differences:
consensus_wgs_plus_cnvkit_wxs.tsv.gz with cohorts mentioned above and compare results to those in 207. How much is the same, how much is new using GATK CNV, and is there anything now missing?
What input data should be used? Which data were used in the version being updated?
From 207 + 235:
Otherwise, use v11 OpenPedCan
When do you expect the revised analysis will be completed?
2 weeks
Who will complete the updated analysis?
@adilahiri