Closed kelseykeith closed 1 year ago
Data wrangling code will need to be updated once https://github.com/PediatricOpenTargets/ticket-tracker/issues/437 is completed and takes care of the duplicated CNV calls. That ticket will also dictate which commit of OpenPedCan-analysis the CNV code will need to be developed under.
Closing this ticket with the addition of draft code added in https://github.com/PediatricOpenTargets/OpenPedCan-api/commit/6aa2ddad65432f187e8b498c101c1251d648f951. There is also an example file attached that can be run with the data from OpenPedCan-analysis to produce an example plot for the gene GPC2 in Neuroblastoma. (The code file is attached as a markdown file because GitHub doesn't support uploading Rmd files in comments. A knitted PDF version is also attached for viewing the results.)
Draft plots for CNV visualizations to be added to the API were made, but the code needs to be transformed from development code into final scripts that can make a plot for any cancer and gene, rather than only the specific genes selected for demoing, as mentioned in PediatricOpenTargets/ticket-track #288. Development will start with the simplest plot proposed, the one for the evidence page view, to reduce complexity while learning how to integrate plots into the API. Example of the proposed plot:
The following two scripts need to be created, based off of the modified demo code:
A data wrangling script that combines consensus_wgs_plus_cnvkit_wxs.tsv, plus any necessary additional metadata files like histologies.tsv, into a single CSV that can be used to make a plot for all combinations of cancer_group/EFO_ID and Ensembl ID ensg_id.
These scripts will most likely need to be modified as work to add this plot to the API continues.
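As a rough illustration of the wrangling step described above, here is a minimal sketch in Python using only the standard library. It is not the project's code (the draft code referenced in this ticket is R). The column names (`biospecimen_id`, `status`, `ensembl`, `Kids_First_Biospecimen_ID`, `cancer_group`), the function names, and the toy rows are all assumptions made for the example. EFO_ID mapping and handling of the duplicated CNV calls are left out.

```python
import csv
import io
from collections import Counter

# Toy stand-ins for the real files; all IDs and column names are hypothetical.
CNV_TSV = (
    "biospecimen_id\tstatus\tensembl\n"
    "BS_1\tgain\tENSG_TOY_A\n"
    "BS_2\tgain\tENSG_TOY_A\n"
    "BS_3\tloss\tENSG_TOY_A\n"
    "BS_4\tgain\tENSG_TOY_A\n"  # no histology row, so it is dropped
)
HIST_TSV = (
    "Kids_First_Biospecimen_ID\tcancer_group\n"
    "BS_1\tNeuroblastoma\n"
    "BS_2\tNeuroblastoma\n"
    "BS_3\tNeuroblastoma\n"
)


def wrangle_cnv(cnv_file, histologies_file):
    """Join CNV calls to histologies and count statuses per group and gene."""
    # Map each biospecimen to its cancer group, skipping blank groups.
    group_by_sample = {}
    for row in csv.DictReader(histologies_file, delimiter="\t"):
        if row["cancer_group"]:
            group_by_sample[row["Kids_First_Biospecimen_ID"]] = row["cancer_group"]

    # Count CNV statuses for every (cancer_group, ensg_id) combination.
    counts = Counter()
    for row in csv.DictReader(cnv_file, delimiter="\t"):
        group = group_by_sample.get(row["biospecimen_id"])
        if group is None:
            continue  # sample has no usable histology record
        counts[(group, row["ensembl"], row["status"])] += 1

    return [
        {"cancer_group": g, "ensg_id": e, "status": s, "n": n}
        for (g, e, s), n in sorted(counts.items())
    ]


def write_csv(rows, out_file):
    """Write the summarized rows as a single CSV."""
    fields = ["cancer_group", "ensg_id", "status", "n"]
    writer = csv.DictWriter(out_file, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)


rows = wrangle_cnv(io.StringIO(CNV_TSV), io.StringIO(HIST_TSV))
csv_buffer = io.StringIO()
write_csv(rows, csv_buffer)
```

Pre-aggregating to one row per cancer group, gene, and status keeps the output small enough for a plotting script to filter by cancer_group and ensg_id without re-reading the full CNV file. Whether the final scripts aggregate at this stage or at plot time is a design choice this sketch does not settle.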
A CNV API Design doc will also need to be created to share with stakeholders, based on the document created for the TPM plots.