Pushed the scripts I used to plot clone density curves and clustered-SNV histograms:
Density curves for each clone in a sample to show CCF distribution of SNVs per clone
Histogram of clustered SNVs in a sample to show CCF distribution of SNVs per sample
I have not yet overlaid the density curves on the histogram, because the y-axis ranges of the two are quite different, and I wanted a second opinion before scaling the density (or histogram) values to match.
If we decide to use these, they need to be standardized (e.g. column names from SRC tool's results data table) and integrated.
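On the overlay question, one common option is to put the density curves on the count scale of the histogram by multiplying each clone's density by the number of SNVs in that clone and by the histogram bin width. The sketch below illustrates that idea only; it is not the code in the pushed scripts. It uses the Python standard library, synthetic CCF values, and hypothetical names (`histogram_counts`, `gaussian_kde`, `density_to_count_scale`, `clone_A`, `clone_B`), and the bandwidth and bin count are arbitrary assumptions.

```python
import math
import random


def histogram_counts(values, n_bins, lo=0.0, hi=1.0):
    """Count values per equal-width bin on [lo, hi]; out-of-range values go to the edge bins."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, n_bins - 1))  # clamp so v == hi lands in the last bin
        counts[idx] += 1
    return counts, width


def gaussian_kde(values, grid, bandwidth):
    """Plain Gaussian kernel density estimate on `grid` (area under the curve is about 1)."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]


def density_to_count_scale(density, n_values, bin_width):
    """Rescale a density so its height is comparable to histogram counts."""
    return [d * n_values * bin_width for d in density]


# Synthetic CCF values for two hypothetical clones (assumption, not real data).
random.seed(0)
clones = {
    "clone_A": [random.gauss(0.8, 0.05) for _ in range(300)],
    "clone_B": [random.gauss(0.3, 0.05) for _ in range(100)],
}

# Per-sample histogram over all clustered SNVs.
all_ccf = [v for ccf in clones.values() for v in ccf]
counts, bin_width = histogram_counts(all_ccf, n_bins=20)

# Per-clone density curves, rescaled to the histogram's count axis.
grid_step = 0.005
grid = [i * grid_step for i in range(201)]  # 0.0 .. 1.0
scaled = {
    name: density_to_count_scale(gaussian_kde(ccf, grid, bandwidth=0.03), len(ccf), bin_width)
    for name, ccf in clones.items()
}
```

With this scaling, the area under each clone's curve divided by the bin width is approximately the number of SNVs in that clone, so the curves and the histogram bars can share one y-axis. The alternative is to leave the densities alone and normalize the histogram to unit area instead, which keeps the shapes but drops the counts.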
Checklist
[X] This PR does NOT contain Protected Health Information (PHI). A repo may need to be deleted if such data is uploaded. Disclosing PHI is a major problem[^1] - Even a small leak can be costly[^2].
[X] This PR does NOT contain germline genetic data[^3], RNA-Seq, DNA methylation, microbiome or other molecular data[^4].
[X] This PR does NOT contain other non-plain text files, such as: compressed files, images (e.g. .png, .jpeg), .pdf, .RData, .xlsx, .doc, .ppt, or other output files.
To automatically exclude such files using a .gitignore file, see here for example.
[X] I have set up or verified the main branch protection rule following the github standards before opening this pull request.
[X] The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
[ ] I have added the major changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.
[X] I have read the code review guidelines and the code review best practice on GitHub check-list.

[^1]: UCLA Health reaches $7.5m settlement over 2015 breach of 4.5m patient records
[^2]: The average healthcare data breach costs $2.2 million, despite the majority of breaches releasing fewer than 500 records.
[^3]: Genetic information is considered PHI. Forensic assays can identify patients with as few as 21 SNPs
[^4]: RNA-Seq, DNA methylation, microbiome, or other molecular data can be used to predict genotypes (PHI) and reveal a patient's identity.