Please link to the GitHub issue that this pull request addresses.
Preparation for #480
What is the goal of this pull request?
While working on #480, I realized there are a few things we will want to do in all of the notebooks that validate tumor cells, so it makes sense to move that code out of the exploratory notebooks and into functions that can be sourced. I also thought it would be helpful to make these changes separately from adding a whole new notebook, so I'm doing this first.
Briefly describe the general approach you took to achieve this goal.
I took the heatmap functions that I created in #500 and added them to a new script that lives in scripts/utils. I also added two functions: one for creating a classification data frame and one for creating a marker gene data frame. The classification data frame has one row per cell and contains the classifications from all of the methods we used. The marker gene data frame is an expanded version with one row per marker gene per cell.
I then updated the existing notebook to use these functions and removed the corresponding code from that notebook.
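To make the relationship between the two data frames concrete, here is a minimal sketch in plain Python. It is not the module's actual code: the function name, column names, and example values are all hypothetical, and tables are represented as lists of dicts. It expands a table with one row per cell into a table with one row per marker gene per cell.

```python
def expand_marker_genes(classification_rows, marker_expression):
    """Expand per-cell rows into one row per (cell, marker gene).

    classification_rows: list of dicts, one per cell, with a "barcode" key
        plus one key per classification method (hypothetical names).
    marker_expression: dict mapping barcode -> {gene: expression value}.
    """
    long_rows = []
    for cell in classification_rows:
        genes = marker_expression.get(cell["barcode"], {})
        for gene, value in genes.items():
            # Copy the cell-level classifications onto every gene row.
            long_rows.append({**cell, "gene": gene, "expression": value})
    return long_rows


# Hypothetical example data: two cells, two marker genes.
classification_rows = [
    {"barcode": "AAAC", "method_a": "tumor", "method_b": "tumor"},
    {"barcode": "AAAG", "method_a": "normal", "method_b": "tumor"},
]
marker_expression = {
    "AAAC": {"GENE1": 2.5, "GENE2": 0.0},
    "AAAG": {"GENE1": 0.1, "GENE2": 1.2},
}

marker_gene_rows = expand_marker_genes(classification_rows, marker_expression)
```

With two cells and two marker genes, the expanded table has four rows, and each row carries the cell's classifications alongside a single gene's value.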
I made a few additional modifications based on things I was working on for #480:
For CopyKAT, I am now including the mean CNV detection for all chromosomes along with the predictions in the predictions output. That way we can create plots with it when we read in the CopyKAT output.
For InferCNV, I am including the scaled_mean_proportion along with the predictions in the predictions output. Again, this is so we can use this information in the plots used for validation.
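The idea behind both modifications, keeping a per-cell score next to each prediction so plots can be built from a single file, can be sketched as follows. This is an illustrative assumption in plain Python rather than the code in this pull request; the function name, the mean_cnv_detection column name, and the example values are hypothetical.

```python
def add_score_column(predictions, scores, column_name):
    """Return a copy of predictions with an extra per-cell score column.

    predictions: list of dicts with "barcode" and "prediction" keys.
    scores: dict mapping barcode -> numeric score.
    Cells without a score get None rather than being dropped.
    """
    return [
        {**row, column_name: scores.get(row["barcode"])}
        for row in predictions
    ]


# Hypothetical example values for three cells; one cell has no score.
predictions = [
    {"barcode": "AAAC", "prediction": "aneuploid"},
    {"barcode": "AAAG", "prediction": "diploid"},
    {"barcode": "AAAT", "prediction": "diploid"},
]
mean_cnv = {"AAAC": 0.12, "AAAG": 0.01}

with_scores = add_score_column(predictions, mean_cnv, "mean_cnv_detection")
```

Filling a missing score with None keeps every cell in the output, so the predictions and the score used for plotting stay aligned row for row.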
If known, do you anticipate filing additional pull requests to complete this analysis module?
Yes
Results
No results right now as this is just reorganizing some existing code.
Author checklists
Analysis module and review
README.md has been updated to reflect code changes in this pull request.
Reproducibility checklist
Dockerfile.
environment.yml file.
renv.lock file.