AlexsLemonade / OpenScPCA-analysis

An open, collaborative project to analyze data from the Single-cell Pediatric Cancer Atlas (ScPCA) Portal

Break out commonly used functions in tumor cell validation #524

Closed allyhawkins closed 1 week ago

allyhawkins commented 2 weeks ago

Purpose/implementation Section

Please link to the GitHub issue that this pull request addresses.

Preparation for #480

What is the goal of this pull request?

While working on #480, I realized that all of the notebooks that validate tumor cells will need to do a few of the same things, so it makes sense to move that code out of the exploratory notebooks and into functions that can be sourced. I also thought it would be helpful to make these changes separately from adding a whole new notebook, so I'm doing this first.

Briefly describe the general approach you took to achieve this goal.

I took the heatmap functions that I created in #500 and added them to a new script that lives in scripts/utils. I also added two functions: one that creates a classification data frame and one that creates a marker gene data frame. The classification data frame has one row per cell and contains the classifications from all of the methods we used. The marker gene data frame is an expanded version with one row per marker gene per cell.
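To illustrate how the two data frames relate, here is a minimal sketch in plain Python. It is not the code from this PR: the function name `expand_to_marker_rows`, the column names (`barcode`, `method_a`, `method_b`), and the gene names and values are all hypothetical placeholders. It only shows the shape change from one row per cell to one row per marker gene per cell.

```python
# Hypothetical classification table: one row per cell, with one column
# per classification method (all names and values here are made up).
classification = [
    {"barcode": "AAAC", "method_a": "tumor", "method_b": "normal"},
    {"barcode": "AAAG", "method_a": "tumor", "method_b": "tumor"},
]

# Hypothetical marker gene expression, keyed by cell barcode.
marker_expression = {
    "AAAC": {"GENE1": 2.5, "GENE2": 0.0},
    "AAAG": {"GENE1": 0.3, "GENE2": 1.7},
}


def expand_to_marker_rows(classification, marker_expression):
    """Return one row per marker gene per cell.

    Each output row keeps all of the cell's classification columns and
    adds the marker gene name and its expression value.
    """
    rows = []
    for cell in classification:
        for gene, expression in marker_expression[cell["barcode"]].items():
            rows.append({**cell, "gene": gene, "expression": expression})
    return rows


marker_rows = expand_to_marker_rows(classification, marker_expression)
```

With two cells and two marker genes, `marker_rows` has four rows, and each row repeats the classifications for its cell.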

I then updated the existing notebook to use these functions and removed the code from that notebook.

I made a few additional modifications based on things I was working on for #480:

If known, do you anticipate filing additional pull requests to complete this analysis module?

Yes

Results

No results right now as this is just reorganizing some existing code.

allyhawkins commented 1 week ago

@jashapiro I believe I've addressed all of your cleanup suggestions. I also updated the notebook to hide the code, so the rendered HTML is much cleaner to read.

I also fixed a few minor spelling errors found in #525.