FredHutch / gimap

Genetic Interaction MAPping for dual target CRISPR screens
https://fredhutch.github.io/gimap/

update QC report with two graphs #33

Closed kweav closed 3 months ago

kweav commented 3 months ago

This PR adds two graphs to the QC report as requested in the May 20th meeting.

  1. Histogram on all data (no filters applied) that shows the distribution of the variance within replicates of each pgRNA.
  2. Bar plot focusing on pgRNAs flagged by the zero count filter (working with subsets of data/filtered data). The x-axis reports the number of replicates (0, 1, 2, or 3) that have a zero count, and the y-axis reports the number of pgRNAs in each of those groups. (In my scratch code this was labeled "How many day 22 pgRNAs have counts of 0 across x number of replicates".)
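
For reference, a minimal base R sketch of the two quantities behind these graphs. The toy data frame, its column names, and the variable names are assumptions for illustration only and are not taken from the gimap code. In the PR the second plot is restricted to pgRNAs flagged by the zero count filter; here the tally runs over the whole toy table.

```r
# Toy counts: one row per pgRNA, one column per replicate (hypothetical names).
counts <- data.frame(
  pgRNA_id = c("pg1", "pg2", "pg3", "pg4"),
  rep_A = c(10, 0, 5, 0),
  rep_B = c(12, 0, 0, 0),
  rep_C = c(11, 3, 7, 0)
)
rep_cols <- c("rep_A", "rep_B", "rep_C")

# Graph 1: sample variance across the replicates of each pgRNA.
rep_var <- apply(counts[, rep_cols], 1, var)

# Graph 2: for each pgRNA, how many replicates have a zero count,
# then how many pgRNAs fall into each group (0, 1, 2, or 3 replicates).
n_zero <- rowSums(counts[, rep_cols] == 0)
zero_tally <- table(factor(n_zero, levels = 0:length(rep_cols)))

# Only draw when run interactively so the sketch also works in scripts.
if (interactive()) {
  hist(rep_var, main = "Variance within replicates", xlab = "Variance")
  barplot(zero_tally,
          xlab = "Number of replicates with a zero count",
          ylab = "Number of pgRNAs")
}
```

Using `factor()` with explicit levels keeps groups with no pgRNAs on the x-axis instead of silently dropping them.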

Changes made in this PR to accomplish this goal:

  1. Added two functions to the R/qc-plots.R file: one for the histogram (qc_variance_hist()) and one for the bar plot (qc_constructs_countzero_bar()). Both use the gimap_dataset and assume that replicates are stored in columns 3-5.
  2. Called these functions within the inst/rmd/gimapQCTemplate.Rmd file (and added some appropriate headers)
  3. Added an R/filters.R file where possible filter functions will be stored. Currently the only filter in there is the zero count filter (qc_filter_zerocounts())
  4. Edited the vignette to include descriptions of these new plots
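
To make the "columns 3-5" assumption concrete, here is a rough sketch of a zero count flag that selects replicates by position. `flag_zero_counts()` and the toy table are hypothetical stand-ins, not the actual qc_filter_zerocounts() or gimap_dataset, and the rule used here (flag a pgRNA if any replicate is zero) is an assumption rather than the package's confirmed logic.

```r
# Hypothetical stand-in for the counts table; columns 3-5 hold the replicates.
counts <- data.frame(
  pgRNA_id = c("pg1", "pg2", "pg3"),
  plasmid = c(100, 80, 90),
  rep_A = c(10, 0, 5),
  rep_B = c(12, 0, 0),
  rep_C = c(11, 3, 7)
)

# Hypothetical helper: TRUE for pgRNAs with a zero count in any replicate.
# Selecting by position (3:5) breaks if the column order ever changes.
flag_zero_counts <- function(df, rep_idx = 3:5) {
  rowSums(df[, rep_idx, drop = FALSE] == 0) > 0
}

flagged <- flag_zero_counts(counts)
flagged_ids <- counts$pgRNA_id[flagged]
```

With this toy table, `flagged_ids` holds the pgRNAs that would be passed on to the bar plot.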

Open issues: You may notice that I removed the three dots for the heatmap plot, both when calling the function and within the function. When I tested my code locally and rendered the vignette (which in turn drives rendering of the QC report), the heatmap was rendered within the vignette rather than in the output QC report. I was trying to troubleshoot that, but it's still an open issue that I wasn't able to resolve.

Next steps for upcoming stacked PRs:

Requested Review:

kweav commented 3 months ago

First off, thanks for your great PR description. Really appreciate the rundown.

I think overall this looks great. ❤️

Thanks!

For me to give feedback on the requested review questions (also great, love this ❤️), can you let me know what example code I should run to test this? i.e. what code have you been running while you developed this? Starting with what you've been using to develop it will help guide me through what to review.

The example code that I ran was rendering the vignette getting_started.Rmd

Requested Review:

  • Is the code for the new plots robust enough?

Looks good on a first pass; the main question is the assumption that replicates are in columns 3-5.

That's a question I have as well, sorry if I forgot to notate it.

  • How to fix the heatmap rendering in the wrong location

I'll dig into this a bit on the next round.

Thanks!

  • Is the new R/filters.R file an ok addition, or would you rather I put those functions elsewhere?

Can we add these to the 02-filter.R file?

Definitely can put them there!

cansavvy commented 3 months ago

@kweav I think all we need to do to merge this is:

  1. Switch out that line of code for the tidyverse version you have
  2. Add an argument that allows us to specify which columns to filter by. The default should be to use all columns (see the discussion we had above to jog your memory if needed - I need to remind myself 😄 )
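
A rough sketch of what point 2 could look like, in base R to keep it self-contained. The function name `flag_zero_counts()`, the `filter_cols` argument, and the toy matrix are all hypothetical; the real signature is up to the follow-up PR.

```r
# Hypothetical helper: flag pgRNAs with a zero count in any selected column.
# filter_cols = NULL (the default) means "use all columns".
flag_zero_counts <- function(counts, filter_cols = NULL) {
  if (is.null(filter_cols)) {
    filter_cols <- colnames(counts)
  }
  missing_cols <- setdiff(filter_cols, colnames(counts))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  rowSums(counts[, filter_cols, drop = FALSE] == 0) > 0
}

# Toy counts matrix: rows are pgRNAs, columns are replicates.
counts <- matrix(
  c(10, 0, 5,
    12, 4, 0,
    11, 3, 7),
  ncol = 3,
  dimnames = list(c("pg1", "pg2", "pg3"), c("rep_A", "rep_B", "rep_C"))
)

flag_all <- flag_zero_counts(counts)
flag_subset <- flag_zero_counts(counts, filter_cols = c("rep_B", "rep_C"))
```

Selecting columns by name rather than by position also sidesteps the hard-coded columns 3-5 question.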

kweav commented 3 months ago

R/filters.R

  1. I can push the tidyverse way to this branch
  2. Will be in a later PR. I added an example one in PR #35 ("Start with adding parameters that allow the user to select which column(s) are used in various filtering or filtering-related visualization steps")

Used commit edb662a to push change 1

kweav commented 3 months ago

@cansavvy pkgdown is happy now!

cansavvy commented 3 months ago

@cansavvy pkgdown is happy now!

YAY!!!! 🎉