FredHutch / gimap

This is under development R package for calculating genetic interactions
https://fredhutch.github.io/gimap/
0 stars 0 forks source link

Start with adding parameters that allow the user to select which column(s) are used in various filtering or filtering-related visualization steps #35

Closed kweav closed 2 weeks ago

kweav commented 1 month ago

:warning: This is a stacked PR based on PR #34 (which is based on PR #33). :warning:

Specifically this PR is focused on implementing changes after code review for PR #33.

Given code review for #33, we wanted to start incorporating parameters/arguments for more robust sample/column selection during QC visualization and filtering.

I am envisioning 3 parameters here.

  1. One is for the zero count filter that by default looks at every sample column. (filter_zerocount_target_col)
  2. Another is for the QC graphs (and potential future filter) that focus on replicate columns (visualizing variation within the replicates, or how many replicate samples have a raw count of 0). (filter_replicates_target_col)
  3. The third is for the low log2 CPM plasmid filter and visualizations. This PR focuses on adding, functionalizing, and documenting this parameter. (filter_plasmid_target_col)

Changes made in this PR

Requested Review Is this how you were envisioning incorporating a parameter and should I proceed with the other two, or are there changes/differences you'd like in the overall concept?

Thank you!!

kweav commented 3 weeks ago

Shorter description of problem: Working on making the software robust to different data inputs: If there are replicates for the plasmid column, when looking at the distribution of log2 CPM values, would you want to find the filter based on the low log2 CPM looking at all pgRNA and replicates values at the same time or by looking at only averages of the pgRNA replicates?

From Alice in the meeting notes:

I think I would look at all pgRNAs at the replicate level

kweav commented 3 weeks ago

@cansavvy This no longer utilizes an average for finding the filter if multiple columns are selected. As discussed in our meeting, it pools the replicates, finds the cutoff, and then checks to see if any replicate is below that cutoff for each construct.

I also added some @importFrom's and reran document and load_all. Not sure if all my importFrom statements are necessary, so let me know if I overdid adding them please

kweav commented 2 weeks ago

After I merged PR #34 (which was branch filter_qc_ki2) and changed the base of this PR from filter_qc_ki2 to main, the pkgdown check failed. I haven't been able to figure it out yet @cansavvy

cansavvy commented 2 weeks ago
Screenshot 2024-06-28 at 8 31 32 AM

In the logs, this is what I'm zeroing in on. I think we need to add the janitor package to the DESCRIPTION file. We probably didn't use it before but clean_names() is a handy function you are using here so its worth adding!

I'll add it now!

cansavvy commented 2 weeks ago

All set! Just a stray comma and a missing dependency.

You decide about the colnames thing but then we can merge this!

kweav commented 2 weeks ago
Screenshot 2024-06-28 at 8 31 32 AM

In the logs, this is what I'm zeroing in on. I think we need to add the janitor package to the DESCRIPTION file. We probably didn't use it before but clean_names() is a handy function you are using here so its worth adding!

I'll add it now!

Where in the logs was that? I had looked at this log and didn't see anything about janitor, but instead:

✖ Failed to build gimap 0.1.0 (1.7s)
  Error: 
  ! error in pak subprocess
  Caused by error in `stop_task_build(state, worker)`:
  ! Failed to build source package gimap.
  ---
  Backtrace:
  1. pak::lockfile_install(".github/pkg.lock")
  2. pak:::remote(function(...) { …
  3. err$throw(res$error)
  ---

This was the failed action log I was looking at -- https://github.com/FredHutch/gimap/actions/runs/9702648426/job/26779304887

I think I could have debugged the janitor error message had I seen it, which is why I'm asking where I should have been looking since clearly I didn't look in the right place. I've even been relooking at this log this morning and still don't see it

kweav commented 2 weeks ago

All set! Just a stray comma and a missing dependency.

Sorry about the dependency issue. I ran document() but didn't run usethis::use_package() :(

You decide about the colnames thing but then we can merge this!

For the colnames thing, that is how I tend to set colnames in general, so if consistency is good, I think I'll leave it if that's ok.

kweav commented 2 weeks ago
Screenshot 2024-06-28 at 8 31 32 AM

In the logs, this is what I'm zeroing in on. I think we need to add the janitor package to the DESCRIPTION file. We probably didn't use it before but clean_names() is a handy function you are using here so its worth adding! I'll add it now!

Where in the logs was that? I had looked at this log and didn't see anything about janitor, but instead:

✖ Failed to build gimap 0.1.0 (1.7s)
  Error: 
  ! error in pak subprocess
  Caused by error in `stop_task_build(state, worker)`:
  ! Failed to build source package gimap.
  ---
  Backtrace:
  1. pak::lockfile_install(".github/pkg.lock")
  2. pak:::remote(function(...) { …
  3. err$throw(res$error)
  ---

This was the failed action log I was looking at -- https://github.com/FredHutch/gimap/actions/runs/9702648426/job/26779304887

I think I could have debugged the janitor error message had I seen it, which is why I'm asking where I should have been looking since clearly I didn't look in the right place. I've even been relooking at this log this morning and still don't see it

I should have looked at this log -- https://github.com/FredHutch/gimap/actions/runs/9702648423/job/26778978751, for an R-CMD-check like macOS, not pkgdown!