Closed kweav closed 2 weeks ago
Shorter description of problem: Working on making the software robust to different data inputs: If there are replicates for the plasmid column, when looking at the distribution of log2 CPM values, would you want to find the filter based on the low log2 CPM looking at all pgRNA and replicates values at the same time or by looking at only averages of the pgRNA replicates?
From Alice in the meeting notes:
I think I would look at all pgRNAs at the replicate level
@cansavvy This no longer utilizes an average for finding the filter if multiple columns are selected. As discussed in our meeting, it pools the replicates, finds the cutoff, and then checks to see if any replicate is below that cutoff for each construct.
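As a rough illustration of the pooled approach described above, here is a minimal sketch in base R. It is not the package's implementation: the helper `lower_outlier_cutoff()`, the toy matrix, and the choice of Q1 - 1.5 * IQR as the "lower outlier" cutoff are all assumptions made for the example.

```r
# Hypothetical helper: lower-outlier cutoff defined as Q1 - 1.5 * IQR.
# (This definition is an assumption for illustration.)
lower_outlier_cutoff <- function(x) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  q[1] - 1.5 * (q[2] - q[1])
}

# Toy log2 CPM values: rows are pgRNA constructs, columns are plasmid replicates.
plasmid_log2_cpm <- cbind(
  rep1 = c(5.0, 5.2, 4.9, 5.1, 5.3, 5.0, 1.0, 5.1),
  rep2 = c(5.1, 5.0, 5.2, 4.8, 5.2, 5.1, 5.0, 4.9)
)

# Pool every replicate value into one vector and derive a single cutoff from it.
cutoff <- lower_outlier_cutoff(as.vector(plasmid_log2_cpm))

# Flag a construct if any of its replicates falls below the pooled cutoff.
plasmid_cpm_filter <- apply(plasmid_log2_cpm < cutoff, 1, any)
```

Because the cutoff comes from the pooled values and the check is per replicate, a construct with one very low replicate is flagged even when its replicate average would not look extreme.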
I also added some `@importFrom`s and reran `document` and `load_all`. Not sure if all my `importFrom` statements are necessary, so let me know if I overdid adding them, please.
After I merged PR #34 (which was branch `filter_qc_ki2`) and changed the base of this PR from `filter_qc_ki2` to main, the `pkgdown` check failed. I haven't been able to figure it out yet @cansavvy
In the logs, this is what I'm zeroing in on. I think we need to add the `janitor` package to the DESCRIPTION file. We probably didn't use it before, but `clean_names()` is a handy function you are using here, so it's worth adding!
I'll add it now!
All set! Just a stray comma and a missing dependency.
You decide about the colnames thing but then we can merge this!
> In the logs, this is what I'm zeroing in on. I think we need to add the `janitor` package to the DESCRIPTION file. We probably didn't use it before, but `clean_names()` is a handy function you are using here, so it's worth adding!
> I'll add it now!
Where in the logs was that? I had looked at this log and didn't see anything about `janitor`, but instead:
✖ Failed to build gimap 0.1.0 (1.7s)
Error:
! error in pak subprocess
Caused by error in `stop_task_build(state, worker)`:
! Failed to build source package gimap.
---
Backtrace:
1. pak::lockfile_install(".github/pkg.lock")
2. pak:::remote(function(...) { …
3. err$throw(res$error)
---
This was the failed action log I was looking at -- https://github.com/FredHutch/gimap/actions/runs/9702648426/job/26779304887
I think I could have debugged the `janitor` error message had I seen it, which is why I'm asking where I should have been looking, since clearly I didn't look in the right place. I've even been rereading this log this morning and still don't see it.
> All set! Just a stray comma and a missing dependency.

Sorry about the dependency issue. I ran `document()` but didn't run `usethis::use_package()` :(
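For context, `usethis::use_package("janitor")` is the call that records a dependency under `Imports` in DESCRIPTION. One way to catch a missing entry before CI does is a small check like the sketch below. The helper `undeclared_packages()` and the toy DESCRIPTION contents are hypothetical and only illustrate the idea; they do not reflect gimap's actual DESCRIPTION file.

```r
# Hypothetical helper: report which packages are not declared under
# Imports or Depends in a DESCRIPTION file.
undeclared_packages <- function(description_path, pkgs) {
  desc <- read.dcf(description_path, fields = c("Imports", "Depends"))
  fields <- desc[1, ]
  fields <- fields[!is.na(fields)]
  declared <- unlist(strsplit(paste(fields, collapse = ","), ","))
  # Drop version constraints such as "(>= 1.0.0)" and surrounding whitespace.
  declared <- trimws(sub("\\(.*\\)", "", declared))
  setdiff(pkgs, declared)
}

# Toy DESCRIPTION written to a temporary file for illustration only.
desc_file <- tempfile()
writeLines(c(
  "Package: examplepkg",
  "Version: 0.1.0",
  "Imports:",
  "    dplyr,",
  "    ggplot2 (>= 3.0.0)"
), desc_file)

# Packages used in the code that should appear in DESCRIPTION.
missing_pkgs <- undeclared_packages(desc_file, c("dplyr", "janitor"))
```

Any package name returned by the helper would be a candidate for `usethis::use_package()`.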
> You decide about the colnames thing but then we can merge this!

For the `colnames` thing, that is how I tend to set colnames in general, so if consistency is good, I think I'll leave it if that's ok.
I should have looked at this log -- https://github.com/FredHutch/gimap/actions/runs/9702648423/job/26778978751, for an R-CMD-check like macOS, not pkgdown!
:warning: This is a stacked PR based on PR #34 (which is based on PR #33). :warning:
Specifically this PR is focused on implementing changes after code review for PR #33.
Given code review for #33, we wanted to start incorporating parameters/arguments for more robust sample/column selection during QC visualization and filtering.
I am envisioning 3 parameters here.
- `filter_zerocount_target_col`
- `filter_replicates_target_col`
- `filter_plasmid_target_col`

Changes made in this PR
- `lapply`, it's just `plasmid_cpm_filter <- plasmid_data$plasmid_log2_cpm < cutoff`.
- `01-qc.R` in the function definition and documentation/header.
- `.Rmd` file within `01-qc.R` and add them as `params` within the template `.Rmd` header

Focusing on the log2 CPM plasmid filter and visualizations ...
- `filter_plasmid_target_col` parameter to the relevant visualization function in `qc-plots.R`. Because I wanted this to be robust enough to select not just any single column but also multiple columns when there are plasmid replicates, I added a check: if more than one column is selected, we create one large vector to visualize all of the pooled values and find a suggested cutoff from that.
- `filter_plasmid_target_col` parameter to the relevant filter function in `02-filter.R`. Because I wanted this to be robust enough to select not just any single column but also multiple columns when there are plasmid replicates, I added a check: if more than one column is selected, we create an average of those columns and apply the filter by checking the cutoff value against those averages.

As I'm writing this up, I realize that means the visualization method and the actual filter use two different cutoff values if more than one plasmid column is selected. One cutoff is based on the lower outlier of the pooled values and the other is based on the lower outlier of the averaged values (respectively). This definitely needs to be fixed. What would you suggest I use? It could be one of the following or something else I haven't thought of:
Requested Review
Is this how you were envisioning incorporating a parameter, and should I proceed with the other two, or are there changes/differences you'd like in the overall concept?
Thank you!!