FredHutch / gimap

Genetic Interaction MAPping for dual target CRISPR screens
https://fredhutch.github.io/gimap/

Workflow Functionalizing #2

Open cansavvy opened 8 months ago

cansavvy commented 8 months ago
### Issue Description

First we need to functionalize all of the steps from https://github.com/FredHutch/GI_mapping/tree/main/workflow/scripts

- [ ] [pgRNA_counts_QC.Rmd](https://github.com/FredHutch/GI_mapping/blob/main/workflow/scripts/pgRNA_counts_QC.Rmd) #3
- [ ] [filter_and_calculate_LFC.Rmd](https://github.com/FredHutch/GI_mapping/blob/main/workflow/scripts/filter_and_calculate_LFC.Rmd) #4
- [ ] [get_pgRNA_annotations.Rmd](https://github.com/FredHutch/GI_mapping/blob/main/workflow/scripts/get_pgRNA_annotations.Rmd) #5
- [ ] [calculate_GI_scores.Rmd](https://github.com/FredHutch/GI_mapping/blob/main/workflow/scripts/calculate_GI_scores.Rmd) #6

## Input data

We will build the pipeline to work on this dataset first: [PP_pgPEN_HeLa_counts.txt](https://github.com/FredHutch/gimap/files/13891924/PP_pgPEN_HeLa_counts.txt)

Note that each later step in the pipeline needs a version of this data that has already been processed by the steps before it, so we will need to work on these functions sequentially.

## General approach

- Take each Rmd linked above and try to make functions that automate as many of the steps as possible.
- Add software dependency items to the Docker image as they are needed.

## Additional goals as we are functionalizing

- Minimize software dependencies as much as possible.
- Look for places where customizations and options may be needed. This includes but is not limited to:
  - parameter changes
  - different data structures
  - different annotations needed
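
To illustrate the approach described above, here is a minimal sketch of what "functionalizing" sequential steps with exposed options could look like. It is not taken from the Rmd scripts: the function names (`filter_low_counts`, `calculate_lfc`, `run_pipeline`), the `baseline`/`late` keys, the `min_count` and `pseudocount` defaults, and the toy counts are all hypothetical. It is written in Python purely for illustration; the real functions would presumably live in R alongside the existing scripts.

```python
import math


def filter_low_counts(counts, min_count=10):
    """Keep only rows whose baseline count is at least min_count.

    `counts` maps a construct id to a dict of raw counts (hypothetical layout).
    """
    return {
        guide: row for guide, row in counts.items() if row["baseline"] >= min_count
    }


def calculate_lfc(counts, pseudocount=1):
    """Return the log2 fold change of late vs. baseline counts per row."""
    return {
        guide: math.log2(
            (row["late"] + pseudocount) / (row["baseline"] + pseudocount)
        )
        for guide, row in counts.items()
    }


def run_pipeline(counts, min_count=10, pseudocount=1):
    """Chain the steps in order; each step consumes the previous step's output."""
    filtered = filter_low_counts(counts, min_count=min_count)
    return calculate_lfc(filtered, pseudocount=pseudocount)


# Toy data, invented for this example only.
toy_counts = {
    "pg1": {"baseline": 99, "late": 24},
    "pg2": {"baseline": 3, "late": 7},
    "pg3": {"baseline": 49, "late": 199},
}

lfc = run_pipeline(toy_counts)
```

Because the thresholds are ordinary arguments rather than values hard-coded in a notebook chunk, a caller can change them without editing the step itself, for example `run_pipeline(toy_counts, min_count=0)` to keep every row. Having each step return plain data for the next one also reflects the note above that later steps need input already processed up to that point.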