bailey-lab / MIPTools

A suite of computational tools used for molecular inversion probe design, data processing, and analysis.
https://miptools.readthedocs.io
MIT License

Add a population clustering fraction cutoff argument #39

Closed arisp99 closed 2 years ago

arisp99 commented 2 years ago

When running the wrangler app, the user can choose which MIPWrangler script is run to correct barcodes and to cluster barcodes and MIPs together. Within bin[^1] we provide two related scripts: https://github.com/bailey-lab/MIPTools/blob/4d87dd6a3bcde9c372423563ae53f9c148639951/bin/runMIPWranglerCurrent.sh and https://github.com/bailey-lab/MIPTools/blob/4d87dd6a3bcde9c372423563ae53f9c148639951/bin/runMIPWranglerNoCutoffCurrent.sh. The difference between them is that runMIPWranglerCurrent.sh applies a population clustering fraction cutoff, whereas runMIPWranglerNoCutoffCurrent.sh does not. This parameter is controlled via the --fraccutoff argument. For example, in the case where we don't want a cutoff, we have:

https://github.com/bailey-lab/MIPTools/blob/4d87dd6a3bcde9c372423563ae53f9c148639951/bin/runMIPWranglerNoCutoffCurrent.sh#L10

To reduce the number of scripts the user has to choose from, this PR adds an argument to the wrangler app that sets the value passed to --fraccutoff. The argument defaults to 0.005, which is MIPWrangler's own default. This lets us remove the runMIPWranglerNoCutoffCurrent script and keep a single script.
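As a rough illustration of the idea, and not the code in this PR, a wrapper could accept an optional cutoff and forward it as --fraccutoff. In the sketch below the `-f` option letter and the function names `parse_frac_cutoff` and `frac_cutoff_arg` are hypothetical; only the `--fraccutoff` argument and the 0.005 default come from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: NOT the actual MIPTools wrangler app code.

# Default mirrors the MIPWrangler default cited in this PR.
DEFAULT_FRAC_CUTOFF="0.005"

# Print the cutoff selected by an optional "-f <value>" option.
# The option letter "-f" is an assumption made for this example.
parse_frac_cutoff() {
    local cutoff="$DEFAULT_FRAC_CUTOFF"
    local opt OPTIND=1
    while getopts ":f:" opt; do
        case "$opt" in
            f) cutoff="$OPTARG" ;;
            *) echo "usage: [-f fraction_cutoff]" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$cutoff"
}

# Build the argument that a single wrangler script would forward.
frac_cutoff_arg() {
    local cutoff
    cutoff="$(parse_frac_cutoff "$@")" || return 1
    printf -- '--fraccutoff %s\n' "$cutoff"
}

# Example calls (they only print the argument; MIPWrangler is not invoked):
frac_cutoff_arg           # default cutoff
frac_cutoff_arg -f 0.01   # user-supplied cutoff
```

With a single entry point like this, a user who wants the old no-cutoff behaviour would pass whatever value the NoCutoff script currently supplies, instead of selecting a different script.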

[^1]: These scripts are also present within base_resources, which additionally contains two scripts for running selective whole genome amplification (SWGA).