Open sjspielman opened 6 days ago
The overall run module shell script will also be modified so that you can indicate whether the module should be run in "benchmark" or "process" (?) mode, where the former will run benchmarking scripts/notebooks, and the latter will run doublet detection on ScPCA libraries.
I would probably recommend keeping this as two separate scripts. You can call both from the test action.
Noting that for multiplexed libraries, the script will need to take multiple samples into consideration: https://bioconductor.org/packages/release/bioc/vignettes/scDblFinder/inst/doc/scDblFinder.html#multiple-samples
Given that we are no good at defining sample of origin for the current multiplexed samples in the portal, I would see what happens if we just run it straight.
From the link you posted:
If you have multiple samples (understood as different cell captures), then it is preferable to look for doublets separately for each sample (for multiplexed samples with cell hashes, this means for each batch).
My interpretation of this is that you should not indicate multiple samples and should not need to do anything differently. I believe a batch here refers to a single library that contains the multiplexed samples. A main source of doublets is when you go through the barcoding process during sequencing prep, which would be done on a library and not sample level. I would have this run on each library that we have.
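As a rough sketch of what "run on each library" could look like from the shell side: the function below only prints one command per library file and passes no samples argument. The R script name, its flags, the output file naming, and the assumption that each library is stored as a `*_processed.rds` file are all hypothetical placeholders, not the module's actual interface.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print one planned command per library file found under a data directory.
# ASSUMPTIONS (hypothetical, not from the module): the script name
# run-scdblfinder.R, its --input_sce_file/--output_file flags, the
# *_processed.rds file pattern, and the *_scdblfinder.tsv output name.
plan_doublet_runs() {
  local data_dir="$1"
  local results_dir="$2"
  local sce_file library_id

  # Each library is handled on its own; no sample information is passed along.
  find "$data_dir" -type f -name '*_processed.rds' | sort | while read -r sce_file; do
    library_id="$(basename "$sce_file" _processed.rds)"
    echo "Rscript run-scdblfinder.R --input_sce_file $sce_file --output_file $results_dir/${library_id}_scdblfinder.tsv"
  done
}
```

Calling `plan_doublet_runs path/to/data path/to/results` lists the planned commands without running anything, which makes it easy to check that multiplexed libraries are treated the same as any other library.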
If you are filing this issue based on a specific GitHub Discussion, please link to the relevant Discussion.
364
Describe the goals of the changes to the analysis module.
Our next step in the doublet-detection module will be to run doublet detection on ScPCA data. Specifically, we have discussed running scDblFinder on ScPCA data with the following approach:
What will your pull request contain?
Two scripts: the R script to run scDblFinder, and the shell script to call it.
The overall run module shell script will also be modified so that you can indicate whether the module should be run in "benchmark" or "process" (?) mode, where the former will run benchmarking scripts/notebooks, and the latter will run doublet detection on ScPCA libraries.
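A minimal sketch of how a mode switch in the run script might look, assuming the mode is passed as the first positional argument and defaults to "process". The function name `run_module`, the default, and the echoed placeholder messages are illustrative assumptions; the real script would call the benchmarking or processing scripts in place of the `echo` lines.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Dispatch on a mode argument. ASSUMPTIONS (hypothetical): the function name,
# the "process" default, and the echo placeholders standing in for the
# actual benchmarking and doublet-detection steps.
run_module() {
  local mode="${1:-process}"
  case "$mode" in
    benchmark)
      echo "running benchmarking scripts/notebooks"
      ;;
    process)
      echo "running doublet detection on ScPCA libraries"
      ;;
    *)
      echo "unknown mode: $mode (expected 'benchmark' or 'process')" >&2
      return 1
      ;;
  esac
}
```

For example, `run_module benchmark` would select the benchmarking branch, and an unrecognized mode fails with a non-zero exit status instead of silently doing nothing.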
Will you require additional software beyond what is already in the analysis module?
There will be no changes in dependencies.
Will you require different computational resources beyond what the analysis module already uses?
There will be no changes in compute - it can still be run on a laptop.
If known, when do you expect to file the pull request?
I expect to start this ~next sprint, which starts July 15th. So, we can expect a PR in probably 3ish weeks.~ towards the end of this sprint. We can expect a PR in 2ish weeks.