sjspielman opened this pull request 3 weeks ago
We're back! I updated code throughout the notebook in response to reviews, including using a 0.5 threshold for cxds and adding more PCAs. Looking at the PCAs, it actually appeared to me that cxds was capturing a lot of the consensus false-negative droplets, and that those droplets were missed by scDblFinder and scrublet. Therefore, for this first round back to you, I didn't do a re-analysis with just those two methods. Do you still think it's worth doing?
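For context, the cxds step described above amounts to binarizing each droplet's cxds score at the 0.5 threshold and then comparing the three methods' boolean calls per droplet. A minimal sketch of that logic in Python (the module itself is R-based; every name here is hypothetical, not the module's actual code):

```python
# Illustrative sketch: binarize cxds scores at the 0.5 threshold and
# count, per droplet, how many methods flag it as a doublet.
# All names and data are hypothetical; the actual module is written in R.

CXDS_THRESHOLD = 0.5

def cxds_calls(cxds_scores):
    """Convert continuous cxds scores to boolean doublet calls."""
    return [score >= CXDS_THRESHOLD for score in cxds_scores]

def methods_agreeing(cxds, scdblfinder, scrublet):
    """Per droplet, count how many of the three methods call it a doublet."""
    return [sum(calls) for calls in zip(cxds, scdblfinder, scrublet)]

scores = [0.1, 0.7, 0.55, 0.4]
cxds = cxds_calls(scores)                    # [False, True, True, False]
scdbl = [False, True, False, False]
scrub = [False, True, False, True]
print(methods_agreeing(cxds, scdbl, scrub))  # [0, 3, 1, 1]
```

Droplets where only cxds's column is `True` correspond to calls the other two methods missed, which is the pattern described above.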
Edit: notebook for review convenience! 03_compare-benchmark-results.nb.html.zip
The next iteration has finally landed! I've incorporated the conceptual items brought up in review and did some notebook rearrangement accordingly. Note that this could be more modular, since there is some repeated code between the different types of consensus analyses (all 3 methods vs. only 2 methods), but given where we anticipate this module heading overall, I wasn't sure that was worth the effort.
Here's a rendered notebook: 03_compare-benchmark-results.nb.html.zip
One thought I had is that this module is an optional analysis module that can be used to run doublet detection using three different methods, but that's it. So contributors can have it if that's something they feel is necessary for their analysis, but we don't go beyond that.
This seems pretty reasonable to me actually, for the end-goal of the module to be a utility for folks to run these three methods on an SCE, which would include associated results and metadata.
Purpose/implementation Section
Please link to the GitHub issue that this pull request addresses.
446
What is the goal of this pull request?
This PR explores the overlap for each method's doublet calls for each dataset, and also assesses performance of a "consensus caller."
Briefly describe the general approach you took to achieve this goal.
I wrote a single notebook to process all datasets with three main analysis sections, in addition to a conclusions section at the end:
I also updated the overall module run script to render this notebook as the next step.
There are a few other changes here:
A fix in template-notebooks/02_explore-benchmark-results.Rmd, where I was using the wrong variable in some functions

If known, do you anticipate filing additional pull requests to complete this analysis module?
Yep.
Results
What is the name of your results bucket on S3?
researcher-654654257431-us-east-2
What types of results does your code produce (e.g., table, figure)?
There are no additional result files, only the rendered notebook which contains all results from this analysis. I directly committed this notebook to the directory where I saved it in the module. Is this ok, or should I export it to results?
What is your summary of the results?
The three methods' doublet calls show little agreement, so the consensus call sets are small. The consensus calls also do not appear to be the most accurate.
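One way to put a number on "little agreement" is the pairwise overlap (Jaccard index) between two methods' sets of called doublet barcodes. A hypothetical sketch, not the notebook's actual code:

```python
# Illustrative: Jaccard overlap between two methods' doublet-barcode sets.
# Names and data are hypothetical.

def jaccard(set_a, set_b):
    """Intersection over union of two sets; 1.0 means identical call sets."""
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

scdblfinder = {"AAAC", "GGTA", "TTCG"}
scrublet    = {"AAAC", "CCGT"}
print(jaccard(scdblfinder, scrublet))  # 0.25 (1 shared barcode of 4 total)
```

Low pairwise Jaccard values across method pairs are exactly what makes a unanimity-based consensus set small.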
Provide directions for reviewers
What are the software and computational requirements needed to be able to run the code in this PR?
The renv environment needed to render this notebook

Are there particular areas you'd like reviewers to have a close look at?
I suppose I could add more analysis or interpretation to the notebook, but since there really isn't much of a "there there" to these results, I wasn't sure what else would be useful and informative to include. Do you have any ideas?
Is there anything that you want to discuss further?
-
Author checklists
Check all those that apply. Note that you may find it easier to check off these items after the pull request is actually filed.
Analysis module and review
README.md has been updated to reflect code changes in this pull request.

Reproducibility checklist
Dockerfile
environment.yml file
renv.lock file