openproblems-bio / openproblems

Formalizing and benchmarking open problems in single-cell genomics
MIT License
cell2location with max normalisation flexibility #704

Closed vitkl closed 1 year ago

vitkl commented 1 year ago

I think this is necessary to properly analyse data where total UMI count is completely decoupled from biological RNA count (see https://github.com/openproblems-bio/openproblems/issues/589#issuecomment-1325831148).

codecov[bot] commented 1 year ago

Codecov Report

Base: 95.06% // Head: 95.06% // Increases project coverage by +0.00% :tada:

Coverage data is based on head (be78846) compared to base (a796e02). Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #704   +/-   ##
=======================================
  Coverage   95.06%   95.06%
=======================================
  Files         154      154
  Lines        4072     4075    +3
  Branches      206      206
=======================================
+ Hits         3871     3874    +3
  Misses        131      131
  Partials       70       70
```

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `95.06% <100.00%> (+<0.01%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/openproblems-bio/openproblems/pull/704?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...ems/tasks/denoising/datasets/tabula\_muris\_senis.py](https://codecov.io/gh/openproblems-bio/openproblems/pull/704/diff?src=pr&el=tree#diff-b3BlbnByb2JsZW1zL3Rhc2tzL2Rlbm9pc2luZy9kYXRhc2V0cy90YWJ1bGFfbXVyaXNfc2VuaXMucHk=) | `100.00% <ø> (ø)` | |
| [...sks/spatial\_decomposition/methods/cell2location.py](https://codecov.io/gh/openproblems-bio/openproblems/pull/704/diff?src=pr&el=tree#diff-b3BlbnByb2JsZW1zL3Rhc2tzL3NwYXRpYWxfZGVjb21wb3NpdGlvbi9tZXRob2RzL2NlbGwybG9jYXRpb24ucHk=) | `96.77% <100.00%> (+0.16%)` | :arrow_up: |


vitkl commented 1 year ago

Would be great if you add this @scottgigante-immunai

vitkl commented 1 year ago

We normally don't recommend using fully flexible normalisation (detection_alpha=1) because it completely removes the relationship between estimated cell abundance and the number of cells you see in the image. I analysed a number of datasets, such as https://www.biorxiv.org/content/10.1101/2021.11.26.470108v1, where over-normalisation leads to the spurious mapping of a subset of cell types to low UMI count regions.

(two images attached)

However, the simulations in this project are designed to require fully flexible normalisation.
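To make the trade-off above concrete, here is a minimal sketch of why a small `detection_alpha` means "fully flexible" normalisation. It assumes that the per-location detection factor has a Gamma prior with shape `detection_alpha` and mean 1, which is a simplification for illustration rather than the exact cell2location model. The helper names `sample_detection_factors` and `prior_spread` are hypothetical and not part of any library.

```python
import random
import statistics


def sample_detection_factors(detection_alpha, n_locations=10_000, seed=0):
    """Draw per-location scaling factors from a Gamma prior with mean 1.

    Assumption for illustration: shape = detection_alpha and
    scale = 1 / detection_alpha, so the mean is 1 and the variance
    is 1 / detection_alpha.
    """
    rng = random.Random(seed)
    scale = 1.0 / detection_alpha
    return [rng.gammavariate(detection_alpha, scale) for _ in range(n_locations)]


def prior_spread(detection_alpha):
    """Standard deviation of the sampled factors (theory: detection_alpha ** -0.5)."""
    return statistics.pstdev(sample_detection_factors(detection_alpha))


if __name__ == "__main__":
    # Smaller alpha -> wider prior -> each location may be rescaled more freely.
    for alpha in (1, 20, 200):
        print(f"detection_alpha={alpha}: prior sd ~ {prior_spread(alpha):.3f}")
```

Under this assumption, `detection_alpha=1` lets the scaling factor of a location vary by roughly 100% of its mean, so total UMI count no longer constrains the estimated abundance, whereas a large value keeps the factor close to 1 across locations.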

scottgigante-immunai commented 1 year ago

@vitkl looks like you haven't added it to methods/__init__.py, so this new method is not run in tests.
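As a rough illustration of why the missing import matters, the sketch below mimics a package namespace from which tests collect methods. The discovery rule (every public callable in the `methods` namespace) and both function names are assumptions made for this example; the actual openproblems test harness may work differently.

```python
import types


def cell2location_detection_alpha_20(adata):
    """Hypothetical existing method variant (placeholder body)."""
    return adata


def cell2location_detection_alpha_1(adata):
    """Hypothetical new variant added by the PR (placeholder body)."""
    return adata


def discover(module):
    """Return the sorted names of public callables exposed by a module."""
    return sorted(
        name
        for name, obj in vars(module).items()
        if callable(obj) and not name.startswith("_")
    )


# Stand-in for methods/__init__.py that re-exports only the old variant.
methods = types.ModuleType("methods")
methods.cell2location_detection_alpha_20 = cell2location_detection_alpha_20
before = discover(methods)

# Equivalent of adding the missing import line to methods/__init__.py.
methods.cell2location_detection_alpha_1 = cell2location_detection_alpha_1
after = discover(methods)

print(before)
print(after)
```

Until the new function is exposed in the package namespace, a collector of this kind never sees it, so it is neither run nor counted by the tests.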

scottgigante-immunai commented 1 year ago

Also please open PRs as drafts until tests pass.

- Nextflow test pipeline is passing on this base branch of this pull request (include link to passed test on NF Tower found in GitHub Actions summary: )
- If this pull request is not ready for review (including passing the Nextflow test pipeline), I will open this PR as a draft (click on the down arrow next to the "Create Pull Request" button)

vitkl commented 1 year ago

@scottgigante-immunai I don't understand why the tests fail (https://github.com/vitkl/openproblems/actions/runs/3556951870/jobs/5974765854), hence I didn't create these as draft PRs.

scottgigante-immunai commented 1 year ago

That one was my fault. Should be fixed now.

scottgigante-immunai commented 1 year ago

Tests passing at https://tower.nf/orgs/openproblems-bio/workspaces/openproblems-bio/watch/3reAegSoEpYtFz !

vitkl commented 1 year ago

The tests seem to pass on my branch https://github.com/vitkl/openproblems/actions/runs/3560607225

scottgigante-immunai commented 1 year ago

Thanks @vitkl !