openproblems-bio / openproblems

Formalizing and benchmarking open problems in single-cell genomics
MIT License

Fix DR baselines #816

Closed. scottgigante-immunai closed this 1 year ago

scottgigante-immunai commented 1 year ago

This PR fixes various problems with the dimensionality reduction baselines.

scottgigante-immunai commented 1 year ago

Closes https://github.com/openproblems-bio/openproblems/issues/803

codecov[bot] commented 1 year ago

Codecov Report

Base: 95.61% // Head: 95.68% // Increases project coverage by +0.07% :tada:

Coverage data is based on head (366bda8) compared to base (2da81a9). Patch coverage: 98.11% of modified lines in pull request are covered.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #816      +/-   ##
==========================================
+ Coverage   95.61%   95.68%   +0.07%
==========================================
  Files         184      186       +2
  Lines        4949     4984      +35
  Branches      273      271       -2
==========================================
+ Hits         4732     4769      +37
  Misses        138      138
+ Partials       79       77       -2
```

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `95.68% <98.11%> (+0.07%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/openproblems-bio/openproblems/pull/816?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio) | Coverage Δ | |
|---|---|---|
| [...tasks/dimensionality\_reduction/methods/baseline.py](https://codecov.io/gh/openproblems-bio/openproblems/pull/816?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio#diff-b3BlbnByb2JsZW1zL3Rhc2tzL2RpbWVuc2lvbmFsaXR5X3JlZHVjdGlvbi9tZXRob2RzL2Jhc2VsaW5lLnB5) | `89.65% <85.71%> (-2.46%)` | :arrow_down: |
| [...sks/dimensionality\_reduction/datasets/zebrafish.py](https://codecov.io/gh/openproblems-bio/openproblems/pull/816?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio#diff-b3BlbnByb2JsZW1zL3Rhc2tzL2RpbWVuc2lvbmFsaXR5X3JlZHVjdGlvbi9kYXRhc2V0cy96ZWJyYWZpc2gucHk=) | `100.00% <100.00%> (ø)` | |
| [.../dimensionality\_reduction/methods/diffusion\_map.py](https://codecov.io/gh/openproblems-bio/openproblems/pull/816?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio#diff-b3BlbnByb2JsZW1zL3Rhc2tzL2RpbWVuc2lvbmFsaXR5X3JlZHVjdGlvbi9tZXRob2RzL2RpZmZ1c2lvbl9tYXAucHk=) | `100.00% <100.00%> (ø)` | |
| [.../tasks/dimensionality\_reduction/metrics/density.py](https://codecov.io/gh/openproblems-bio/openproblems/pull/816?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio#diff-b3BlbnByb2JsZW1zL3Rhc2tzL2RpbWVuc2lvbmFsaXR5X3JlZHVjdGlvbi9tZXRyaWNzL2RlbnNpdHkucHk=) | `100.00% <100.00%> (+4.65%)` | :arrow_up: |
| [...ionality\_reduction/metrics/distance\_correlation.py](https://codecov.io/gh/openproblems-bio/openproblems/pull/816?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio#diff-b3BlbnByb2JsZW1zL3Rhc2tzL2RpbWVuc2lvbmFsaXR5X3JlZHVjdGlvbi9tZXRyaWNzL2Rpc3RhbmNlX2NvcnJlbGF0aW9uLnB5) | `100.00% <100.00%> (ø)` | |
| [test/test\_task\_dimensionality\_reduction.py](https://codecov.io/gh/openproblems-bio/openproblems/pull/816?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio#diff-dGVzdC90ZXN0X3Rhc2tfZGltZW5zaW9uYWxpdHlfcmVkdWN0aW9uLnB5) | `100.00% <100.00%> (ø)` | |

Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio)

scottgigante-immunai commented 1 year ago

Sure! TL;DR: diffusion distances are graph distances that are robust to noise. A graph shortest-path distance is sensitive to a single noisy point short-circuiting the path between two otherwise distant points on the manifold. Diffusion distances instead use the probability of traversing the graph from one point to another via a random walk, which integrates over all graph paths rather than only the shortest one.
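To make the contrast concrete, here is a minimal toy sketch, not the code from this PR. It builds a band-shaped graph, adds one noisy shortcut edge between its two ends, and compares the shortest-path distance with a simplified diffusion distance. All names (`band_graph`, `walk_distribution`, `diffusion_distance`) and the parameters `n=100`, `width=3`, `t=4` are illustrative assumptions. The distance here is the plain Euclidean distance between `t`-step random-walk distributions; the usual definition also weights each term by the stationary distribution, which is omitted for brevity.

```python
from collections import deque
from math import sqrt


def band_graph(n, width, shortcut=None):
    """Adjacency sets for n points on a line, each linked to points within `width`."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(max(0, i - width), min(n, i + width + 1)):
            adj[i].add(j)  # j == i is included: the self-loop makes the walk lazy
    if shortcut is not None:
        a, b = shortcut  # a single "noisy" edge joining two distant points
        adj[a].add(b)
        adj[b].add(a)
    return adj


def shortest_path(adj, src, dst):
    """Unweighted shortest-path length (number of hops) via breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    raise ValueError("dst is unreachable from src")


def walk_distribution(adj, start, t):
    """Probability of being at each node after a t-step random walk from `start`."""
    p = [0.0] * len(adj)
    p[start] = 1.0
    for _ in range(t):
        nxt = [0.0] * len(adj)
        for u, mass in enumerate(p):
            if mass:
                share = mass / len(adj[u])  # move to each neighbour with equal probability
                for v in adj[u]:
                    nxt[v] += share
        p = nxt
    return p


def diffusion_distance(adj, a, b, t):
    """Simplified (unweighted) diffusion distance: L2 gap between walk distributions."""
    pa = walk_distribution(adj, a, t)
    pb = walk_distribution(adj, b, t)
    return sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))


n, width, t = 100, 3, 4
clean = band_graph(n, width)
noisy = band_graph(n, width, shortcut=(0, n - 1))

sp_clean = shortest_path(clean, 0, n - 1)
sp_noisy = shortest_path(noisy, 0, n - 1)
dd_clean = diffusion_distance(clean, 0, n - 1, t)
dd_noisy = diffusion_distance(noisy, 0, n - 1, t)

print(f"shortest path: {sp_clean} -> {sp_noisy} hops")
print(f"diffusion distance: {dd_clean:.3f} -> {dd_noisy:.3f}")
```

In this toy graph the shortcut collapses the shortest-path distance between the two ends from 33 hops to 1, while the diffusion distance keeps a larger fraction of its original value, because most random-walk paths from each end still stay on that end's side of the band.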