theislab / scib

Benchmarking analysis of data integration tools
MIT License

Add documentation for metrics preprocessing #312

Closed mumichae closed 1 year ago

mumichae commented 2 years ago

https://scib.readthedocs.io/en/doc-metrics_input/

codecov[bot] commented 2 years ago

Codecov Report

Merging #312 (8578001) into main (2fe05c7) will increase coverage by 0.62%. The diff coverage is 91.83%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #312      +/-   ##
==========================================
+ Coverage   59.53%   60.16%   +0.62%
==========================================
  Files          39       40       +1
  Lines        2123     2149      +26
==========================================
+ Hits         1264     1293      +29
+ Misses        859      856       -3
```

| Flag | Coverage Δ | |
|---|---|---|
| unittest | `60.16% <91.83%> (+0.62%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/theislab/scib/pull/312?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab) | Coverage Δ | |
|---|---|---|
| [scib/metrics/cell\_cycle.py](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#diff-c2NpYi9tZXRyaWNzL2NlbGxfY3ljbGUucHk=) | `89.36% <ø> (ø)` | |
| [scib/metrics/graph\_connectivity.py](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#diff-c2NpYi9tZXRyaWNzL2dyYXBoX2Nvbm5lY3Rpdml0eS5weQ==) | `92.30% <ø> (ø)` | |
| [scib/metrics/trajectory.py](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#diff-c2NpYi9tZXRyaWNzL3RyYWplY3RvcnkucHk=) | `91.07% <ø> (ø)` | |
| [scib/preprocessing.py](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#diff-c2NpYi9wcmVwcm9jZXNzaW5nLnB5) | `18.95% <ø> (ø)` | |
| [tests/metrics/rpy2/test\_kbet.py](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#diff-dGVzdHMvbWV0cmljcy9ycHkyL3Rlc3Rfa2JldC5weQ==) | `100.00% <ø> (ø)` | |
| [tests/metrics/test\_silhouette\_metrics.py](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#diff-dGVzdHMvbWV0cmljcy90ZXN0X3NpbGhvdWV0dGVfbWV0cmljcy5weQ==) | `100.00% <ø> (ø)` | |
| [scib/metrics/nmi.py](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#diff-c2NpYi9tZXRyaWNzL25taS5weQ==) | `27.53% <66.66%> (+2.16%)` | :arrow_up: |
| [scib/metrics/lisi.py](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#diff-c2NpYi9tZXRyaWNzL2xpc2kucHk=) | `42.13% <90.00%> (+0.31%)` | :arrow_up: |
| [scib/metrics/ari.py](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#diff-c2NpYi9tZXRyaWNzL2FyaS5weQ==) | `92.59% <90.90%> (+0.59%)` | :arrow_up: |
| [scib/metrics/isolated\_labels.py](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab#diff-c2NpYi9tZXRyaWNzL2lzb2xhdGVkX2xhYmVscy5weQ==) | `94.33% <96.00%> (+0.33%)` | :arrow_up: |
| ... and [11 more](https://codecov.io/gh/theislab/scib/pull/312/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theislab) | |

LuckyMD commented 2 years ago

A couple of questions for the image overview for metrics inputs:

  1. How are the upper and lower lists of metrics different?
  2. Isn't clustering performed by the ARI and NMI metrics? Why is it a necessary input?

Overall, the diagram isn't very clear, and it isn't really explained in the text.

mumichae commented 2 years ago
  1. Metrics in the upper half are run directly on the integration output, in contrast to metrics in the lower half, which require preprocessing. E.g. ASW should be used with PCA for feature output or with the integrated embedding for embedding output.
  2. Clustering is not part of the ARI or NMI functions, so it is a necessary preprocessing step.

I'll add some more explanation of the figure to the text.
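
To illustrate the point about clustering being a separate step, here is a minimal sketch in plain Python. It is not scib's implementation, and the function name `adjusted_rand_index` and the toy labels are made up for this example. The function only compares two existing label assignments, so the cluster labels have to be computed beforehand:

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Compare two label assignments; no clustering happens in here.

    Hypothetical helper for illustration only, not part of scib.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both label lists must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        # Fewer than two cells: the partitions trivially agree.
        return 1.0

    # Pairs of cells that share a group in both, the first, or the second labelling.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    pairs_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = pairs_true * pairs_pred / total_pairs
    maximum = (pairs_true + pairs_pred) / 2
    if maximum == expected:
        # Degenerate case, e.g. every cell is in one group in both labellings.
        return 1.0
    return (pairs_both - expected) / (maximum - expected)


# Toy data: cell type annotations and cluster labels from a prior clustering step.
cell_types = ["B", "B", "T", "T"]
clusters = [1, 1, 0, 0]
score = adjusted_rand_index(cell_types, clusters)
```

In this toy case the clusters match the cell types up to renaming, so the score comes out as 1.0. The same reasoning applies to NMI: the metric takes a cluster assignment as input rather than producing one.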