sissa-data-science / DADApy

Distance-based Analysis of DAta-manifolds in python
https://dadapy.readthedocs.io/
Apache License 2.0

Class imbalance support for return_label_overlap method #114

Closed alexserra98 closed 5 months ago

alexserra98 commented 5 months ago

Proposed changes

I've modified the `.return_label_overlap()` method of `MetricComparisons()` to support class imbalance.

Types of changes

The method now checks for class imbalance and, if it is detected, computes the overlap using a number $k$ of nearest neighbours that varies with the number of elements in each class. By default, $k$ is set to 10% of the class population, but it can be changed with the `k_per_classes` argument.
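To illustrate the idea described above, here is a minimal standalone sketch in NumPy. It is not the code from this PR and does not use DADApy's API: the function name `label_overlap_per_class_k`, the `class_fraction` parameter, and the dictionary form of `k_per_class` are all assumptions made for the example, and the real `k_per_classes` argument may take a different shape.

```python
import numpy as np


def label_overlap_per_class_k(X, labels, class_fraction=0.1, k_per_class=None):
    """Hypothetical sketch: mean fraction of same-label points among each
    point's k nearest neighbours, with k depending on the point's class size."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)

    # Assumption: default k is a fraction of the class population (at least 1).
    if k_per_class is None:
        k_per_class = {
            c: max(1, int(round(class_fraction * n)))
            for c, n in zip(classes, counts)
        }

    # Brute-force pairwise Euclidean distances; only suitable for small data.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    order = np.argsort(dists, axis=1)

    overlaps = np.empty(len(X))
    for i, label in enumerate(labels):
        k = min(k_per_class[label], len(X) - 1)
        neighbours = order[i, :k]
        overlaps[i] = np.mean(labels[neighbours] == label)
    return overlaps.mean()


# Imbalanced toy data: 100 points in class 0, 10 points in class 1.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),
    rng.normal(100.0, 1.0, size=(10, 2)),
])
labels = np.array([0] * 100 + [1] * 10)

overlap = label_overlap_per_class_k(X, labels)
```

With a single fixed $k$ larger than the minority class, minority points would necessarily pick up neighbours from the majority class, which lowers the overlap even when the classes are well separated. Scaling $k$ with the class size avoids that effect.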


diegodoimo commented 5 months ago

Thank you for the improvement. I'll have a look at the code in the next few days.

codecov-commenter commented 5 months ago

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (5935c77) 79.10% compared to head (5c17eb5) 79.21%.

| Files | Patch % | Lines |
|---|---|---|
| dadapy/metric_comparisons.py | 94.11% | 2 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #114      +/-   ##
==========================================
+ Coverage   79.10%   79.21%   +0.11%
==========================================
  Files          15       15
  Lines        2508     2536      +28
==========================================
+ Hits         1984     2009      +25
- Misses        524      527       +3
```
