JuliaAI / Imbalance.jl

A Julia toolbox with resampling methods to correct for class imbalance.
https://juliaai.github.io/Imbalance.jl/dev/
MIT License
28 stars 1 forks source link

🚑 Fix scitypes for all existing methods #77

Closed EssamWisam closed 10 months ago

EssamWisam commented 10 months ago

Closes #76

codecov[bot] commented 10 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (60b4461) 95.22% compared to head (29d584e) 95.22%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## dev #77 +/- ## ======================================= Coverage 95.22% 95.22% ======================================= Files 43 43 Lines 900 900 ======================================= Hits 857 857 Misses 43 43 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

EssamWisam commented 10 months ago

@ablaom This maps the old setup into one where the scitypes are correct as needed for #76.

EssamWisam commented 10 months ago

Regarding the last commit, I noticed that a change I have done late in the development of the package to let random resampling operate on categorical types was not reflected on the scitypes used. The commit fix this. In this, I also assumed that if a matrix is AbstractMatrix{<:Real} then it also is AbstractMatrix{Infinite}

ablaom commented 10 months ago

The commit fix this. In this, I also assumed that if a matrix is AbstractMatrix{<:Real} then it also is AbstractMatrix{Infinite}

Strictly speaking this is not correct because Rational <: Real and AbstractIrrational <: Real but objects subtyping these have Unknown scitype. On reflection, we could perhaps have included them in the Continuous scitype. (We couldn't include all Real because Integer <: Real and Integer is for Count. )

EssamWisam commented 10 months ago

On reflection, we could perhaps have included them in the Continuous scitype. (We couldn't include all Real because Integer <: Real and Integer is for Count. )

Sure, I could do that but then a matrix of integers (corresponding to a purely categorical dataset) would work perfectly with the functional interface but not the MLJ interface; meanwhile, it's quite odd to have a dataset of irrational numbers or numbers taking the rational form a//b.

What do you think?