JuliaStats / MLBase.jl

A set of functions to support the development of machine learning algorithms
MIT License
186 stars 63 forks source link

Control randomness of Kfold,StratifiedKfold,RandomSub #57

Closed ngiann closed 1 year ago

ngiann commented 1 year ago

I modified the code so that the randomness of Kfold, StratifiedKfold and RandomSub can be controlled.

This is done by passing an optional object of type AbstractRNG via the new argument rng.

The default value of the new argument rng is MersenneTwister().

Example: Kfold(10,2, MersenneTwister(1))

I did not modify StratifiedRandomSub as I don't understand where random indices are samples.

I also did not modify LOOCV as there seems to be naturally no need for it.

Fixes #56

codecov-commenter commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.03 :warning:

Comparison is base (25f578b) 78.77% compared to head (ff41fb8) 78.75%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #57 +/- ## ========================================== - Coverage 78.77% 78.75% -0.03% ========================================== Files 6 7 +1 Lines 655 659 +4 ========================================== + Hits 516 519 +3 - Misses 139 140 +1 ``` | [Impacted Files](https://codecov.io/gh/JuliaStats/MLBase.jl/pull/57?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats) | Coverage Δ | | |---|---|---| | [src/crossval.jl](https://codecov.io/gh/JuliaStats/MLBase.jl/pull/57?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats#diff-c3JjL2Nyb3NzdmFsLmps) | `98.93% <100.00%> (+0.03%)` | :arrow_up: | ... and [2 files with indirect coverage changes](https://codecov.io/gh/JuliaStats/MLBase.jl/pull/57/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats) Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

AsafManela commented 1 year ago

Thanks. Looks great. Maybe add some tests to test/crossval.jl that provide an rng?

ngiann commented 1 year ago

Good idea, I will also add a test then.

ngiann commented 1 year ago

Looks great. Thanks taking this on!

Thank you for your input. I will update the docs soon.

AsafManela commented 1 year ago

@ararslan Can you please merge and release this PR?

ngiann commented 1 year ago

closed this by accident

ngiann commented 1 year ago

Sorry that this took far too long. I incorporated the above discussed changes and updated the tests accordingly. In hindsight, I see now that I should have committed the updated code and the corresponding tests in a single commit. I did it separately, and I think that this is why the commit above failed: the updated code was automatically tested using the old test code and I provided the new test code in a separate, subsequent commit. I hope what I wrote make sense. Sorry for the confusion!

Also: I updated the documentation to reflect the changes in these functions. Please see #58.

ngiann commented 1 year ago

Great work here, thanks for your patience and persistence!

Thanks for your guidance!