Closed sfmig closed 4 weeks ago
Attention: Patch coverage is 95.23810% with 1 line in your changes missing coverage. Please review.
Project coverage is 47.75%. Comparing base (a21d4f1) to head (c678df6).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| crabs/detector/datamodules.py | 75.00% | 1 Missing :warning: |
Another option: could we just use seed_everything?
Cool, I didn't know about this!
I think for now I'd prefer to constrain the seeding to the dataset creation, because that is the part I need to be reproducible. But it's good to have this on the radar.
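To illustrate the trade-off being discussed (global seeding à la seed_everything vs. seeding scoped to one component), here is a minimal stdlib sketch; it uses Python's `random` module as a stand-in, not the project's actual torch code:

```python
import random

# Global seeding (what seed_everything does across libraries):
# every consumer of the shared random state is affected, so any
# unrelated code drawing random numbers shifts the stream.
random.seed(42)
a = [random.random() for _ in range(3)]

# Scoped seeding: a dedicated generator only affects the code it is
# passed to, so randomness elsewhere cannot perturb it.
rng = random.Random(42)
b = [rng.random() for _ in range(3)]

# Same seed, same algorithm, same stream:
assert a == b
```

The constrained approach keeps the dataset splits reproducible without pinning the randomness of the rest of the pipeline.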
thanks for the help Nik! 🌟
Why is this PR needed?
Currently, when we create the test and validation splits we don't pass a generator, but we do pass one when we create the training split.
This means that, given a seed, the splitting of the dataset into `train` and `test-val` sets is reproducible, but the subsequent splitting of the `test-val` set into a `test` set and a `val` set is not.

What does this PR do?
This PR:
Smaller bits
Notes
I decided to pass a different generator for each call to `random_split` to try to make it a bit "future-proof". That way we guarantee the splits are repeatable even if some randomisation code is added in between the two calls to `random_split`.
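The per-call-generator idea can be sketched as follows. This is a stdlib analogue of the two chained `random_split` calls (the function name, seeds, and fractions are illustrative, not the project's actual code):

```python
import random

def two_stage_split(items, train_frac, test_frac, seed):
    """Split items into train/test/val, giving each split stage its own
    seeded generator (mirrors passing a fresh generator to each
    random_split call)."""
    # Stage 1: train vs test-val, driven by its own generator.
    g1 = random.Random(seed)
    idx = list(range(len(items)))
    g1.shuffle(idx)
    n_train = int(train_frac * len(items))
    train, rest = idx[:n_train], idx[n_train:]

    # Any randomisation added here (e.g. a future augmentation step
    # using the global state) cannot affect stage 2, because stage 2
    # has its own generator.
    random.random()

    # Stage 2: test vs val, again with its own generator.
    g2 = random.Random(seed + 1)
    g2.shuffle(rest)
    n_test = int(test_frac * len(rest))
    test, val = rest[:n_test], rest[n_test:]
    return train, test, val

# Repeatable regardless of interleaved global randomness:
s1 = two_stage_split(list(range(100)), 0.8, 0.5, seed=42)
s2 = two_stage_split(list(range(100)), 0.8, 0.5, seed=42)
assert s1 == s2
```

With a single shared generator, the extra draw between the two stages would shift the stream and change the second split; separate generators make each stage depend only on its own seed.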