Closed ejm714 closed 2 years ago
Merging #169 (ad38de6) into master (62797d9) will increase coverage by
0.0%
. The diff coverage is100.0%
.
@@ Coverage Diff @@
## master #169 +/- ##
======================================
Coverage 85.0% 85.0%
======================================
Files 30 30
Lines 1843 1851 +8
======================================
+ Hits 1567 1575 +8
Misses 276 276
Impacted Files | Coverage Δ | |
---|---|---|
zamba/models/model_manager.py | 84.1% <ø> (ø) |
|
zamba/models/config.py | 97.1% <100.0%> (+<0.1%) |
:arrow_up: |
@pjbull ready for another look. this now has the added benefit of doing split allocation within species which is something we had desired but hadn't yet implemented. this should help ensure better training results (in addition to fixing the bug)
@pjbull ready for reals this time!
This PR fixes the issue that caused the following error:
The issue has to do with the
split
column. When we don't provide it in the labels, we use a random split. But we don't guarantee that we have all of the values from the split_proportions dict (e.g.train
andval
) in the split column. With a small number of videos and the default proportions, all videos getting assigned to train (see insplits.csv
). Since there are no val videos, there is noval_f1_score
.I considered implementing this without the random draw and instead calculating the number of samples that need to go into each split based on the proportions provided. However, this gets tricky to ensure based on rounding. I instead went with the while loop that tries different seeds until it finds a draw that works. If someone specifies split proportions of
train: 100, val: 1
, and only provides say 5 videos, there is risk of an infinite loop. I'm not sure how important it is to catch such a situation since 1) we don't expect people to be training on so few videos and 2) it feels hard to accidentally end up in such a situation.Bonus fix: fix bug in ModelCheckpoint which lets monitor be None if there is no early stopping.
mode
must either bemin
ormax
. I set it tomin
to match PTL's default but it doesn't get used if monitor is None.Plus cleanup to set defaults in
DummyTrainConfig
to avoid repeated code.