Closed XianzheMa closed 1 week ago
Attention: Patch coverage is 97.36842% with 2 lines in your changes missing coverage. Please review.
Project coverage is 82.92%. Comparing base (59ea026) to head (2992df2).
:exclamation: Current head 2992df2 differs from pull request most recent head cddc9ea. Please upload reports for the commit cddc9ea to get more accurate results.
Files | Patch % | Lines |
---|---|---|
...ig/schema/pipeline/sampling/downsampling_config.py | 83.33% | 1 Missing :warning: |
...pling_strategies/rho_loss_downsampling_strategy.py | 87.50% | 1 Missing :warning: |
This is the first PR to implement another way of producing the holdout set, the IL (irreducible loss) model, and the irreducible loss (typically suitable for small datasets).
Our current architecture only allows one trigger id to correspond to one model id. To accommodate two IL models within one trigger, I create a "twin model" that internally consists of two IL models. During training, each IL model memorizes the ids of the samples it has seen, so that during evaluation each sample is handled by the IL model that has not seen it.
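The routing idea can be illustrated with a minimal sketch. This is not the code in this PR: `TwinModelSketch`, `record_training`, `route`, and `predict` are hypothetical names, the sub-models are plain callables instead of real networks, and sending samples that neither model has seen to model 0 is an assumption made only for this illustration.

```python
from typing import Callable, List, Sequence, Set

Predictor = Callable[[float], float]


class TwinModelSketch:
    """Hypothetical stand-in for a twin IL model: two sub-models plus the sample ids each has seen."""

    def __init__(self, predictors: Sequence[Predictor]) -> None:
        if len(predictors) != 2:
            raise ValueError("a twin model holds exactly two sub-models")
        self.predictors: List[Predictor] = list(predictors)
        self.seen_ids: List[Set[int]] = [set(), set()]

    def record_training(self, model_index: int, sample_ids: Sequence[int]) -> None:
        # Remember which samples the given sub-model was trained on.
        self.seen_ids[model_index].update(sample_ids)

    def route(self, sample_id: int) -> int:
        # Return the index of the sub-model that has NOT seen this sample.
        seen_by_0 = sample_id in self.seen_ids[0]
        seen_by_1 = sample_id in self.seen_ids[1]
        if seen_by_0 and seen_by_1:
            raise ValueError(f"sample {sample_id} was seen by both sub-models")
        # Assumption: samples seen by neither sub-model default to model 0.
        return 1 if seen_by_0 else 0

    def predict(self, sample_ids: Sequence[int], inputs: Sequence[float]) -> List[float]:
        # Evaluate each sample with the sub-model chosen by route().
        return [self.predictors[self.route(sid)](x) for sid, x in zip(sample_ids, inputs)]
```

With this layout the caller only ever sees one model object, which is what lets a single model id stand in for both IL models.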
How it works
1. `RHOLossDownsamplingStrategy` randomly samples half of the training set and marks the `used` column in the `selector_state_metadata` table as `True` for those samples. The strategy then issues a request to train a `RHOLOSSTwinModel` on this TSS. (unimplemented)
2. `RHOLOSSTwinModel` is instantiated. Only the model at index 0 is trained on this dataset (implemented in this PR).
3. `RHOLossDownsamplingStrategy` produces the other half of the training set by selecting the samples with `used==False`. The strategy then issues a request to finetune this twin model. (unimplemented)
4. `RHOLOSSTwinModel` is instantiated again. Only the model at index 1 is trained on this dataset (implemented in this PR).
5. `used` flags.

Admittedly, this is not the optimal way to train a twin RHO model, but it is very straightforward, and we can optimize it depending on how well it performs.
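The two splitting steps (1 and 3) are not implemented yet. As a rough sketch of the intended logic, the snippet below uses an in-memory dict as a stand-in for the `used` column of `selector_state_metadata`. The function names `mark_random_half` and `select_unused` and the dict representation are assumptions for illustration only, not the strategy's actual interface.

```python
import random
from typing import Dict, List


def mark_random_half(used: Dict[int, bool], seed: int = 0) -> List[int]:
    """Step 1 (sketch): pick half of the sample ids at random and flag them as used."""
    rng = random.Random(seed)
    sample_ids = sorted(used)
    first_half = rng.sample(sample_ids, len(sample_ids) // 2)
    for sample_id in first_half:
        used[sample_id] = True
    return sorted(first_half)


def select_unused(used: Dict[int, bool]) -> List[int]:
    """Step 3 (sketch): the other half is every sample whose flag is still False."""
    return sorted(sample_id for sample_id, flag in used.items() if not flag)


# Hypothetical training set of ten sample ids, none flagged yet.
used_flags = {sample_id: False for sample_id in range(10)}
half_for_model_0 = mark_random_half(used_flags)
half_for_model_1 = select_unused(used_flags)
```

The two halves are disjoint and together cover the whole training set, so every sample has exactly one IL model that has not seen it.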
Current drawbacks
Because it relies on the `used` column, `RHOLoss` is currently not compatible with presampling strategies that also use the `used` field, such as `FreshnessSamplingStrategy`.
Next PR
Implement steps 1 and 3: preparing the split holdout set.
How to review
All the main logic is in modyn/models/rho_loss_twin_model/rho_loss_twin_model.py