Closed: dberardo closed this issue 2 years ago.
Hey man.
You're describing federated learning. The short answer is that we don't cover this, so no, you can't distribute the training of a model.
If you take a deep look at River, you'll see that Mean and Var are mergeable:
from river import stats
stats.Mean().update(1).update(2) + stats.Mean().update(3).update(4)
That's the lowest building block of federated learning: merging statistics.
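For instance, here's a minimal sketch of the same idea, written without chaining so it also works with River versions where update() mutates the statistic in place and returns nothing; the merged statistic matches what you'd get from the full stream:

from river import stats

left, right = stats.Mean(), stats.Mean()
for x in (1, 2):
    left.update(x)   # this replica only sees the first half of the stream
for x in (3, 4):
    right.update(x)  # this replica only sees the second half

merged = left + right  # merge the two partial statistics
print(merged.get())    # 2.5, same as the mean of [1, 2, 3, 4]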
The trick is that the merging depends on the underlying algorithm. You can't write a generic function for this. I'm not opposed to having mergeable models in River; I just wouldn't know where to start. Maybe decision trees could be merged, I don't know. It sounds a bit like science fiction to me as of now, but who knows.
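To give an idea of why it's algorithm-specific, here's a hedged sketch of what a merge could look like for linear models, by averaging coefficients (FedAvg-style). average_weights is a made-up helper, not a River API, and the usage below assumes LinearRegression exposes its coefficients as a dict-like .weights attribute:

from river import linear_model

def average_weights(weight_dicts):
    # FedAvg-style merge: average each coefficient across replicas.
    # Only meaningful for models whose parameters form a flat weight vector;
    # this is not a generic recipe for arbitrary River models.
    features = {f for w in weight_dicts for f in w}
    n = len(weight_dicts)
    return {f: sum(w.get(f, 0.0) for w in weight_dicts) / n for f in features}

# Hypothetical usage: each replica trains on its own partition of the stream,
# then the coefficients are averaged centrally (assumes a .weights dict).
replicas = [linear_model.LinearRegression() for _ in range(3)]
merged_coefficients = average_weights([dict(r.weights) for r in replicas])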
Thanks for the reply, I just wanted to make sure that this was indeed "a conceptual issue" and not something I might have missed along the way xD
I will perhaps implement my own strategy to pick the best model to store, based on some kind of rolling metric, but that will be done when needed.
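Something along those lines could look like the rough sketch below. The Replica wrapper and its rolling error are made up for illustration; only predict_one and learn_one are actual River methods:

import collections
from river import linear_model

class Replica:
    # Made-up wrapper: one model instance plus a rolling absolute error
    # over its last `window` observations.
    def __init__(self, model, window=100):
        self.model = model
        self.errors = collections.deque(maxlen=window)

    def learn(self, x, y):
        y_pred = self.model.predict_one(x)
        self.errors.append(abs(y - y_pred))
        self.model.learn_one(x, y)

    @property
    def rolling_mae(self):
        return sum(self.errors) / len(self.errors) if self.errors else float("inf")

replicas = [Replica(linear_model.LinearRegression()) for _ in range(4)]
# ... each replica consumes its own partition of the stream via replica.learn(x, y) ...
best = min(replicas, key=lambda r: r.rolling_mae)  # keep this one instead of merging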
I think, however, that model federation could belong in Beaver (when it's ready), since a centralized service where the merging happens needs to exist.
This is perhaps more of a conceptual question than an MLOps-related one, so it could be moved to RiverML.
What is the best approach to train an online learning model which is running in multiple parallel instances? I am thinking, for example, of very high-frequency applications where streaming data is partitioned over different model instances to achieve higher throughput or better load balancing (a sketch of this setup is given below).
The question here is: since every instance of the model will only see a portion of the data, how should the trained models' parameters be aggregated to obtain a single model?
Is this an anti-pattern, or should one just keep the model parameters that perform best (a sort of model selection)?
Is Beaver addressing this issue too?
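For context, a minimal sketch of the setup described above, assuming events carry a key that is hash-routed to independent model instances; the routing function and instance count are illustrative and not part of River or Beaver:

import zlib
from river import linear_model

N_INSTANCES = 4
instances = [linear_model.LinearRegression() for _ in range(N_INSTANCES)]

def route(key: str) -> int:
    # Stable hash so the same key always lands on the same instance.
    return zlib.crc32(key.encode()) % N_INSTANCES

def handle_event(key, x, y):
    model = instances[route(key)]
    y_pred = model.predict_one(x)   # each instance only ever sees its own partition
    model.learn_one(x, y)
    return y_pred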