LLNL / apollo

Apollo: Online Machine Learning for Performance Portability

Update description #19

Open vsoch opened 2 years ago

vsoch commented 2 years ago

Hey Apollo team! I think it would be good to come up with a slightly tweaked description for the project, because it technically isn't online (incremental) learning. I was chatting in Mattermost with @davidbeckingsale about it. I don't know the project well enough to make a suggestion, but is it some kind of batch ML server perhaps?

ggeorgakoudis commented 2 years ago

@vsoch @davidbeckingsale Can you elaborate a bit on your discussion? Happy to discuss also over a meeting

vsoch commented 2 years ago

Sure! So online machine learning really means incremental learning: algorithms that can train on one example at a time and even forget over time. It is not batch learning, and it is not "something with machine learning put online." I'm not familiar with Apollo in detail, but it looks more like parameter tuning for batch learning (the more traditional approach), so I'd suggest we tweak the description so it isn't misleading.

An example library that does online ML: https://github.com/online-ml/river#-philosophy And a recent talk I attended: https://maxhalford.github.io/slides/online-ml-in-practice-pydata-pdx.pdf
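
To make the distinction concrete, here is a minimal sketch of the one-example-at-a-time loop using river's public API; the dataset and model choice are just illustrative:

```python
# Minimal sketch of incremental ("online") learning with river: the model is
# updated one example at a time and never needs a stored batch of training data.
from river import datasets, linear_model, metrics, preprocessing

model = preprocessing.StandardScaler() | linear_model.LogisticRegression()
metric = metrics.Accuracy()

for x, y in datasets.Phishing():     # stream of (features, label) pairs
    y_pred = model.predict_one(x)    # predict before learning (prequential evaluation)
    metric.update(y, y_pred)
    model.learn_one(x, y)            # update the model with this single example

print(metric)
```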

ggeorgakoudis commented 2 years ago

Thank you for the detailed description and the very helpful pointers! Given my limited expertise in ML, I'll need your help to see whether Apollo fits the description. So, Apollo trains DecisionTree and RandomForest classifier models using a "batch" of training data (tuples of features, policy applied, execution time) collected at runtime using an exploration strategy. Apollo has two user-selectable modes for those tree-based classifiers:

  1. It does not re-train the model (simple but practical).
  2. It does re-train: in this mode Apollo tracks execution time and "forgets" the learned model when the measured times deviate from the expected ones. Apollo then goes through another round of exploration to collect a new batch of data and re-trains. This can happen many times during the execution of the tuned program (a rough sketch of this cycle follows below).
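
For concreteness, here is a hypothetical Python sketch of that explore / batch-train / forget-on-drift cycle. This is not Apollo's actual code; the runtime hooks, feature choice, and cost model are all simulated stand-ins:

```python
# Hypothetical sketch of the explore -> batch-train -> forget-on-drift cycle
# described above. NOT Apollo's code; the tuned region is simulated below.
import random
from sklearn.tree import DecisionTreeClassifier

POLICIES = [0, 1, 2]                       # e.g. candidate loop schedules

def run_with_policy(features, policy):
    """Stand-in for executing the tuned region; returns an execution time."""
    size = features[0]
    return abs(size - policy * 100) + random.random()   # toy cost model

def explore(feature_stream, n_samples=30):
    """Collect a batch of (features, policy, time) tuples via round-robin exploration."""
    batch = []
    for i in range(n_samples):
        features = next(feature_stream)
        policy = POLICIES[i % len(POLICIES)]
        batch.append((features, policy, run_with_policy(features, policy)))
    return batch

def train(batch):
    """Fit a classifier mapping features -> best observed policy for those features."""
    best = {}
    for features, policy, t in batch:
        key = tuple(features)
        if key not in best or t < best[key][1]:
            best[key] = (policy, t)
    X = [list(k) for k in best]
    y = [p for p, _ in best.values()]
    expected = min(t for _, _, t in batch)   # rough "expected" execution time
    return DecisionTreeClassifier().fit(X, y), expected

def feature_stream():
    while True:
        yield [random.choice([50, 150, 250])]   # e.g. iteration count of the region

stream = feature_stream()
model, expected = train(explore(stream))          # mode 1 would stop re-training here
for _ in range(200):
    features = next(stream)
    policy = model.predict([features])[0]
    elapsed = run_with_policy(features, policy)
    if elapsed > 3.0 * expected:                  # mode 2: deviation -> "forget" the model
        model, expected = train(explore(stream))  # re-explore and re-train
```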

Besides those tree-based classifiers, Apollo also implements a reinforcement learning policy network for policy selection, rewarding the policies with the lowest execution time. Since the output of the policy network is a distribution over the available policies (uniform at initialization), it does not need an explicit exploration phase. This network is trained continuously, at runtime, as measurement data are collected, shifting the distribution.
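
To make the continuous-training part concrete, here is a toy policy-gradient bandit over three candidate policies. It is not Apollo's actual policy network; the cost numbers, learning rate, and update rule are just illustrative:

```python
# Toy softmax policy over candidate policies, trained continuously as
# measurements arrive: start uniform, sample a policy, reward low execution
# time, update. NOT Apollo's implementation; everything here is simulated.
import numpy as np

rng = np.random.default_rng(0)
logits = np.zeros(3)                 # 3 candidate policies; softmax(0) is uniform
lr = 0.1

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def run_with_policy(policy):
    """Stand-in for executing the tuned region; returns an execution time."""
    true_cost = [1.0, 0.6, 0.9][policy]
    return true_cost + 0.05 * rng.standard_normal()

baseline = None
for step in range(500):
    probs = softmax(logits)
    policy = rng.choice(len(logits), p=probs)    # sample from the current distribution
    elapsed = run_with_policy(policy)
    reward = -elapsed                            # minimum execution time == maximum reward
    baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward
    # REINFORCE-style update: grad of log prob of the sampled policy is (one_hot - probs)
    grad = -probs
    grad[policy] += 1.0
    logits += lr * (reward - baseline) * grad

print(softmax(logits))   # should concentrate on policy 1, the cheapest one
```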

What do you think?

vsoch commented 2 years ago

That is interesting (and I'm relatively new to online ML too, so I'm still mapping out the space for myself). The first steps with batches sound more like traditional ML, but the last bit does sound a bit like online ML. @maxhalford, what would you say (he's been working in this space for quite a few years)? Is there a way to describe a tool that combines the two, if that is the case here? :thinking:

MaxHalford commented 2 years ago

Sorry for the late answer!

> the last bit does sound a bit like Online ML

Indeed the second approach @ggeorgakoudis describes is a form of online learning. That's because reinforcement learning can be seen as a form of online learning.

> Is there a way to describe a tool that is a combination of two things, if that is the case here? 🤔

I'm not sure. I would say that what both approaches have in common is that they're machine learning approaches. I think what really stands out here is Apollo's "adaptability". That's what I would advertise if I were you, but you know best :)

vsoch commented 2 years ago

No worries @MaxHalford, thanks for your insight! I also like adaptability as a descriptor. @ggeorgakoudis, would that be something we could add?