JuliaAI / MLJ.jl

A Julia machine learning framework
https://juliaai.github.io/MLJ.jl/

Add tools to estimate resource requirements #71

Open ablaom opened 5 years ago

jpsamaroo commented 5 years ago

I'll have a use for estimating resource needs in the future (for use as an objective variable in some multi-objective optimization problem, and also for preventing my Julia workers from strangling themselves). What ideas do you have in mind for making such estimates?

I could imagine moving averages over the results of @elapsed and @allocated would work well if you're already running a machine multiple times. However, it would be beneficial to also have a rough estimate available before the first run, especially if we consider using huge, automatically-generated Flux models which may be too large to run on any available worker (my use case).
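A minimal sketch of the moving-average idea, assuming an exponential moving average is acceptable. The names `ResourceEstimate`, `record!` and `measure!` are hypothetical and not part of MLJ. It uses Base's `@timed`, which reports elapsed time and allocated bytes from a single call, rather than running the work twice with `@elapsed` and `@allocated`.

```julia
# Hypothetical helper (not MLJ API): exponentially smoothed time and allocation estimates.
mutable struct ResourceEstimate
    alpha::Float64    # smoothing factor in (0, 1]; larger values favour recent runs
    seconds::Float64  # smoothed wall-clock time per run
    bytes::Float64    # smoothed allocated bytes per run
    nruns::Int        # number of runs recorded so far
end

ResourceEstimate(alpha = 0.3) = ResourceEstimate(alpha, 0.0, 0.0, 0)

# Fold one measurement into the running estimates.
function record!(est::ResourceEstimate, seconds, bytes)
    if est.nruns == 0
        # The first measurement initialises the averages.
        est.seconds = seconds
        est.bytes = bytes
    else
        est.seconds = est.alpha * seconds + (1 - est.alpha) * est.seconds
        est.bytes = est.alpha * bytes + (1 - est.alpha) * est.bytes
    end
    est.nruns += 1
    return est
end

# Run the zero-argument function `f`, record its cost, and return its result.
# Note: the first call of any Julia function also includes compilation time.
function measure!(est::ResourceEstimate, f)
    stats = @timed f()
    record!(est, stats.time, stats.bytes)
    return stats.value
end

est = ResourceEstimate(0.5)
record!(est, 2.0, 100)
record!(est, 4.0, 300)
total = measure!(est, () -> sum(rand(1000)))
println("runs: ", est.nruns, ", seconds: ", est.seconds, ", bytes: ", est.bytes)
```

For a machine, the call would presumably look something like `measure!(est, () -> fit!(mach, force=true))`. This only covers the "already running multiple times" case; it does nothing for the estimate before the first run.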

ablaom commented 5 years ago

These sound like good ideas. I think there was also the idea of mapping out resource demands as more and more data rows are made available to the learning algorithm. I think MLR does something like this and calls it "learning curves".

In this regard there may be some overlap with API design for dealing with online/active learning, which I think is going to be some work. Some discussion around this is at https://github.com/alan-turing-institute/MLJ.jl/issues/139#issuecomment-495926733, with an open issue at #60. Sorry, that's nonsense, since each time you grow the data you retrain from scratch.
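A rough sketch of mapping resource demands against the number of rows, retraining from scratch at each size. The function `resource_curve` and the `train(X, y)` signature are hypothetical, and a least-squares solve stands in for a real learning algorithm.

```julia
using LinearAlgebra

# Hypothetical helper (not MLJ API): time and allocations of `train` on growing row subsets.
# `train(X, y)` is assumed to fit from scratch each time it is called.
function resource_curve(train, X, y, sizes)
    return map(sizes) do n
        Xn = X[1:n, :]   # subset copied outside the timed call
        yn = y[1:n]
        stats = @timed train(Xn, yn)
        (rows = n, seconds = stats.time, bytes = stats.bytes)
    end
end

X = rand(2000, 10)
y = rand(2000)

# The first size also pays Julia's compilation cost for `train`,
# so a warm-up call may be wanted before trusting the timings.
curve = resource_curve((X, y) -> X \ y, X, y, [500, 1000, 2000])

for p in curve
    println(p.rows, " rows: ", p.seconds, " s, ", p.bytes, " bytes")
end
```

Extrapolating from such a curve could give a crude estimate for the full data set, though how well that works will depend on the algorithm.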