Open 80LevelElf opened 11 months ago
Also, it looks like MaximumMemoryUsageInMegaByte is not a restriction on the training's memory usage, but a restriction on the whole process's memory usage.
As for smart training memory limitation (like in Vowpal Wabbit): maybe train a small model to predict the approximate training memory usage, and use that prediction to decide how many models to train and which trainers to use?
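The predictor idea above could be sketched as follows. This is a minimal illustration only, not ML.NET code; `estimate_training_memory_mb` and its coefficients are hypothetical placeholders, under the assumption that peak training memory grows roughly linearly with the number of dataset cells:

```python
def estimate_training_memory_mb(n_rows: int, n_features: int,
                                mb_per_million_cells: float = 40.0,
                                base_mb: float = 500.0) -> float:
    """Hypothetical heuristic: predict peak training memory (in MB) from
    dataset size. The coefficients are placeholders, not measured values;
    a real predictor would be fitted on observed training runs."""
    cells = n_rows * n_features
    return base_mb + mb_per_million_cells * cells / 1e6

# e.g. 1M rows x 100 features -> 500 + 40 * 100 = 4500 MB predicted
print(estimate_training_memory_mb(1_000_000, 100))  # 4500.0
```

A fitted version of this estimate could then drive the trainer selection and model count suggested above.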
Is your feature request related to a problem? Please describe.
Let's look at the current training settings: when a training run takes more than 7500 megabytes, it is canceled. But if we do not set MaximumMemoryUsageInMegaByte, training can take a lot of memory, and in many cases more than our current pod has (> 36 GB).
And according to the logs, the amount of memory differs from run to run: learning on a very similar training set could take 10 GB the first time and 30 GB the second time.
Describe the solution you'd like
It would be perfect to have a memory limitation that acts as the maximum memory ML.NET can use for training without canceling it. For example, with a 7500 MB limit and each training taking 2500 MB, ML.NET could start 3 model trainings in parallel.
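The proposed scheduling rule can be sketched in a few lines. This is an illustration of the idea only; `planned_parallel_trainings` is a hypothetical helper, not an ML.NET API:

```python
import math

def planned_parallel_trainings(memory_limit_mb: float,
                               predicted_per_model_mb: float) -> int:
    """Given a total memory budget and a per-model memory estimate,
    return how many model trainings fit in the budget at once (at least 1,
    so training can still proceed even when one model exceeds the budget)."""
    if predicted_per_model_mb <= 0:
        raise ValueError("per-model estimate must be positive")
    return max(1, math.floor(memory_limit_mb / predicted_per_model_mb))

# The example from the issue: a 7500 MB budget and ~2500 MB per training.
print(planned_parallel_trainings(7500, 2500))  # 3
```

Instead of canceling a run that crosses the limit, the budget would simply throttle how many trainings run concurrently.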