Open 80LevelElf opened 7 months ago
I have just tried switching to OneDAL mode in production. It doesn't help :(
Are there any temporary workarounds?
I think this is a really big problem, given that ML.NET 3 is supposed to be released this month.
I have tried it with the new ML.NET 3.
The behavior is the same.
@LittleLittleCloud @luisquintanilla
Hi friends! Is there maybe a workaround, or anything we can check on our side?
So I have found the problem: it is caused by MaximumMemoryUsageInMegaByte = 7500.
Right after training starts, memory usage exceeds 7500 MB and the training is canceled.
At first glance this is understandable behavior, but it is not very useful. In fact, ML.NET doesn't control memory consumption in our case. We have to choose between:
But can't ML.NET use the memory limit to control how many models are trained at the same time? For example, if the limit is 7500 MB and one model needs 2500 MB to train, it could start 3 models.
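To make the arithmetic of that suggestion concrete, here is a minimal, library-neutral sketch in Python. It is not an ML.NET API: the function name `max_concurrent_trials` and the idea of a known per-model memory estimate are assumptions made only for illustration. In practice the per-model footprint would have to be measured or estimated first.

```python
def max_concurrent_trials(memory_limit_mb: int, per_trial_mb: int) -> int:
    """Return how many trials fit into the memory budget at once.

    Hypothetical helper, not part of ML.NET. Returns 0 when even a
    single trial would exceed the limit.
    """
    if memory_limit_mb <= 0 or per_trial_mb <= 0:
        raise ValueError("memory_limit_mb and per_trial_mb must be positive")
    # Integer division: only whole trials count toward the budget.
    return memory_limit_mb // per_trial_mb


# The numbers from the comment above: 7500 MB limit, 2500 MB per model.
print(max_concurrent_trials(7500, 2500))  # 3
```

With a limit slightly below a multiple of the per-model estimate (say 7499 MB), only 2 models would fit, so a scheduler built on this idea would need some headroom in the estimate.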
System Information (please complete the following information):
Describe the bug
At the moment we use ML.NET 2, but because of the bug fix in https://github.com/dotnet/machinelearning/pull/6571 we have to switch to ML.NET 3 to train our Binary Classification models (we need the Positive Recall optimization metric).
But it looks like the Binary Classification Experiment is somehow broken in ML.NET 3:
We use only the FastForest and LightGBM trainers. On my local PC (Windows 10) it works great, but in the production Docker image (Alpine Linux) training finishes after 10-30 seconds with:
I have tried to:
But nothing works for me. Important point: MLNET_BACKEND is not set, so we are not using OneDAL in the production or test environment.