BayesWitnesses / m2cgen

Transform ML models into native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
MIT License

XGBoost GCC Memory Allocation Error: cc1plus.exe: out of memory allocating 65536 bytes #590

Open awt42069 opened 5 months ago

awt42069 commented 5 months ago

Hi,

I am having a GCC compiler problem when deploying XGBoost C models that were converted using m2cgen. It occurs with at least GCC 9 and GCC 12 (the versions I have tested), but not with MSVC. The specific error:

```
[build] cc1plus.exe: out of memory allocating 65536 bytes
```

Changing the build type from Release to Debug removes the error. Reducing the optimization level to -O1 can also fix the problem, depending on model size, presumably by keeping the compiler's memory usage just under the 3 GB limit. The embedded platform is Linux-based, so we must use GCC, and to meet timing requirements we would prefer -O2 optimization (a pragma-based variant of this workaround is sketched further below).

I have tried enlarging the compiler's address space to 3 GB using this approach: https://www.intel.com/content/www/us/en/support/programmable/articles/000086946.html.

That does not solve the problem, since models may become very large. The generated .c file is ~20 MB but could be larger depending on the early-stopping criteria. Reducing the model size is an option, but that would negatively affect performance.

The files are included in the following way. In the main translation unit:

```cpp
extern "C" {
#include "classifiers/xgboost_include_models.c"
}
```

Then in classifiers/xgboost_include_models.c:

```c
#pragma once
#include "./large_xgboost_model.c"
```

The m2cgen inference functions inside large_xgboost_model.c are defined as:

```c
inline void predict_proba_my_inference(float *input, float *output) { /* ... */ }
inline void softmax_0(float *x, int size, float *result) { /* ... */ }
```
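
For reference, the functions are consumed on the target roughly like this (N_FEATURES and N_CLASSES are placeholders for my model's actual dimensions):

```c
#include <stdio.h>

#define N_FEATURES 32  /* placeholder: actual feature count of my model */
#define N_CLASSES   4  /* placeholder: actual class count of my model */

/* Provided by large_xgboost_model.c via the includes above. */
void predict_proba_my_inference(float *input, float *output);

int main(void)
{
    float input[N_FEATURES] = {0.0f};  /* one feature vector */
    float output[N_CLASSES];           /* class probabilities */

    predict_proba_my_inference(input, output);

    for (int i = 0; i < N_CLASSES; i++)
        printf("class %d: %f\n", i, output[i]);
    return 0;
}
```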

Shallow learning models are ideal for small-SWaP Linux deployments, so this would be a valuable capability to have for XGBoost/m2cgen conversion. I am wondering whether it is possible to refactor the auto-generated code into several smaller functions (sketched below), include the models in a different way, or use any other option that would allow larger, optimized models to compile in Release mode.
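
To illustrate the refactor I have in mind: if the generator emitted the ensemble as bounded-size chunks instead of one huge function, each chunk would stay small enough for the optimizer. A rough sketch of the shape (chunk names and sizes are hypothetical, not current m2cgen output):

```c
#define N_CLASSES 4  /* placeholder for my model's class count */

void softmax_0(float *x, int size, float *result);  /* existing m2cgen helper */

/* Hypothetical: each chunk accumulates margins for a bounded slice of trees. */
static void eval_trees_chunk_0(const float *input, float *margins)
{
    /* ...generated code for trees 0..999 would go here... */
    (void)input; (void)margins;
}

static void eval_trees_chunk_1(const float *input, float *margins)
{
    /* ...generated code for trees 1000..1999 would go here... */
    (void)input; (void)margins;
}

void predict_proba_my_inference(float *input, float *output)
{
    float margins[N_CLASSES] = {0.0f};
    eval_trees_chunk_0(input, margins);
    eval_trees_chunk_1(input, margins);
    /* ...remaining chunks... */
    softmax_0(margins, N_CLASSES, output);
}
```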

Let me know if you have any suggestions or questions.