BayesWitnesses / m2cgen

Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
MIT License

Memory overflow in program run after compilation of C code transformed by the xgboost model #588

Open yhfwww opened 8 months ago

yhfwww commented 8 months ago

```c
#ifndef linux
#include "stdafx.h"
#endif

#include <math.h>
#include <string.h>

double g_array[2] = { 0.0, 0.0 };

double *make_array(double x, double y) {
    g_array[0] = x;
    g_array[1] = y;
    return g_array;
}

void score(double *input, double *output) {
    double var0;
    if ((input[487]) >= (0.5)) {
        if ((input[149]) >= (0.5)) {
            if ((input[69]) >= (0.5)) {
                if ((input[396]) >= (1.5)) {
                    if ((input[379]) >= (40.5)) {
                        if ((input[436]) >= (67.5)) {
                            if ((input[65]) >= (32.5)) {
                                var0 = -0.4347826;
    ... ... ... ... ... ...
    } } } }
    double var55;
    var55 = (1.0) / ((1.0) + (exp((0.0) - (((((((((((((((((((((((((((((((((((((((((((((((((((((((var0) + (var1)) + (var2)) + (var3)) + (var4)) + (var5)) + (var6)) + (var7)) + (var8)) + (var9)) + (var10)) + (var11)) + (var12)) + (var13)) + (var14)) + (var15)) + (var16)) + (var17)) + (var18)) + (var19)) + (var20)) + (var21)) + (var22)) + (var23)) + (var24)) + (var25)) + (var26)) + (var27)) + (var28)) + (var29)) + (var30)) + (var31)) + (var32)) + (var33)) + (var34)) + (var35)) + (var36)) + (var37)) + (var38)) + (var39)) + (var40)) + (var41)) + (var42)) + (var43)) + (var44)) + (var45)) + (var46)) + (var47)) + (var48)) + (var49)) + (var50)) + (var51)) + (var52)) + (var53)) + (var54)))));
    memcpy(output, make_array((1.0) - (var55), var55), 2 * sizeof(double));
}
```

I converted an XGBoost binary classification model to C++ code, compiled it into an executable, and ran it on a large dataset. Memory usage climbed from about 30 MB at startup to 1.5 GB, at which point the program ran out of memory. After debugging, it appears that this line of code causes the memory to keep growing:

memcpy(output, make_array((1.0) - (var55), var55), 2 * sizeof(double));

yhfwww commented 8 months ago

This has now been resolved; it was not caused by the generated code.
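
For anyone who lands here with a similar symptom, the sketch below shows why the `memcpy`/`make_array` line is an unlikely source of growth: `make_array` writes into one static two-element buffer, and `memcpy` copies 16 bytes into memory the caller already owns. This is only a minimal illustration under stated assumptions. `score_stub` and `score_many` are hypothetical names, and the stub replaces the real tree logic with a fixed probability.

```c
#include <assert.h>
#include <string.h>

/* Same pattern as the generated code: one static two-element buffer. */
static double g_array[2] = { 0.0, 0.0 };

static double *make_array(double x, double y) {
    g_array[0] = x;
    g_array[1] = y;
    return g_array; /* always the same address; nothing is allocated */
}

/* Hypothetical stand-in for the generated score(): the real tree logic is
   replaced by a fixed probability so the sketch stays self-contained. */
static void score_stub(const double *input, double *output) {
    double p = (input[0] >= 0.5) ? 0.75 : 0.25;
    memcpy(output, make_array(1.0 - p, p), 2 * sizeof(double));
}

/* Hypothetical driver: scores n_rows rows while reusing caller-owned
   stack buffers. Returns the number of rows classified as positive. */
long score_many(long n_rows) {
    double input[1];
    double output[2];
    long positives = 0;
    for (long i = 0; i < n_rows; ++i) {
        input[0] = (double)(i % 2);               /* alternate 0.0 and 1.0 */
        score_stub(input, output);
        assert(output[0] + output[1] == 1.0);     /* the two class probabilities */
        if (output[1] > 0.5) {
            ++positives;
        }
    }
    return positives;
}
```

Calling `score_many` with a large row count while watching the process's resident memory is one way to check that the scoring path stays flat. If memory still grows in the real program, the growth would have to come from code around the call, for example how input rows are loaded or retained.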