Closed siboehm closed 2 years ago
This is exactly this issue: https://github.com/dmlc/treelite/issues/277
Minimal reproducible example:
```python
import os

import lleaves
import lightgbm as lgb
import numpy.testing as npt

# Dump the generated IR/ASM for debugging
os.environ["LLEAVES_PRINT_UNOPTIMIZED_IR"] = "1"
os.environ["LLEAVES_PRINT_ASM"] = "1"

model_path = "faulty_model.txt"
llvm_model = lleaves.Model(model_file=model_path)
llvm_model.compile()
lgbm_model = lgb.Booster(model_file=model_path)

data = [[float("NaN")]]
npt.assert_equal(llvm_model.predict(data), lgbm_model.predict(data))
```
lleaves returns 7.0, LightGBM returns 6.5.
This happens because we cast the fp inputs to int using LLVM's fptosi. fptosi yields a poison value for NaN, which just happened to work out correctly on the x86 backend. Instead, the cast to int should be moved into the decision node, with a NaN check performed before the cast.
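A minimal sketch of the intended fix in Python (the helper name is hypothetical, not lleaves's actual codegen, which emits LLVM IR): guard the float-to-int cast with a NaN check so NaN inputs take the tree's missing-value branch instead of going through an undefined cast.

```python
import math


def cast_in_decision_node(x: float, missing_leaf: int) -> int:
    """Sketch: NaN-checked cast inside the decision node.

    If the input is NaN, route to the missing-value branch
    (here represented by a leaf index) instead of casting,
    since fptosi on NaN is poison in LLVM.
    """
    if math.isnan(x):
        return missing_leaf
    return int(x)


# NaN takes the missing-value path; finite values are truncated as before.
print(cast_in_decision_node(float("nan"), 6))  # → 6
print(cast_in_decision_node(3.7, 6))  # → 3
```

The key point is that the check happens per decision node, before the cast, so backends that don't mimic x86's fptosi behavior (e.g. ARM) produce the same result.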
When running the (non-benchmark) test suite, 11 out of 90 tests fail on an ARM M1 MBP, while the x86 CI continues to pass. This seems related to floating-point NaN handling on ARM; I haven't looked closely yet.