Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
I get an error when I run predictions under valgrind. According to the valgrind log, the error originates somewhere in OpenMP. When I build xgboost without OpenMP (-DUSE_OPENMP=0), the error is gone. I know there are known false positives when running OpenMP under valgrind (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36298), but I want to be sure these errors are caused by OpenMP and not by xgboost.
xgboost_valgrind_example.zip
Without OpenMP:
==7== Memcheck, a memory error detector
==7== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==7== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==7== Command: ./xgboost_valgrind
==7==
[08:25:42] WARNING: /xgboost/src/learner.cc:749: Found JSON model saved before XGBoost 1.6, please save the model using current version again. The support for old JSON model will be discontinued in XGBoost 2.3.
==7==
==7== HEAP SUMMARY:
==7== in use at exit: 0 bytes in 0 blocks
==7== total heap usage: 262,933 allocs, 262,933 frees, 45,755,755 bytes allocated
==7==
==7== All heap blocks were freed -- no leaks are possible
==7==
==7== For lists of detected and suppressed errors, rerun with: -s
==7== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
With OpenMP:
==7== Memcheck, a memory error detector
==7== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==7== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==7== Command: ./xgboost_valgrind
==7==
[08:23:52] WARNING: /xgboost/src/learner.cc:749: Found JSON model saved before XGBoost 1.6, please save the model using current version again. The support for old JSON model will be discontinued in XGBoost 2.3.
==7==
==7== HEAP SUMMARY:
==7== in use at exit: 7,856 bytes in 15 blocks
==7== total heap usage: 262,957 allocs, 262,942 frees, 45,825,843 bytes allocated
==7==
==7== 3,520 bytes in 11 blocks are possibly lost in loss record 4 of 5
==7== at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==7== by 0x40147D9: calloc (rtld-malloc.h:44)
==7== by 0x40147D9: allocate_dtv (dl-tls.c:375)
==7== by 0x40147D9: _dl_allocate_tls (dl-tls.c:634)
==7== by 0x514A834: allocate_stack (allocatestack.c:430)
==7== by 0x514A834: pthread_create@@GLIBC_2.34 (pthread_create.c:647)
==7== by 0x52FD1EF: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==7== by 0x52F3A10: GOMP_parallel (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==7== by 0x4AE2B70: xgboost::gbm::GBTreeModel::LoadModel(xgboost::Json const&) (in /usr/local/lib/libxgboost.so)
==7== by 0x4AB9DB1: xgboost::gbm::GBTree::LoadModel(xgboost::Json const&) (in /usr/local/lib/libxgboost.so)
==7== by 0x4B02C57: xgboost::LearnerIO::LoadModel(xgboost::Json const&) (in /usr/local/lib/libxgboost.so)
==7== by 0x4B0BB2F: xgboost::LearnerIO::LoadModel(dmlc::Stream*) (in /usr/local/lib/libxgboost.so)
==7== by 0x10B581: main (main.cpp:13)
==7==
==7== LEAK SUMMARY:
==7== definitely lost: 0 bytes in 0 blocks
==7== indirectly lost: 0 bytes in 0 blocks
==7== possibly lost: 3,520 bytes in 11 blocks
==7== still reachable: 4,336 bytes in 4 blocks
==7== suppressed: 0 bytes in 0 blocks
==7== Reachable blocks (those to which a pointer was found) are not shown.
==7== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==7==
==7== For lists of detected and suppressed errors, rerun with: -s
==7== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
I made a small example in Docker:
docker build . -t valgrind_example
docker run -it --rm valgrind_example
Versions: xgboost 1.6.1, model 1.1.1, Ubuntu 22.04