guitargeek / XGBoost-FastForest

Minimal library code to deploy XGBoost models in C++.
MIT License

GCC version requirements? #10

Closed FzuGsr closed 4 years ago

FzuGsr commented 4 years ago

Hello, when I run a regression model built with gcc version 4.9.2 (Debian 4.9.2-10+deb8u2), the results are inconsistent. With gcc version 8.3.0 (Debian 8.3.0-6), however, the results are correct compared with Python.

How do I run fastforest with gcc 4.9.2? Thank you!

guitargeek commented 4 years ago

Hi! I got to ask for clarification here. So if you compile fastforest with gcc 8.3.0 you get the correct results comparing with Python, but when you compile with gcc 4.9.2 you don't?

And when you say the results are inconsistent: to what degree? Numerical inconsistencies that can be attributed to different floating-point formats, or significant differences where the fastforest output is completely off?

Thanks for using the library anyway!

FzuGsr commented 4 years ago

> Hi! I got to ask for clarification here. So if you compile fastforest with gcc 8.3.0 you get the correct results comparing with Python, but when you compile with gcc 4.9.2 you don't?
>
> And when you say the results are inconsistent: to what degree? Numerical inconsistencies that can be attributed to different floating-point formats, or significant differences where the fastforest output is completely off?
>
> Thanks for using the library anyway!

Thank you for your reply.

  1. Yes, the build with gcc 4.9.2 gives the wrong result.
  2. Example: Python result: array([0.30929977], dtype=float32); gcc 8.3.0 result: 0.3093; gcc 4.9.2 result: 0.383588

FzuGsr commented 4 years ago

Also, my model.txt is 1.4 GB and has 440 features.
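
To put the numbers reported above in perspective, here is a minimal sketch of a relative-difference check. The helper names `relativeDifference` and `predictionsAgree` and the default tolerance of `1e-3` are assumptions made for illustration; they are not part of FastForest or XGBoost.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of FastForest): relative difference between
// a reference prediction and a candidate prediction.
inline double relativeDifference(double reference, double candidate) {
    const double scale = std::max(std::abs(reference), std::abs(candidate));
    if (scale == 0.0) {
        return 0.0;  // both values are exactly zero
    }
    return std::abs(reference - candidate) / scale;
}

// Returns true if the two predictions agree within a relative tolerance.
// The default of 1e-3 is an illustrative assumption, loose enough to
// accept a value that was printed with only four digits.
inline bool predictionsAgree(double reference, double candidate,
                             double tolerance = 1e-3) {
    assert(tolerance >= 0.0);
    return relativeDifference(reference, candidate) <= tolerance;
}
```

With these helpers, `predictionsAgree(0.30929977, 0.3093)` holds, since the two values differ only by print rounding, while `predictionsAgree(0.30929977, 0.383588)` does not: a relative difference of roughly 19% points to a logic error rather than floating-point noise.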

guitargeek commented 4 years ago

Hi! The bug with gcc 4.9 is now fixed. For some reason, filling some unordered_maps does not work well with the old compiler, but I didn't really bother finding the exact reason. There is now a workaround implemented.

FzuGsr commented 4 years ago

> Hi! The bug with gcc 4.9 is now fixed. For some reason, filling some unordered_maps does not work well with the old compiler, but I didn't really bother finding the exact reason. There is now a workaround implemented.

Thank you! I tried it and the result is correct now.

cherish0 commented 4 years ago

> Hi! The bug with gcc 4.9 is now fixed. For some reason, filling some unordered_maps does not work well with the old compiler, but I didn't really bother finding the exact reason. There is now a workaround implemented.

It is not a compiler bug. Your code relies on an evaluation order that the C++ standard leaves unspecified before C++17.

std::unordered_map<int, int> m;
m[1] = m.size();
// m[1] == 1 with older GCC versions (size() is read after the insertion)
// m[1] == 0 with newer GCC versions (size() is read before the insertion)

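You should not change the size of a C++ container and read its size in the same statement.

Before C++17, the order in which the two sides of the assignment are evaluated is unspecified, so the compiler may call size() either before or after operator[] inserts the new element.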

The simplest way to fix this problem:

// Before: leafIndices[index] = leafIndices.size() + nPreviousLeaves;
// Read the size first, so the insertion done by operator[] cannot affect it.
auto size = leafIndices.size();
leafIndices[index] = size + nPreviousLeaves;
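
The fix above can be sketched as a self-contained function. This is only an illustration under stated assumptions: the function name `assignLeafIndices`, its signature, and the rule of skipping keys that are already present are hypothetical and are not taken from the FastForest source.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper, loosely modelled on the snippet above: maps each
// distinct node ID to a running index, offset by nPreviousLeaves.
inline std::unordered_map<int, std::size_t> assignLeafIndices(
    const std::vector<int>& nodeIds, std::size_t nPreviousLeaves) {
    std::unordered_map<int, std::size_t> leafIndices;
    for (int id : nodeIds) {
        if (leafIndices.count(id) != 0) {
            continue;  // this ID already has an index
        }
        // Read the size in its own statement, before operator[] inserts
        // the new key, so the result does not depend on the order in
        // which the compiler evaluates the two sides of the assignment.
        const std::size_t size = leafIndices.size();
        leafIndices[id] = size + nPreviousLeaves;
    }
    return leafIndices;
}
```

For example, `assignLeafIndices({7, 3, 7, 9}, 10)` maps 7 to 10, 3 to 11 and 9 to 12, whichever compiler version is used.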

guitargeek commented 4 years ago

Excellent, thank you @cherish0!

I implemented it in the latest commit: https://github.com/guitargeek/XGBoost-FastForest/commit/a023c52e169694fa4f56b2e2f6e50ce06d4d9894