thesps / conifer

Fast inference of Boosted Decision Trees in FPGAs
Apache License 2.0

Manually unrolled HLS #41

Closed thesps closed 1 year ago

thesps commented 1 year ago

This PR is inspired by #30.

In the main branch, the loop over ensemble trees in the HLS is like this (simplified):

for(int i = 0; i < n_trees; i++){
    tree_scores[i] = trees[i].decision_function(x);
}
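(For context, unrolling of a loop like this is normally requested with an HLS unroll directive, roughly as in the sketch below; this is illustrative of the technique rather than the exact source in the repository.)

for(int i = 0; i < n_trees; i++){
    #pragma HLS unroll
    tree_scores[i] = trees[i].decision_function(x);
}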

Vivado/Vitis HLS seems to have a hard time unrolling this loop, for an unknown reason. In this PR we manually unroll this loop in the backend writer instead. So for a model with 5 trees this looks like (simplified):

  scores[0] = tree_0.decision_function(x);
  scores[1] = tree_1.decision_function(x);
  scores[2] = tree_2.decision_function(x);
  scores[3] = tree_3.decision_function(x);
  scores[4] = tree_4.decision_function(x);

Now we also need to create instances of each tree (tree_0, tree_1, etc.) in parameters.h instead of an array of them, and create one model-specific function containing the unrolled inference code.
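To make the structure concrete, here is a minimal, self-contained sketch of the unrolled pattern; the Tree struct, the 4-feature input and all names here are placeholders of mine, not conifer's actual generated code.

#include <array>

// Stand-in for conifer's tree class: the real trees hold thresholds, feature
// indices, child indices and leaf values read from parameters.h.
struct Tree {
    float decision_function(const std::array<float, 4>& x) const {
        return x[0] > 0.5f ? 1.0f : -1.0f;  // dummy decision
    }
};

// parameters.h now declares one instance per tree instead of an array.
static const Tree tree_0{}, tree_1{}, tree_2{}, tree_3{}, tree_4{};

// Model-specific function containing the unrolled inference code.
float decision_function_unrolled(const std::array<float, 4>& x){
    float scores[5];
    scores[0] = tree_0.decision_function(x);
    scores[1] = tree_1.decision_function(x);
    scores[2] = tree_2.decision_function(x);
    scores[3] = tree_3.decision_function(x);
    scores[4] = tree_4.decision_function(x);
    float sum = 0;
    for(int i = 0; i < 5; i++) sum += scores[i];
    return sum;
}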

The impact is:

For large models (e.g. the scan referenced below, with 100 trees of depth 8), the master branch may fail to synthesize the model at all, while this PR can. The old implementation with a for loop in the C++ is still available; a new configuration parameter 'Unroll' selects between the two implementations (defaults to True). A usage sketch follows below.
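The usage sketch: the 'Unroll' key is the new parameter from this PR, the auto_config and converter calls follow the conifer and XGBoost APIs as I understand them, and the tiny training step is only there to make the snippet self-contained.

import numpy as np
import xgboost as xgb
import conifer

# train a small throwaway model so the snippet runs end to end
X = np.random.rand(100, 4)
y = (X[:, 0] > 0.5).astype(int)
bst = xgb.train({'max_depth': 3, 'objective': 'binary:logistic'},
                xgb.DMatrix(X, label=y), num_boost_round=5)

cfg = conifer.backends.xilinxhls.auto_config()
cfg['Unroll'] = True   # manually unrolled implementation (the default)
# cfg['Unroll'] = False  # original C++ for-loop implementation

model = conifer.converters.convert_from_xgboost(bst, cfg)
model.compile()  # or model.build() to run HLS synthesis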

For reference, here are some plots from a scan of randomized trees (blue markers are master branch, red markers are this PR):

[Four screenshot images of the scan plots, dated 2023-03-30]
brightneedle commented 7 months ago

Is there a particular version of XGBoost/Scikit-Learn used here?

Trying out the example pruned_xgboost_to_hls.py:

With XGBoost 2.0.0 I get AttributeError: 'list' object has no attribute 'get'. With XGBoost 1.7.6 this changes to IndexError: list assignment index out of range at parents[j] = i in addParentAndDepth.

In either case I have not been able to get pruned trees to compile.

thesps commented 7 months ago

This is likely an issue with the XGBoost converter rather than with the pruned-model support. I can confirm, though, that this was tested with XGBoost 1.7.5. I just tried with 2.0.0 and I see the same issue as you in the XGBoost converter tests. I'm not sure about the issue you see with 1.7.6; we can look into that. It might be a corner case specific to that model.

I'll open a new issue about updating the XGBoost converter again for 2.0.0

brightneedle commented 7 months ago

Thanks @thesps! In the meantime, can you confirm that the XGBoost converter works for pruned trees with 1.7.5? I still get IndexError: list assignment index out of range at parents[j] = i with pruned_xgboost_to_hls.py.

For completeness, I use scikit-learn==1.3.0 and conifer==1.2.

thesps commented 7 months ago

Thanks for the info. I checked that example as well and replicated your error. I haven't worked out the fix yet, but I can give some insight into the issue: since the error comes from the XilinxHLS backend, I converted the model with no configuration (i.e. the Python backend, which is available in the master branch but not yet in a release; it will be in v1.3). Then I checked whether any trees have child indices out of range of the number of nodes:

import numpy as np
import conifer
model = conifer.converters.convert_from_xgboost(bst)
# check whether any tree has child indices larger than its number of nodes
np.any([(np.any(np.array(tree.children_left) > tree.n_nodes()), np.any(np.array(tree.children_right) > tree.n_nodes())) for trees in model.trees for tree in trees])
# output: True

So somehow the xgboost converter has read those out-of-range values from the model. This is where those child indices are set: https://github.com/thesps/conifer/blob/master/conifer/converters/xgboost.py#L51-L52
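For anyone who wants to cross-check outside conifer, here is a sketch of inspecting the raw booster dump for the same out-of-range node ids; bst is the booster from the example, get_dump with dump_format='json' is the standard XGBoost API, and the checking logic is my own.

import json

for i, tree_json in enumerate(bst.get_dump(dump_format='json')):
    node_ids = []
    def collect(node):
        node_ids.append(node['nodeid'])
        for child in node.get('children', []):
            collect(child)
    collect(json.loads(tree_json))
    # if node ids are non-contiguous, the largest id exceeds the node count and
    # indexing arrays by node id goes out of range
    if max(node_ids) >= len(node_ids):
        print(f'tree {i}: max node id {max(node_ids)} >= {len(node_ids)} nodes')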

I will keep investigating!