thesps / conifer

Fast inference of Boosted Decision Trees in FPGAs
Apache License 2.0

RF results #63

Open kdruart29 opened 5 months ago

kdruart29 commented 5 months ago

Hello there,

I am currently experimenting with deploying RandomForests from sklearn on a ZCU102 board. I first tried the classic HLS/Vivado/Vitis flow but was struggling with the results. I then tried using pynq + the HLS accelerator and my results are still weird.

For the example I am using the basic wine dataset from sklearn, with an RF (100 trees with a max depth of 100). With sklearn I obtain these predictions (using clf.predict_proba), which are fine:

[0.97 0.03 0.  ]
[0.93 0.05 0.02]
[0.06 0.12 0.82]
[0.91 0.08 0.01]
[0.07 0.85 0.08]

Then, with the model converted and compiled, I obtain this (using model.decision_function):

[ 8.59375000e-01  6.23525391e+01  2.60214844e+01]
[ 7.51953125e-01 -3.56474609e+01  2.61230469e+01]
[ 1.75781250e-01  8.43525391e+01  2.62246094e+01]
[ 7.03125000e-01 -8.66474609e+01  2.63261719e+01]
[ 2.83203125e-01 -9.96474609e+01  2.64277344e+01]

These results are strange and I don't understand them. What could explain them?

Finally, on the PL, here are the results provided by accelerator.decision_function(np.float32(X_test)):

[0.859375   0. 0.]
[0.7519531  0. 0.]
[0.17578125 0. 0.]
[0.703125   0. 0.]
[0.28320312 0. 0.]

The first column matches the first column of the previous results given by the converted model; the other two columns are zero.
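Because predict_proba returns probabilities while decision_function returns scores on a different scale, one way to compare the two outputs quoted above is by predicted class rather than by raw value. The following is a minimal sketch using only the numbers from this post; `argmax` and `class_agreement` are hypothetical helper names, not part of sklearn or conifer.

```python
def argmax(row):
    """Index of the largest entry in a row of class scores."""
    return max(range(len(row)), key=lambda k: row[k])

def class_agreement(scores_a, scores_b):
    """Fraction of samples for which both outputs pick the same class."""
    matches = sum(argmax(a) == argmax(b) for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)

# Values copied from the post: sklearn predict_proba
y_skl = [
    [0.97, 0.03, 0.00],
    [0.93, 0.05, 0.02],
    [0.06, 0.12, 0.82],
    [0.91, 0.08, 0.01],
    [0.07, 0.85, 0.08],
]

# Values copied from the post: converted model decision_function
y_hls = [
    [8.59375000e-01, 6.23525391e+01, 2.60214844e+01],
    [7.51953125e-01, -3.56474609e+01, 2.61230469e+01],
    [1.75781250e-01, 8.43525391e+01, 2.62246094e+01],
    [7.03125000e-01, -8.66474609e+01, 2.63261719e+01],
    [2.83203125e-01, -9.96474609e+01, 2.64277344e+01],
]

skl_classes = [argmax(row) for row in y_skl]
hls_classes = [argmax(row) for row in y_hls]
agreement = class_agreement(y_skl, y_hls)
print(skl_classes, hls_classes, agreement)
```

On these five samples the predicted classes do not agree at all, which suggests the mismatch is more than a difference of scale or rounding.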

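For the conversion I used the examples:

import datetime
import conifer
from sklearn.ensemble import RandomForestClassifier

# Train the forest (y_train is assumed to hold the training labels)
clf = RandomForestClassifier(n_estimators=100, max_depth=100)
clf.fit(X_train, y_train)

# Configure the Xilinx HLS backend with the ZCU102 accelerator
cfg = conifer.backends.xilinxhls.auto_config()
accelerator_config = {'Board': 'zcu102', 'InterfaceType': 'float'}
cfg['AcceleratorConfig'] = accelerator_config
cfg['OutputDir'] = 'prj{}'.format(int(datetime.datetime.now().timestamp()))

# Convert and compile the model for emulation
model = conifer.converters.convert_from_sklearn(clf, cfg)
model.compile()

y_hls = model.decision_function(X_test)
y_skl = clf.predict_proba(X_test)

# Build the bitfile and package for the board
model.build(bitfile=True, package=True)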

What am I doing wrong? Thank you in advance.

thesps commented 4 months ago

Hi, thanks for reaching out.

I think there are a few things going on, but it seems to me that the Random Forest conversion is not working correctly, at least for multi-class problems. I tried working with the same wine dataset and see similar nonsense results to yours, and I can see 'missing' trees in the converted model firmware under firmware/parameters.h (missing tree indices). For a binary classification example the results looked more compatible between sklearn and the conifer HLS.

A smaller effect, which would eventually need to be taken into account for this dataset, is the choice of data types. The defaults probably don't work well for the features in this case. In general this is dataset dependent, but for the wine example a better configuration might be:

# Create a conifer config
cfg = conifer.backends.xilinxhls.auto_config(granularity='full')
cfg['InputPrecision'] = 'ap_fixed<18,16>'
cfg['ThresholdPrecision'] = 'ap_fixed<18,16>'
cfg['ScorePrecision'] = 'ap_fixed<18,8,AP_RND_CONV,AP_SAT>'
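To illustrate why the number of integer bits matters, here is a rough pure-Python emulation of `ap_fixed<W,I>` quantization. It is a sketch under stated assumptions: it models only truncation (AP_TRN) with either wrap-around (AP_WRAP) or saturation (AP_SAT), not the AP_RND_CONV rounding used for the score type above, and `quantize_ap_fixed` is a hypothetical helper, not a conifer or Vitis HLS function. The feature value 1065.0 is an arbitrary illustrative number of the magnitude some wine features reach.

```python
import math

def quantize_ap_fixed(value, width, int_bits, saturate=False):
    """Emulate ap_fixed<width, int_bits> with truncation (AP_TRN).

    Overflow wraps around (AP_WRAP) unless saturate=True (AP_SAT).
    """
    frac_bits = width - int_bits
    scale = 1 << frac_bits
    raw = math.floor(value * scale)  # truncate toward minus infinity
    lo = -(1 << (width - 1))         # most negative raw integer
    hi = (1 << (width - 1)) - 1      # most positive raw integer
    if saturate:
        raw = max(lo, min(hi, raw))
    else:
        raw = (raw - lo) % (1 << width) + lo  # two's-complement wrap
    return raw / scale

# 16 integer bits: a large feature value survives, but only 2 fractional bits remain
wide = quantize_ap_fixed(1065.0, 18, 16)
coarse = quantize_ap_fixed(0.97, 18, 16)

# 8 integer bits: the same feature value overflows
wrapped = quantize_ap_fixed(1065.0, 18, 8)
clipped = quantize_ap_fixed(1065.0, 18, 8, saturate=True)
print(wide, coarse, wrapped, clipped)
```

The trade-off is visible in both directions: more integer bits keep large features intact but leave a step size of 0.25, while fewer integer bits keep fine resolution but wrap or clip large values.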

Besides your issue, it seems that you used the accelerator support and ran on a device. Since this is quite a new feature, I'm also looking for feedback on that part of the workflow. Was it easy enough to make the bitfile and run it on the board?

kdruart29 commented 4 months ago

Hi!

Actually the conversion is doing fine; the trees are correctly saved in the parameters.h file. The issue is with how RF and BDT are implemented in sklearn. In sklearn, a BDT is stored as a separate subtree for each class in each estimator, whereas an RF uses a single tree per estimator, so BDT_rolled.cpp can't compute the other classes because it expects a subtree for each class. I solved this by modifying the way the value field is converted and adapting the BDT header and cpp file to accept the multiclass RF. The issue is that it's not compatible with BDT now, just RF for my case. I plan on committing my code when it is fully compatible.
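The structural difference described above can be sketched with toy trees in plain Python. This is an illustration only, not conifer's internal format or the actual fix: each "tree" is a single-split stump written as a hypothetical tuple (feature index, threshold, left leaf, right leaf). In the boosted layout every stage has one scalar-leaf tree per class and the scores are summed; in the forest layout every estimator is one tree whose leaves hold a vector with one entry per class, and the vectors are averaged.

```python
def eval_stump(stump, x):
    """Return the leaf reached by sample x in a one-split toy tree."""
    feature, threshold, left, right = stump
    return left if x[feature] <= threshold else right

def bdt_scores(stages, x):
    """Boosted layout: stages[i][k] is a scalar-leaf tree for class k."""
    scores = [0.0] * len(stages[0])
    for stage in stages:
        for k, stump in enumerate(stage):
            scores[k] += eval_stump(stump, x)
    return scores

def rf_scores(trees, x):
    """Forest layout: one tree per estimator, each leaf is a per-class vector."""
    n_classes = len(trees[0][2])
    totals = [0.0] * n_classes
    for stump in trees:
        leaf = eval_stump(stump, x)
        for k in range(n_classes):
            totals[k] += leaf[k]
    return [t / len(trees) for t in totals]

x = [0.5, 3.0]

# Two boosting stages, three classes: six scalar-leaf trees in total
stages = [
    [(0, 1.0, 0.5, -0.5), (0, 1.0, -0.25, 0.25), (1, 2.0, 0.0, 1.0)],
    [(1, 2.0, 0.25, -0.25), (0, 1.0, -0.5, 0.5), (0, 1.0, 0.5, -1.0)],
]

# Two forest estimators, three classes: two vector-leaf trees in total
trees = [
    (0, 1.0, [1.0, 0.0, 0.0], [0.0, 0.5, 0.5]),
    (1, 2.0, [0.0, 1.0, 0.0], [0.5, 0.0, 0.5]),
]

boosted = bdt_scores(stages, x)
forest = rf_scores(trees, x)
print(boosted, forest)
```

Code written for the first layout looks for a separate tree per class, so feeding it a forest whose trees carry all classes in one leaf vector needs either a different evaluation loop or a conversion step that splits each leaf vector into per-class trees.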

The accelerator workflow is surprisingly easy and it works very well. The only difficulty was finding a compatible PYNQ image for my ZCU102.