Closed by antoinecarme 2 years ago
We use a random forest (512 trees) on the iris dataset.
The model's C++ code is stored in /tmp/sklearn2sql_cpp_140409960179088_model_specific.i
With Boost.Python, this code can be wrapped as a Python module. The wrapper is stored as /tmp/sklearn2sql_cpp_140409960179088.cpp:
#include "Generic.i"
#include "/tmp/sklearn2sql_cpp_140409960179088_model_specific.i"
#include <boost/python.hpp>
using namespace boost::python;
BOOST_PYTHON_MODULE(sklearn2sql_cpp_140409960179088) {
def("score_csv_file", score_csv_file);
}
We compile this into a shared library, /tmp/sklearn2sql_cpp_140409960179088.so, which can be loaded as a Python module:
g++ -I/usr/include/python3.10 -Wno-unused-function -fPIC -std=c++17 -g -o /tmp/sklearn2sql_cpp_140409960179088.o -c /tmp/sklearn2sql_cpp_140409960179088.cpp
g++ /tmp/sklearn2sql_cpp_140409960179088.o -shared -Wl,--export-dynamic -lboost_python310 -L/usr/lib/python3.10/config -lpython3.10 -o /tmp/sklearn2sql_cpp_140409960179088.so
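The two g++ invocations above differ only in the module name and the Python version, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of ml2cpp: the function name `build_commands` and its parameters are hypothetical, and the include and library paths are simply the ones used above, which may differ on other systems.

```python
import shlex


def build_commands(module, py_version="3.10", workdir="/tmp"):
    """Return (compile_cmd, link_cmd) as argument lists for the wrapper.

    Hypothetical helper: flags and paths are copied from the commands
    shown in this issue and are assumptions about the local install.
    """
    base = f"{workdir}/{module}"
    tag = py_version.replace(".", "")  # "3.10" -> "310", as in -lboost_python310
    compile_cmd = [
        "g++", f"-I/usr/include/python{py_version}",
        "-Wno-unused-function", "-fPIC", "-std=c++17", "-g",
        "-o", f"{base}.o", "-c", f"{base}.cpp",
    ]
    link_cmd = [
        "g++", f"{base}.o", "-shared", "-Wl,--export-dynamic",
        f"-lboost_python{tag}", f"-L/usr/lib/python{py_version}/config",
        f"-lpython{py_version}", "-o", f"{base}.so",
    ]
    return compile_cmd, link_cmd


# Print the commands; pass each list to subprocess.run(cmd, check=True)
# to actually build.
for cmd in build_commands("sklearn2sql_cpp_140409960179088"):
    print(shlex.join(cmd))
```

Returning argument lists instead of a shell string avoids quoting problems if the working directory ever contains spaces.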
Sample Python deployment code:
The following Python code can be used to score a given CSV file:
import sys
sys.path.append('/tmp')  # directory containing the compiled .so
import sklearn2sql_cpp_140409960179088 as mymodel
result = mymodel.score_csv_file("/tmp/iris.csv")  # returns a Python string
print(result)
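Because score_csv_file returns a single string, the caller still has to split it into fields. Here is a small sketch that does this with the standard library only, in keeping with the goal of not needing numpy. The `result` string below is a stand-in (the header and first two rows of the output reported in this issue), and the assumption is that the returned string has exactly this comma-separated layout.

```python
import csv
import io

# Stand-in for the string returned by mymodel.score_csv_file(...).
result = (
    "idx,Score_0,Score_1,Score_2,Proba_0,Proba_1,Proba_2,"
    "LogProba_0,LogProba_1,LogProba_2,Decision,DecisionProba\n"
    "0,,,,1.00000000000000,0.00000000000000,0.00000000000000,"
    "0.00000000000000,-32.23619130191664,-32.23619130191664,0,1.00000000000000\n"
    "1,,,,0.99804687500000,0.00195312500000,0.00000000000000,"
    "-0.00195503483580,-6.23832462503951,-32.23619130191664,0,0.99804687500000\n"
)

# DictReader maps each row to {column name: text}; empty fields
# (the Score_* columns here) come back as empty strings.
rows = list(csv.DictReader(io.StringIO(result)))

parsed = []
for row in rows:
    probas = [float(row[f"Proba_{k}"]) for k in range(3)]
    parsed.append((int(row["idx"]), int(row["Decision"]), probas))

for idx, decision, probas in parsed:
    print(idx, decision, probas)
```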
Sample output:
idx,Score_0,Score_1,Score_2,Proba_0,Proba_1,Proba_2,LogProba_0,LogProba_1,LogProba_2,Decision,DecisionProba
0,,,,1.00000000000000,0.00000000000000,0.00000000000000,0.00000000000000,-32.23619130191664,-32.23619130191664,0,1.00000000000000
1,,,,0.99804687500000,0.00195312500000,0.00000000000000,-0.00195503483580,-6.23832462503951,-32.23619130191664,0,0.99804687500000
2,,,,1.00000000000000,0.00000000000000,0.00000000000000,0.00000000000000,-32.23619130191664,-32.23619130191664,0,1.00000000000000
3,,,,1.00000000000000,0.00000000000000,0.00000000000000,0.00000000000000,-32.23619130191664,-32.23619130191664,0,1.00000000000000
4,,,,1.00000000000000,0.00000000000000,0.00000000000000,0.00000000000000,-32.23619130191664,-32.23619130191664,0,1.00000000000000
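The numbers in this output can be cross-checked by hand. The probability 0.001953125 is exactly 1/512, which is consistent with a single tree out of 512 voting for class 1, and the LogProba value for a zero probability matches log(1e-14). The sketch below assumes, as an inference from these values rather than something stated in the generated code, that probabilities are floored at 1e-14 before taking the logarithm. The names `EPS` and `log_proba` are hypothetical.

```python
import math

N_TREES = 512
EPS = 1e-14  # assumed floor, inferred from the -32.236... values above


def log_proba(p, eps=EPS):
    """Natural log of p, with p floored at eps so that log(0) is finite."""
    return math.log(max(p, eps))


# (Proba, LogProba) pairs copied from the sample output.
observed = [
    (1.0, 0.0),
    (0.998046875, -0.00195503483580),
    (0.001953125, -6.23832462503951),
    (0.0, -32.23619130191664),
]

for p, logged in observed:
    ok = math.isclose(log_proba(p), logged, abs_tol=1e-12)
    print(f"p={p!r} reported={logged!r} recomputed={log_proba(p)!r} match={ok}")

# One dissenting tree out of 512 gives exactly the probabilities in row 1.
print(1 / N_TREES == 0.001953125, (N_TREES - 1) / N_TREES == 0.998046875)
```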
Boost.Python can be used to turn the C++ model code into an importable Python module.
https://www.boost.org/doc/libs/1_63_0/libs/python/doc/html/index.html
Follow the six steps described in https://github.com/antoinecarme/ml2cpp/issues/1
This method can be used to deploy a model without numpy, scikit-learn, R, or keras installed (see #25 for MicroPython).