Closed Ilia-Shutov closed 11 months ago
Pushed Python part, no new tests were added, checked manually only.
Haven't changed code much, because as I understand there is pending PR from @petrovicboban related to Python part soon.
Few comments about current pybind11 C++, what can be simplified:
1) I'm sure that extern "C" {
is not needed - as I understand, we're not building a library that should link to pybind11 build with a different compiler later, they're built together, even C++11 features could be used
2) I have an assumption why there are void*
instead of forestry*
- by default pybind11 tries to guess if the memory visible to Python (returned by a function) should be managed with Python garbage collector - and in case it is void*, it gives up and leave the memory management to C++, not deleting it.
This is fixed by py::return_value_policy::reference
, checked that without it, GC seems to delete the memory and tests crash immediately 100% time.
@Ilia-Shutov I'm very novice in C++, what you can see there is just my best effort. If there is anything that you think you can improve, feel free to submit PR, and if tests are still succeeding, you should go for it. Btw, we don't have unit tests in C++, that's something which should be done eventually.
It seems also that you have domain knowledge, so feel free to explore sklearn compat branch, hopefully you can improve it. I'm devops guy only, working on CI/CD and general python, I don't have any DS domain skill and knowledge.
Nice, if possible, let's add a basic Python test since the core function behavior should not change. Can you create a separate PR for the extern "C"
change? Generally, if the pipeline tests complete, we should be good to merge in the clean-up.
@edwardwliu Please don't merge yet, I've found a missing void -> forestry place which should be fixed or it will fail during serialisation/deserialisation, will push a commit soon
UPD: fixed. Serialization/Deserialization test case is a part of the next PR related to sklearn
@Ilia-Shutov saw the new commit, is this ready to squash and merge?
exportTreeliteJson(C++), export_treelite_json(R) methods are introduced
New R package dependancy required for R tests running: rjson
linear forests are not supported - throwing an exception in that case.
Tested manually by importing and comparing prediction results using Python Treelite library (slight precision differences present)
R manual test script used:
Python manual test script used: