NiklasPfister / adaXT

adaXT: tree-based machine learning in Python
https://niklaspfister.github.io/adaXT/
BSD 3-Clause "New" or "Revised" License
7 stars 1 forks source link

Cleanup and fixes #35

Closed WilliamHeuser closed 9 months ago

WilliamHeuser commented 10 months ago

This branch cleans up the code base along with some small improvements.

In general I did see a slight increase in running time when fitting using this branch. Not anything major, but still might be worth addressing.

WilliamHeuser commented 9 months ago

I have tested the runtime of fitting for

  1. The new splitter from this PR
  2. The old splitter from main branch Which yielded these results.

With the new splitter from this PR: classification_with_splitter_from_cleanup

With the old splitter from main: classification_with_splitter_from_main

I would rather use the old splitter as it is faster and simpler, but since both pass all tests I am open to both. What do you think @svbrodersen and @NiklasPfister ?

WilliamHeuser commented 9 months ago

As we talked about at our meeting we had to do some changes to how DecisionTree and DepthTreeBuilder works to ease the implementation of RF. I thought it would be a good idea to get these into this PR as we want to get the project on PyPI now, thus we would not have to change how DecisionTree works after it is published.

The main difference is that arguments go from being in the .fit function of a DecisionTree, to being in the init function of a DecisionTree.

Would you look things through again @svbrodersen and @NiklasPfister to see if you agree with these changes? I only made changes to DecisionTree.pyi, DecisionTree.pyx, DepthTreeBuilder.py, splitter.pxd, splitter.pyx