neurodata / scikit-learn

scikit-learn-tree fork: A fork that enables extensions of Python and Cython API for decision trees
https://scikit-learn.org
BSD 3-Clause "New" or "Revised" License
7 stars 6 forks source link

ENH add partial_fit for DecisionTreeClassifier #50

Closed PSSF23 closed 1 year ago

PSSF23 commented 1 year ago

@adam2392 I have revamped the update cython function into build as we discussed in #35. Right now I only modified DepthFirstTreeBuilder as there's no way to control max depth in streaming trees.

github-actions[bot] commented 1 year ago

❌ Linting issues

This PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling pre-commit hooks. Instructions to enable them can be found here.

You can see the details of the linting issues under the lint job here


black

black detected issues. Please run black . locally and push the changes. Here you can see the detected issues. Note that running black might also fix some of the issues which might be detected by ruff. Note that the installed black version is black=23.3.0.

``` --- /home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_classes.py 2023-08-11 14:57:30.408212 +0000 +++ /home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_classes.py 2023-08-11 14:58:02.573962 +0000 @@ -561,11 +561,13 @@ max_depth, max_leaf_nodes, self.min_impurity_decrease, self.store_leaf_values, ) - self.builder_.build(self.tree_, X, y, sample_weight, missing_values_in_feature_mask) + self.builder_.build( + self.tree_, X, y, sample_weight, missing_values_in_feature_mask + ) if self.n_outputs_ == 1 and is_classifier(self): self.n_classes_ = self.n_classes_[0] self.classes_ = self.classes_[0] would reformat /home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_classes.py Oh no! 💥 💔 💥 1 file would be reformatted, 908 files would be left unchanged. ```

ruff

ruff detected issues. Please run ruff --fix --show-source . locally, fix the remaining issues, and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.0.284.

``` sklearn/tree/_classes.py:566:89: E501 Line too long (92 > 88 characters) | 564 | self.store_leaf_values, 565 | ) 566 | self.builder_.build(self.tree_, X, y, sample_weight, missing_values_in_feature_mask) | ^^^^ E501 567 | 568 | if self.n_outputs_ == 1 and is_classifier(self): | Found 1 error. ```

Generated for commit: fcc0758. Link to the linter CI: here

PSSF23 commented 1 year ago

Thanks for the notes! I'm still cleaning up the merging conflicts and will get back to your reviews once I have the code running again.

adam2392 commented 1 year ago

Kay I cleaned the diff by merging in changes from sklearn:main to submodulev2 and then to this branch.

Do you have unit-tests you can port from your old PR? The CIs can help check these whenever we get new stuff that way.