JuliaAI / DecisionTree.jl

Julia implementation of Decision Tree (CART) and Random Forest algorithms
Other
356 stars 102 forks source link

Add implementation of AbstractTrees-interface #158

Closed roland-KA closed 2 years ago

roland-KA commented 2 years ago
ablaom commented 2 years ago

@roland-KA My guess about the fail is that it is caused by the existence of a Manifest.toml file in the repo, something that has previously been overlooked. (Manifests generated by different julia versions are not necessarily compatible.) This file should be removed, so that it is generated from scratch in CI every time.

You will want to add Manifest.toml to the .gitignore, so that it is not accidentally added to the git index again.

ablaom commented 2 years ago

Looks like you still have the Manifest.toml in your branch: https://github.com/roland-KA/DecisionTree.jl/tree/abstract-tree

codecov-commenter commented 2 years ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (dev@17cb46d). Click here to learn what that means. The diff coverage is n/a.

:exclamation: Current head eabad6a differs from pull request most recent head d511ed3. Consider uploading reports for the commit d511ed3 to get more accurate results

@@          Coverage Diff           @@
##             dev     #158   +/-   ##
======================================
  Coverage       ?   89.51%           
======================================
  Files          ?       10           
  Lines          ?      992           
  Branches       ?        0           
======================================
  Hits           ?      888           
  Misses         ?      104           
  Partials       ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 17cb46d...d511ed3. Read the comment docs.

roland-KA commented 2 years ago

I forgot to remove and untracked Manifest.toml explicitly. Only adding it to .gitignore is not enough. That should be fine now.

The remaining CI error on "Julia 1.0 - windows-latest" doesn't seem to be related to our recent changes. It occurs during the iris.jl-test:

iris.jl: Test Failed at D:\a\DecisionTree.jl\DecisionTree.jl\test\classification\iris.jl:101
[474](https://github.com/JuliaAI/DecisionTree.jl/runs/6367331199?check_suite_focus=true#step:6:474)
  Expression: mean(accuracy) > 0.9
[475](https://github.com/JuliaAI/DecisionTree.jl/runs/6367331199?check_suite_focus=true#step:6:475)
   Evaluated: 0.9 > 0.9
ablaom commented 2 years ago

Yes, the fail looks unrelated.

I have a suggestion regarding the output of printnode. Currently we have:

firstFt --> 0.7
├─ a (2/3)
└─ secondFt --> 0.5
   ├─ b (2/3)
   └─ c (2/3)

I find the "-->" a bit cryptic/uninformative. Can I suggest either:

firstFt <  0.7 ?
├─ a (2/3)
└─ secondFt < 0.5 ?
   ├─ b (2/3)
   └─ c (2/3)

or maybe,

firstFt: Threshold 0.7
├─ a (2/3)
└─ secondFt: Threshold 0.5
   ├─ b (2/3)
   └─ c (2/3)

with a preference for the first.

roland-KA commented 2 years ago

Good point! 😊

I've tried several variations during implementation, among them '<'. But as I didn't find any documentation stating clearly that '<' and not '<=' is correct, I abandoned the idea. With the new doc-string you've added to the built-in print_tree function, things are much clearer now. So I will change it accordingly. 👍

roland-KA commented 2 years ago

... and the CI error on "Julia 1.0 - windows-latest" didn't occur this time. May be some floating-point precision problem?

ablaom commented 2 years ago

@roland-KA I think we're almost there. My only remaining objection is the explicit references to MLJ. This package is completely independent of MLJ (and MLJ is not the only ML framework with a DecisionTree.jl interface). So I don't think it's appropriate. Anything MLJ-specific can be added to the model doc-strings at MLJDecisionTree.jl .

The intermittently failing test is unrelated, and too minor to disrupt this PR. If it recurs in aJulia 1.6 test, we can investigate.

roland-KA commented 2 years ago

@roland-KA I think we're almost there. My only remaining objection is the explicit references to MLJ. This package is completely independent of MLJ (and MLJ is not the only ML framework with a DecisionTree.jl interface). So I don't think it's appropriate. Anything MLJ-specific can be added to the model doc-strings at MLJDecisionTree.jl .

Ok, I understand your issue. I've deleted the direct reference to MLJ and left only the explanation of the general idea.

ablaom commented 2 years ago

@roland-KA I've belatedly noticed that @bensadeghi (the pkg author) has requested we add documentation in the README.md. I think this can be a one-liner just referencing your excellent doc-strings.

I promise this is the very last item 😳 . Thanks for your patience.

roland-KA commented 2 years ago

😀 ... no problem, here it is.

And yes, it would be good, if we could close the PR now, because I will be on vacation for the next two weeks 🚅 ☀️😊