JuliaAI / DecisionTree.jl

Julia implementation of Decision Tree (CART) and Random Forest algorithms
Other
356 stars 102 forks source link

document use_multithreading keyword #208

Closed rafaqz closed 1 year ago

rafaqz commented 1 year ago

The keyword is currently undocumented. Closes #134

ablaom commented 1 year ago

@rafaqz Thanks for this.

It seems to me that, after all, the use_mulithreading option is only implemented for apply_forest (which returns majority votes) but was never implemented for apply_forest_proba. So this PR appears to document a feature that does not (but should) exist.

I also can't find multithreading support for prediction in regression, only classification.

Would you agree with this assessment?

rafaqz commented 1 year ago

Oh hah sorry I just added it to the wrong method will fix

rafaqz commented 1 year ago

So it turns out apply_forest wasn't actually documented, thus the mistake in adding the docs. Can we add some docs for it? I don't actually know what to add where I left the placeholders.

For background on this, this came up while working with a research group switching to Julia who concluded this was much slower than ScikitLearn - because this keyword is not documented and they were running it on one thread against 8 for python.

codecov-commenter commented 1 year ago

Codecov Report

Merging #208 (9c29071) into dev (835f3cd) will increase coverage by 1.46%. The diff coverage is n/a.

:exclamation: Current head 9c29071 differs from pull request most recent head ed7e348. Consider uploading reports for the commit ed7e348 to get more accurate results

@@            Coverage Diff             @@
##              dev     #208      +/-   ##
==========================================
+ Coverage   87.99%   89.45%   +1.46%     
==========================================
  Files          10       10              
  Lines        1249     1176      -73     
==========================================
- Hits         1099     1052      -47     
+ Misses        150      124      -26     
Impacted Files Coverage Δ
src/classification/main.jl 97.90% <ø> (+1.79%) :arrow_up:
src/scikitlearnAPI.jl 51.26% <0.00%> (-0.41%) :arrow_down:
src/measures.jl 97.23% <0.00%> (-0.12%) :arrow_down:
src/util.jl 92.02% <0.00%> (+1.68%) :arrow_up:
src/regression/main.jl 92.00% <0.00%> (+3.53%) :arrow_up:
src/regression/tree.jl 100.00% <0.00%> (+5.03%) :arrow_up:
src/classification/tree.jl 100.00% <0.00%> (+5.18%) :arrow_up:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

ablaom commented 1 year ago

This is still the wrong method - it is apply_forest_proba, not apply_tree that has the parameter.

I agree these methods need doc-strings. Currently most of the documentation for DecisionTree.jl lives only on the README.md.

If you want to close this and just open an new documentation issue for someone else to address, then that's fine.

rafaqz commented 1 year ago

No, it's this method

https://github.com/JuliaAI/DecisionTree.jl/blob/dev/src/classification/main.jl#L453

ablaom commented 1 year ago

Right, sorry. It's implemented for apply_forest. But this PR adds docs to apply_tree, for which multithreading is not inmplemented (and is likely not needed).

rafaqz commented 1 year ago

No worries, I can't finish the empty doc anyway.

But we can't really close that other issue without documenting the keyword, people can't find that it exists.

rafaqz commented 1 year ago

Ok this actually finally documents the right function, and seems mergeable to me.