JuliaAI / DecisionTree.jl

Julia implementation of Decision Tree (CART) and Random Forest algorithms

Why regression used in apply_forest only if type of labels in model is Float64? #225

Open xinadi opened 1 year ago

xinadi commented 1 year ago

Hi, I found that for the regression algorithm in the apply_forest function (mean), the label type T of the model must be exactly Float64:

function apply_forest(forest::Ensemble{S, T}, features::AbstractVector{S}) where {S, T}
    n_trees = length(forest)
    votes = Array{T}(undef, n_trees)
    for i in 1:n_trees
        votes[i] = apply_tree(forest.trees[i], features)
    end
    if T <: Float64
        return mean(votes)
    else
        return majority_vote(votes)
    end
end

Is there any particular reason why the condition is not T <: AbstractFloat? Also, the documentation notes that regression is chosen when the labels/targets are of type Float, not Float64. Thanks!
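To illustrate the dispatch difference (this is a standalone snippet, not library code): Float32 labels fail the current `T <: Float64` check, so a Float32-labelled forest would be aggregated by majority vote instead of by mean.

```julia
# Float32 is not a subtype of Float64, so the current check misroutes it:
Float32 <: Float64        # false — falls through to majority_vote
# The broader check would classify it as a regression target:
Float32 <: AbstractFloat  # true  — would correctly select mean
```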

rikhuijzer commented 1 year ago

You are probably right. Well spotted! If you want, you can open a PR to fix this. If the tests pass, then it will most likely be merged. (Optionally, you can also search the codebase for other Float64 matches to check whether there are more cases like this.)
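A minimal self-contained sketch of the proposed change, assuming only the type condition is broadened from `T <: Float64` to `T <: AbstractFloat` (the function and helper names here are hypothetical stand-ins, not the library's actual implementation):

```julia
using Statistics  # for mean

# Hypothetical aggregation helper mirroring apply_forest's vote logic,
# with the regression check broadened to cover all float label types.
function aggregate(votes::AbstractVector{T}) where {T}
    if T <: AbstractFloat
        return mean(votes)  # regression: average the tree predictions
    else
        # classification: pick the most frequent label (majority vote)
        counts = Dict{T,Int}()
        for v in votes
            counts[v] = get(counts, v, 0) + 1
        end
        return argmax(counts)
    end
end

aggregate(Float32[1.0, 2.0, 3.0])  # 2.0f0 — Float32 now treated as regression
aggregate(["a", "b", "a"])         # "a"  — majority vote unchanged
```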

ablaom commented 1 year ago

It would be good to have a more generic implementation. However, it seems a refactor may be a little more involved. See, for example:

https://github.com/JuliaAI/DecisionTree.jl/blob/605e4d41deaa547462f71b5bd05a9a16ad682b15/src/regression/tree.jl#L16