cstjean / ScikitLearn.jl

Julia implementation of the scikit-learn API https://cstjean.github.io/ScikitLearn.jl/dev/

Can I use SelectFromModel with DecisionTree? #60

Open Thuener opened 5 years ago

Thuener commented 5 years ago

I'm trying to use SelectFromModel with RandomForestClassifier. Is this supported by ScikitLearn in Julia?

using RDatasets: dataset
using ScikitLearn, DecisionTree

iris = dataset("datasets", "iris")
X = convert(Array, iris[[:SepalLength, :SepalWidth, :PetalLength, :PetalWidth]])
y = convert(Array, iris[:Species])
@sk_import ensemble: RandomForestClassifier
@sk_import feature_selection: SelectFromModel

rfc = RandomForestClassifier(n_subfeatures=30, n_trees=350, partial_sampling = 0.4, min_purity_increase = 0.001)
sfm = SelectFromModel(rfc)
fit!(sfm, X, y)

I get the following error:

ERROR: PyError ($(Expr(:escape, :(ccall(#= /home/tas/.julia/packages/PyCall/ttONZ/src/pyfncall.jl:44 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) <class 'TypeError'>
TypeError("Cannot clone object '<PyCall.jlwrap RandomForestClassifier\nn_trees: 350\nn_subfeatures: 30\npartial_sampling: 0.4\nmax_depth: -1\nmin_samples_leaf: 1\nmin_samples_split: 2\nmin_purity_increase: 0.001\nclasses: ensemble: >' (type <class 'PyCall.jlwrap'>): it does not seem to be a scikit-learn estimator as it does not implement a 'get_params' methods.")
  File "/home/tas/.julia/conda/3/lib/python3.7/site-packages/sklearn/feature_selection/frommodel.py", line 195, in fit
    self.estimator = clone(self.estimator)
  File "/home/tas/.julia/conda/3/lib/python3.7/site-packages/sklearn/base.py", line 60, in clone
    % (repr(estimator), type(estimator)))

Stacktrace:
 [1] pyerr_check at /home/tas/.julia/packages/PyCall/ttONZ/src/exception.jl:60 [inlined]
 [2] pyerr_check at /home/tas/.julia/packages/PyCall/ttONZ/src/exception.jl:64 [inlined]
 [3] macro expansion at /home/tas/.julia/packages/PyCall/ttONZ/src/exception.jl:84 [inlined]
 [4] __pycall!(::PyCall.PyObject, ::Ptr{PyCall.PyObject_struct}, ::PyCall.PyObject, ::Ptr{Nothing}) at /home/tas/.julia/packages/PyCall/ttONZ/src/pyfncall.jl:44
 [5] _pycall!(::PyCall.PyObject, ::PyCall.PyObject, ::Tuple{Array{Float64,2},Array{String,1}}, ::Int64, ::Ptr{Nothing}) at /home/tas/.julia/packages/PyCall/ttONZ/src/pyfncall.jl:29
 [6] _pycall!(::PyCall.PyObject, ::PyCall.PyObject, ::Tuple{Array{Float64,2},Array{String,1}}, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/tas/.julia/packages/PyCall/ttONZ/src/pyfncall.jl:11
 [7] #call#111(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::PyCall.PyObject, ::Array{Float64,2}, ::Vararg{Any,N} where N) at /home/tas/.julia/packages/PyCall/ttONZ/src/pyfncall.jl:89
 [8] (::PyCall.PyObject)(::Array{Float64,2}, ::Vararg{Any,N} where N) at /home/tas/.julia/packages/PyCall/ttONZ/src/pyfncall.jl:89
 [9] #fit!#31(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::PyCall.PyObject, ::Array{Float64,2}, ::Vararg{Any,N} where N) at /home/tas/.julia/packages/ScikitLearn/bo2Pt/src/Skcore.jl:100
 [10] fit!(::PyCall.PyObject, ::Array{Float64,2}, ::Array{String,1}) at /home/tas/.julia/packages/ScikitLearn/bo2Pt/src/Skcore.jl:100
 [11] top-level scope at none:0

PS: There is no issue if I use RandomForestClassifier from ScikitLearn.
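
As far as I can tell from the message, the failure happens on the Python side: SelectFromModel calls clone on the estimator, and clone rejects anything that does not expose get_params. Below is a rough pure-Python sketch of that check, only to illustrate the idea. It is not scikit-learn's actual code, and NativeEstimator, OpaqueWrapper, clone_like and try_clone are hypothetical names made up for this example.

```python
# Hypothetical stand-ins for illustration; not scikit-learn's real classes.
class NativeEstimator:
    """Mimics a Python estimator that follows the scikit-learn convention."""

    def __init__(self, n_trees=10):
        self.n_trees = n_trees

    def get_params(self, deep=True):
        # Report the constructor parameters so a copy can be rebuilt.
        return {"n_trees": self.n_trees}


class OpaqueWrapper:
    """Mimics a foreign object (such as a PyCall.jlwrap) with no get_params."""

    def __init__(self, n_trees=10):
        self.n_trees = n_trees


def clone_like(estimator):
    """Simplified version of the check performed before copying an estimator."""
    if not hasattr(estimator, "get_params"):
        raise TypeError(
            "Cannot clone object %r (type %s): it does not seem to be a "
            "scikit-learn estimator as it does not implement a "
            "'get_params' method." % (estimator, type(estimator))
        )
    # Rebuild a fresh, unfitted estimator from its constructor parameters.
    return type(estimator)(**estimator.get_params(deep=False))


def try_clone(estimator):
    """Return 'ok' if cloning works, otherwise the exception type name."""
    try:
        clone_like(estimator)
        return "ok"
    except TypeError as exc:
        return type(exc).__name__


if __name__ == "__main__":
    print(try_clone(NativeEstimator(n_trees=350)))
    print(try_clone(OpaqueWrapper(n_trees=350)))
```

If this reading is right, it would explain why the scikit-learn RandomForestClassifier works (it provides get_params) while the Julia one, seen from Python as a PyCall.jlwrap, does not.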

alexmorley commented 5 years ago

Can you post the full error? And ideally some example matrices for X and Y?

Thuener commented 5 years ago

Sorry, I should have been clearer about the issue. I will edit the post to give a more detailed explanation.