cstjean / ScikitLearn.jl

Julia implementation of the scikit-learn API https://cstjean.github.io/ScikitLearn.jl/dev/
Other
547 stars 75 forks source link

loosen bounds #119

Closed OkonSamuel closed 1 year ago

OkonSamuel commented 1 year ago

At the moment, ScikitLearn.jl limits the conda sklearn version to <1.1 for Linux users. This was due to the libstdxx issue with Julia, which has long been fixed for later Julia versions >=v0.8.4. This PR proposes loosens the bounds of the conda sklearn version. It also makes sure that compatible conda sklearn versions are installed across Windows, macOS, and linux platforms. At the moment this corresponds to version >=1.2, <1.3 (This was chosen, since a minor release may contain several code deprecations). We could change this bound when python sklearn releases a new minor release, which doesn't happen often.

cc. @cstjean, @tylerjthomas9 @ablaom

codecov-commenter commented 1 year ago

Codecov Report

Base: 68.95% // Head: 69.01% // Increases project coverage by +0.05% :tada:

Coverage data is based on head (56fb216) compared to base (55ee9e6). Patch coverage: 55.00% of modified lines in pull request are covered.

:exclamation: Current head 56fb216 differs from pull request most recent head ee0dfbe. Consider uploading reports for the commit ee0dfbe to get more accurate results

:mega: This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #119 +/- ## ========================================== + Coverage 68.95% 69.01% +0.05% ========================================== Files 14 14 Lines 786 781 -5 ========================================== - Hits 542 539 -3 + Misses 244 242 -2 ``` | [Impacted Files](https://codecov.io/gh/cstjean/ScikitLearn.jl/pull/119?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None) | Coverage Δ | | |---|---|---| | [src/Skcore.jl](https://codecov.io/gh/cstjean/ScikitLearn.jl/pull/119?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-c3JjL1NrY29yZS5qbA==) | `78.00% <55.00%> (-0.65%)` | :arrow_down: | | [src/grid\_search.jl](https://codecov.io/gh/cstjean/ScikitLearn.jl/pull/119?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-c3JjL2dyaWRfc2VhcmNoLmps) | `81.61% <0.00%> (+1.18%)` | :arrow_up: | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None)

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

cstjean commented 1 year ago

If the docs CI fails unrelatedly, feel free to merge anyway...

cstjean commented 1 year ago

Might be worth squash merging when the time comes...

OkonSamuel commented 1 year ago

Might be worth squash merging when the time comes...

Sure. I planned on doing that

ablaom commented 1 year ago

@OkonSamuel Do you mean to leave this as draft or are you ready for review?

OkonSamuel commented 1 year ago

@OkonSamuel Do you mean to leave this as draft or are you ready for review?

@ablaom Okay. I'm ready for a review.

ablaom commented 1 year ago

I can confirm that @sk_import appears to work on macOS with PYTHON="", julia=1.6.5. Tests fail, but this appears to be orthogonal to the objective of the present PR (the same test fails under "master"). Details below.

I'm not sure I understand why the macOS julia 1.6 pass on CI. Perhaps @OkonSamuel can comment.

``` Testing ../examples/Pipeline_PCA_Logistic.ipynb WARNING: replacing module Testing. Notebook examples: Error During Test at /Users/anthony/Julia/ScikitLearn/test/runtests.jl:63 Got exception outside of a @test LoadError: PyError ($(Expr(:escape, :(ccall(#= /Users/anthony/.julia/packages/PyCall/twYvK/src/pyfncall.jl:43 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) AttributeError("'str' object has no attribute 'decode'") File "/Users/anthony/anaconda2/envs/py37/lib/python3.7/site-packages/sklearn/linear_model/_logistic.py", line 1601, in fit for class_, warm_start_coef_ in zip(classes_, warm_start_coef)) File "/Users/anthony/anaconda2/envs/py37/lib/python3.7/site-packages/joblib/parallel.py", line 1085, in __call__ if self.dispatch_one_batch(iterator): File "/Users/anthony/anaconda2/envs/py37/lib/python3.7/site-packages/joblib/parallel.py", line 901, in dispatch_one_batch self._dispatch(tasks) File "/Users/anthony/anaconda2/envs/py37/lib/python3.7/site-packages/joblib/parallel.py", line 819, in _dispatch job = self._backend.apply_async(batch, callback=cb) File "/Users/anthony/anaconda2/envs/py37/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async result = ImmediateResult(func) File "/Users/anthony/anaconda2/envs/py37/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 597, in __init__ self.results = batch() File "/Users/anthony/anaconda2/envs/py37/lib/python3.7/site-packages/joblib/parallel.py", line 289, in __call__ for func, args, kwargs in self.items] File "/Users/anthony/anaconda2/envs/py37/lib/python3.7/site-packages/joblib/parallel.py", line 289, in for func, args, kwargs in self.items] File "/Users/anthony/anaconda2/envs/py37/lib/python3.7/site-packages/sklearn/linear_model/_logistic.py", line 940, in _logistic_regression_path extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG) File "/Users/anthony/anaconda2/envs/py37/lib/python3.7/site-packages/sklearn/utils/optimize.py", line 243, in _check_optimize_result ).format(solver, result.status, result.message.decode("latin1")) Stacktrace: [1] pyerr_check @ ~/.julia/packages/PyCall/twYvK/src/exception.jl:75 [inlined] [2] pyerr_check @ ~/.julia/packages/PyCall/twYvK/src/exception.jl:79 [inlined] [3] _handle_error(msg::String) @ PyCall ~/.julia/packages/PyCall/twYvK/src/exception.jl:96 [4] macro expansion @ ~/.julia/packages/PyCall/twYvK/src/exception.jl:110 [inlined] [5] #107 @ ~/.julia/packages/PyCall/twYvK/src/pyfncall.jl:43 [inlined] [6] disable_sigint @ ./c.jl:458 [inlined] [7] __pycall! @ ~/.julia/packages/PyCall/twYvK/src/pyfncall.jl:42 [inlined] [8] _pycall!(ret::PyCall.PyObject, o::PyCall.PyObject, args::Tuple{Matrix{Float64}, Vector{Int64}}, nargs::Int64, kw::Ptr{Nothing}) @ PyCall ~/.julia/packages/PyCall/twYvK/src/pyfncall.jl:29 [9] _pycall!(ret::PyCall.PyObject, o::PyCall.PyObject, args::Tuple{Matrix{Float64}, Vector{Int64}}, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) @ PyCall ~/.julia/packages/PyCall/twYvK/src/pyfncall.jl:11 [10] (::PyCall.PyObject)(::Matrix{Float64}, ::Vararg{Any, N} where N; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) @ PyCall ~/.julia/packages/PyCall/twYvK/src/pyfncall.jl:86 [11] (::PyCall.PyObject)(::Matrix{Float64}, ::Vararg{Any, N} where N) @ PyCall ~/.julia/packages/PyCall/twYvK/src/pyfncall.jl:86 [12] fit!(::PyCall.PyObject, ::Matrix{Float64}, ::Vararg{Any, N} where N; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) @ ScikitLearn.Skcore ~/Julia/ScikitLearn/src/Skcore.jl:102 [13] fit!(::PyCall.PyObject, ::Matrix{Float64}, ::Vector{Int64}) @ ScikitLearn.Skcore ~/Julia/ScikitLearn/src/Skcore.jl:102 [14] fit!(pip::Pipeline, X::Matrix{Float64}, y::Vector{Int64}) @ ScikitLearn.Skcore ~/Julia/ScikitLearn/src/pipeline.jl:70 [15] _fit_and_score(estimator::Pipeline, X::Matrix{Float64}, y::Vector{Int64}, scorer::Function, train::Vector{Int64}, test::Vector{Int64}, verbose::Int64, parameters::Dict{Symbol, Any}, fit_params::Dict{Symbol, Any}; return_train_score::Bool, return_parameters::Bool, error_score::String) @ ScikitLearn.Skcore ~/Julia/ScikitLearn/src/cross_validation.jl:574 [16] (::ScikitLearn.Skcore.var"#101#102"{GridSearchCV, Pipeline, Matrix{Float64}, Vector{Int64}, Dict{Any, Any}})(::Tuple{Vector{Int64}, Vector{Int64}}) @ ScikitLearn.Skcore ./none:0 [17] iterate @ ./generator.jl:47 [inlined] [18] collect(itr::Base.Generator{Vector{Tuple{Vector{Int64}, Vector{Int64}}}, ScikitLearn.Skcore.var"#101#102"{GridSearchCV, Pipeline, Matrix{Float64}, Vector{Int64}, Dict{Any, Any}}}) @ Base ./array.jl:681 [19] fit_cv_rotations(self::GridSearchCV, estimator::Pipeline, X::Matrix{Float64}, y::Vector{Int64}, parameters::Dict{Any, Any}, cv::Vector{Tuple{Vector{Int64}, Vector{Int64}}}) @ ScikitLearn.Skcore ~/Julia/ScikitLearn/src/grid_search.jl:244 [20] _fit!(self::GridSearchCV, X::Matrix{Float64}, y::Vector{Int64}, parameter_iterable::ScikitLearn.Skcore.ParameterGrid) @ ScikitLearn.Skcore ~/Julia/ScikitLearn/src/grid_search.jl:281 [21] fit!(self::GridSearchCV, X::Matrix{Float64}, y::Vector{Int64}) @ ScikitLearn.Skcore ~/Julia/ScikitLearn/src/grid_search.jl:563 [22] top-level scope @ ~/Julia/ScikitLearn/examples/Pipeline_PCA_Logistic.ipynb:In[2]:35 [23] eval @ ./boot.jl:360 [inlined] [24] include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String) @ Base ./loading.jl:1116 [25] include_string @ ./loading.jl:1126 [inlined] [26] my_include_string(m::Module, s::String, path::String, prev::String, softscope::Bool) @ NBInclude ~/.julia/packages/NBInclude/MxvbF/src/NBInclude.jl:30 [27] #2 @ ~/.julia/packages/NBInclude/MxvbF/src/NBInclude.jl:93 [inlined] [28] task_local_storage(body::NBInclude.var"#2#3"{Bool, Module, String, String, String, String}, key::Symbol, val::Bool) @ Base ./task.jl:281 [29] nbinclude(m::Module, path::String; renumber::Bool, counters::UnitRange{Int64}, regex::Regex, anshook::typeof(identity), softscope::Bool) @ NBInclude ~/.julia/packages/NBInclude/MxvbF/src/NBInclude.jl:92 [30] nbinclude(m::Module, path::String) @ NBInclude ~/.julia/packages/NBInclude/MxvbF/src/NBInclude.jl:65 [31] top-level scope @ ~/Julia/ScikitLearn/test/runtests.jl:56 [32] eval @ ./boot.jl:360 [inlined] [33] (::var"#run_examples#1"{Vector{String}})() @ Main ~/Julia/ScikitLearn/test/runtests.jl:54 [34] macro expansion @ ~/Julia/ScikitLearn/test/runtests.jl:64 [inlined] [35] macro expansion @ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined] [36] macro expansion @ ~/Julia/ScikitLearn/test/runtests.jl:64 [inlined] [37] macro expansion @ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined] [38] top-level scope @ ~/Julia/ScikitLearn/test/runtests.jl:12 [39] include(fname::String) @ Base.MainInclude ./client.jl:444 [40] top-level scope @ none:6 [41] eval @ ./boot.jl:360 [inlined] [42] exec_options(opts::Base.JLOptions) @ Base ./client.jl:261 [43] _start() @ Base ./client.jl:485 in expression starting at /Users/anthony/Julia/ScikitLearn/examples/Pipeline_PCA_Logistic.ipynb:In[2]:35 ```
OkonSamuel commented 1 year ago

@ablaom The tests for 1.6 were run with the current long term release for Julia, which is 1.6.7. I suggest trying the code again with Julia v1.6.7

ablaom commented 1 year ago

Mmm. I'm getting the same error for julia 1.8 (this PR)

julia> versioninfo()
Julia Version 1.8.5
Commit 17cfb8e65ea (2023-01-08 06:45 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.4.0)
  CPU: 12 × Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 5 on 12 virtual cores
Environment:
  JULIA_LTS_PATH = /Applications/Julia-1.6.app/Contents/Resources/julia/bin/julia
  JULIA_PATH = /Applications/Julia-1.8.app/Contents/Resources/julia/bin/julia
  JULIA_EGLOT_PATH = /Applications/Julia-1.6.app/Contents/Resources/julia/bin/julia
  JULIA_NUM_THREADS = 5
  DYLD_LIBRARY_PATH = /usr/local/homebrew/Cellar/libomp/9.0.1/lib/
  JULIA_NIGHTLY_PATH = /Applications/Julia-1.8.app/Contents/Resources/julia/bin/julia
ablaom commented 1 year ago

okay it seems my non-julia conda install of python is being used, despite the PYTHON flag. i'll investigate

OkonSamuel commented 1 year ago

okay it seems my non-julia conda install of python is being used, despite the PYTHON flag. i'll investigate

after changing the PYTHON enviroment variable. You will need to run] build PyCall or the change won't take effect.

ablaom commented 1 year ago

@OkonSamuel Thanks, my bad. Both 1.6 and 1.8 pass tests locally on my Mac, after I remember to build after changing my env 😳

cstjean commented 1 year ago

Thank you for checking @ablaom ! I trust @OkonSamuel on this one, LGTM. Merge?