MilesCranmer / PySR

High-Performance Symbolic Regression in Python and Julia
https://ai.damtp.cam.ac.uk/pysr
Apache License 2.0
2.46k stars · 217 forks

[BUG]: MethodError when creating custom objective #411

Closed: villrv closed this issue 1 year ago

villrv commented 1 year ago

What happened?

I am unable to run the custom objective function example. When I run model.fit, I get the following error (truncated to what I think is the relevant bit):

RuntimeError: <PyCall.jlwrap (in a Julia function called from Python)
JULIA: MethodError: no method matching _method_instances(::Type{typeof(my_custom_objective)}, ::Type{Tuple{Node{Float32}, Dataset{Float32, Float32, Matrix{Float32}, Vector{Float32}, Nothing, NamedTuple{(), Tuple{}}, Nothing, Nothing, Nothing, Nothing}, Options{Int64, DynamicExpressions.OperatorEnumModule.OperatorEnum, false, Optim.Options{Float64, Nothing}, StatsBase.Weights{Float64, Float64, Vector{Float64}}}, Nothing}})
The applicable method may be too new: running in world age 44885, while current world is 54430.

This error appears both in a Jupyter notebook and in plain Python, so I don't believe it is related to my notebook. Running PySR without the custom objective works fine.

Version

0.16.2

Operating System

macOS

Package Manager

Conda

Interface

Jupyter Notebook

Relevant log output

Compiling Julia backend...
/opt/homebrew/Caskroom/miniforge/base/envs/myenv/lib/python3.8/site-packages/pysr/julia_helpers.py:208: UserWarning: Your system's Python library is static (e.g., conda), so precompilation will be turned off. For a dynamic library, try using `pyenv` and installing with `--enable-shared`: https://github.com/pyenv/pyenv/blob/master/plugins/python-build/README.md#building-with---enable-shared.
  warnings.warn(
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/Caskroom/miniforge/base/envs/myenv/lib/python3.8/site-packages/pysr/sr.py", line 1970, in fit
    self._run(X, y, mutated_params, weights=weights, seed=seed)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/myenv/lib/python3.8/site-packages/pysr/sr.py", line 1800, in _run
    self.raw_julia_state_ = SymbolicRegression.equation_search(
RuntimeError: <PyCall.jlwrap (in a Julia function called from Python)
JULIA: MethodError: no method matching _method_instances(::Type{typeof(my_custom_objective)}, ::Type{Tuple{Node{Float32}, Dataset{Float32, Float32, Matrix{Float32}, Vector{Float32}, Nothing, NamedTuple{(), Tuple{}}, Nothing, Nothing, Nothing, Nothing}, Options{Int64, DynamicExpressions.OperatorEnumModule.OperatorEnum, false, Optim.Options{Float64, Nothing}, StatsBase.Weights{Float64, Float64, Vector{Float64}}}, Nothing}})
The applicable method may be too new: running in world age 44848, while current world is 54489.

Closest candidates are:
  _method_instances(::Any, ::Any) (method too new to be called from this world context.)
   @ Tricks ~/.julia/packages/Tricks/7oAyo/src/Tricks.jl:150

Stacktrace:
  [1] #s1771#1
    @ ~/.julia/packages/Tricks/7oAyo/src/Tricks.jl:16 [inlined]
  [2] var"#s1771#1"(T::Any, ::Any, f::Any, t::Any)
    @ Tricks ./none:0
  [3] (::Core.GeneratedFunctionStub)(::Any, ::Vararg{Any})
    @ Core ./boot.jl:602
  [4] evaluator(f::typeof(my_custom_objective), tree::Node{Float32}, dataset::Dataset{Float32, Float32, Matrix{Float32}, Vector{Float32}, Nothing, NamedTuple{(), Tuple{}}, Nothing, Nothing, Nothing, Nothing}, options::Options{Int64, DynamicExpressions.OperatorEnumModule.OperatorEnum, false, Optim.Options{Float64, Nothing}, StatsBase.Weights{Float64, Float64, Vector{Float64}}}, idx::Nothing)
    @ SymbolicRegression.LossFunctionsModule ~/.julia/packages/SymbolicRegression/FgFra/src/LossFunctions.jl:78
  [5] eval_loss(tree::Node{Float32}, dataset::Dataset{Float32, Float32, Matrix{Float32}, Vector{Float32}, Nothing, NamedTuple{(), Tuple{}}, Nothing, Nothing, Nothing, Nothing}, options::Options{Int64, DynamicExpressions.OperatorEnumModule.OperatorEnum, false, Optim.Options{Float64, Nothing}, StatsBase.Weights{Float64, Float64, Vector{Float64}}}; regularization::Bool, idx::Nothing)
    @ SymbolicRegression.LossFunctionsModule ~/.julia/packages/SymbolicRegression/FgFra/src/LossFunctions.jl:105
  [6] eval_loss
    @ ~/.julia/packages/SymbolicRegression/FgFra/src/LossFunctions.jl:94 [inlined]
  [7] update_baseline_loss!(dataset::Dataset{Float32, Float32, Matrix{Float32}, Vector{Float32}, Nothing, NamedTuple{(), Tuple{}}, Nothing, Nothing, Nothing, Nothing}, options::Options{Int64, DynamicExpressions.OperatorEnumModule.OperatorEnum, false, Optim.Options{Float64, Nothing}, StatsBase.Weights{Float64, Float64, Vector{Float64}}})
    @ SymbolicRegression.LossFunctionsModule ~/.julia/packages/SymbolicRegression/FgFra/src/LossFunctions.jl:202
  [8] _equation_search(#unused#::Val{:multithreading}, #unused#::Val{1}, datasets::Vector{Dataset{Float32, Float32, Matrix{Float32}, Vector{Float32}, Nothing, NamedTuple{(), Tuple{}}, Nothing, Nothing, Nothing, Nothing}}, niterations::Int64, options::Options{Int64, DynamicExpressions.OperatorEnumModule.OperatorEnum, false, Optim.Options{Float64, Nothing}, StatsBase.Weights{Float64, Float64, Vector{Float64}}}, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing, verbosity::Int64, progress::Bool, #unused#::Val{true})
    @ SymbolicRegression ~/.julia/packages/SymbolicRegression/FgFra/src/SymbolicRegression.jl:572
  [9] equation_search(datasets::Vector{Dataset{Float32, Float32, Matrix{Float32}, Vector{Float32}, Nothing, NamedTuple{(), Tuple{}}, Nothing, Nothing, Nothing, Nothing}}; niterations::Int64, options::Options{Int64, DynamicExpressions.OperatorEnumModule.OperatorEnum, false, Optim.Options{Float64, Nothing}, StatsBase.Weights{Float64, Float64, Vector{Float64}}}, parallelism::String, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing, return_state::Bool, verbosity::Int64, progress::Bool, v_dim_out::Val{1})
    @ SymbolicRegression ~/.julia/packages/SymbolicRegression/FgFra/src/SymbolicRegression.jl:507
 [10] equation_search(X::Matrix{Float32}, y::Matrix{Float32}; niterations::Int64, weights::Nothing, options::Options{Int64, DynamicExpressions.OperatorEnumModule.OperatorEnum, false, Optim.Options{Float64, Nothing}, StatsBase.Weights{Float64, Float64, Vector{Float64}}}, variable_names::Vector{String}, display_variable_names::Vector{String}, y_variable_names::Nothing, parallelism::String, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing, return_state::Bool, loss_type::Type{Nothing}, verbosity::Int64, progress::Bool, X_units::Nothing, y_units::Nothing, v_dim_out::Val{1}, multithreaded::Nothing, varMap::Nothing)
    @ SymbolicRegression ~/.julia/packages/SymbolicRegression/FgFra/src/SymbolicRegression.jl:385
 [11] equation_search
    @ ~/.julia/packages/SymbolicRegression/FgFra/src/SymbolicRegression.jl:330 [inlined]
 [12] #equation_search#24
    @ ~/.julia/packages/SymbolicRegression/FgFra/src/SymbolicRegression.jl:414 [inlined]
 [13] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Any, NTuple{15, Symbol}, NamedTuple{(:weights, :niterations, :variable_names, :display_variable_names, :y_variable_names, :X_units, :y_units, :options, :numprocs, :parallelism, :saved_state, :return_state, :addprocs_function, :progress, :verbosity), Tuple{Nothing, Int64, Vector{String}, Vector{String}, Nothing, Nothing, Nothing, Options{Int64, DynamicExpressions.OperatorEnumModule.OperatorEnum, false, Optim.Options{Float64, Nothing}, StatsBase.Weights{Float64, Float64, Vector{Float64}}}, Nothing, String, Nothing, Bool, Nothing, Bool, Int64}}})
    @ Base ./essentials.jl:818
 [14] _pyjlwrap_call(f::Function, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct})
    @ PyCall ~/.julia/packages/PyCall/ilqDX/src/callback.jl:32
 [15] pyjlwrap_call(self_::Ptr{PyCall.PyObject_struct}, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct})
    @ PyCall ~/.julia/packages/PyCall/ilqDX/src/callback.jl:44>

Extra Info

My Julia version is 1.9.2. It has PyCall and SymbolicRegression.jl installed.

Here are my versions for the Python requirements:

  - sympy 1.12
  - pandas 2.0.2
  - numpy 1.24.3
  - scikit_learn 1.2.2
  - julia 0.6.0 (also tried 0.6.1)
  - click 8.1.3
  - setuptools 68.0.0
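
If it helps, here is a hedged way to double-check the same versions programmatically (this assumes each of these packages exposes a __version__ attribute, which they normally do):

import sympy, pandas, numpy, sklearn, julia

# Print the version of each Python-side dependency in one place.
for mod in (sympy, pandas, numpy, sklearn, julia):
    print(mod.__name__, mod.__version__)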

MilesCranmer commented 1 year ago

Thanks very much for making this bug report; it is quite helpful.

Could you include the exact Python code you are running? I do have a unit test for this, so it is surprising to see this error: https://github.com/MilesCranmer/PySR/blob/d38be42d5c71e07f314db5680453d1b54955c050/pysr/test/test.py#L77-L98. I wonder if it could be some race condition.

villrv commented 1 year ago

Yes, my code is below. I also tried switching out my objective function with the one in your unit test, with the same error.

import numpy as np

X = 2 * np.random.randn(10000, 5)
y = 2.5382 * np.cos(X[:, 3]) + 1/X[:, 0] ** 2 - 0.5

from pysr import PySRRegressor

objective = """
function my_custom_objective(tree, dataset::Dataset{T,L}, options) where {T,L}
    # Require root node to be binary, so we can split it,
    # otherwise return a large loss:
    tree.degree != 2 && return L(Inf)
    P = tree.l
    Q = tree.r
    # Evaluate numerator:
    P_prediction, flag = eval_tree_array(P, dataset.X, options)
    !flag && return L(Inf)
    # Evaluate denominator:
    Q_prediction, flag = eval_tree_array(Q, dataset.X, options)
    !flag && return L(Inf)
    # Impose functional form:
    prediction = P_prediction ./ Q_prediction
    diffs = prediction .- dataset.y
    return sum(diffs .^ 2) / length(diffs)
end
"""

model = PySRRegressor(
    niterations=100,
    binary_operators=["*", "+", "-"],
    full_objective=objective,
)

model.fit(X, y)
print(model)

villrv commented 1 year ago

Actually, I want to add that even without the custom objective function, I occasionally get the following error:

BlockingIOError: [Errno 35] write could not complete without blocking

I don't know if this is related.
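
A possible mitigation (my assumption, not something confirmed in this thread) is to cut down how much the search writes to stdout, since Errno 35 usually means a non-blocking stdout could not keep up with the output. The verbosity and progress options visible in the stack trace above can be turned down, for example:

from pysr import PySRRegressor

# Hypothetical sketch: quiet the console output so the search loop
# writes less to a potentially non-blocking stdout.
model = PySRRegressor(
    niterations=100,
    binary_operators=["*", "+", "-"],
    verbosity=0,      # suppress per-iteration logging
    progress=False,   # disable the progress bar
)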

MilesCranmer commented 1 year ago

I can't reproduce the error on my machine for some reason. Does the error come up randomly or is it every time you run it? And does the error only appear after you have executed some other code first?

MilesCranmer commented 1 year ago

Wait, I was just able to reproduce it. I reproduced it by:

  1. Using Julia 1.9.2 (before, I was on Julia 1.10, where the error goes away), and
  2. Running in normal Python rather than IPython (the BlockingIOError seems to be random, depending on which thread runs into the error first).

Will try to fix it soon.

MilesCranmer commented 1 year ago

Okay, this should be fixed by https://github.com/MilesCranmer/SymbolicRegression.jl/pull/258. It will be ready in PySR v0.16.3 in maybe a few hours, once all the CI testing finishes.

MilesCranmer commented 1 year ago

Fixed by #413

MilesCranmer commented 1 year ago

The Conda version should get released in the next ~10 hours or so. Let me know if there are further issues!
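
(A hedged follow-up, not part of the original thread: once the 0.16.3 package is published, something like the following can confirm the upgrade took effect; the conda-forge channel is an assumption about this particular setup.)

# Upgrade first with: conda update -c conda-forge pysr
import pysr

print(pysr.__version__)  # should report 0.16.3 or later once the fix lands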