First of all, thank you for developing PySR! I have been experimenting with it and wanted to test it on the Nguyen benchmark problems, such as x^3+x^2+x. In order to prevent PySR from finding scalars, I tried setting complexity_of_constants=100. However, I encountered an error during the process:
(20, 1)
(20, 1)
[[ 0.85119328]
[-0.72956365]
[ 0.33353343]
[ 0.95291893]
[ 0.68468416]]
[[ 2.19243833]
[-0.58562036]
[ 0.48188176]
[ 2.72627573]
[ 1.47445129]]
Activating project at `~/anaconda3/envs/myenv/share/pysr/depot/environments/pysr-0.11.5`
WARNING: method definition for TwiceDifferentiable at /home/me/anaconda3/envs/myenv/share/pysr/depot/packages/NLSolversBase/cfJrN/src/objective_types/incomplete.jl:96 declares type variable TH but does not use it.
WARNING: method definition for show at /home/me/anaconda3/envs/myenv/share/pysr/depot/packages/Optim/Zq1jM/src/univariate/printing.jl:7 declares type variable T but does not use it.
WARNING: method definition for best_of_sample at /home/me/anaconda3/envs/myenv/share/pysr/depot/packages/SymbolicRegression/RziqW/src/Population.jl:72 declares type variable T but does not use it.
WARNING: method definition for OneHotArray at /home/me/anaconda3/envs/myenv/share/pysr/depot/packages/MicroCollections/yJPLe/src/onehot.jl:79 declares type variable N but does not use it.
WARNING: method definition for adapt_structure at /home/me/anaconda3/envs/myenv/share/pysr/depot/packages/Transducers/DSfBv/src/partitionby.jl:50 declares type variable inbounds but does not use it.
WARNING: method definition for _foldl_array at /home/me/anaconda3/envs/myenv/share/pysr/depot/packages/Transducers/DSfBv/src/processes.jl:222 declares type variable T but does not use it.
WARNING: method definition for multiplyexistingvar at /home/me/anaconda3/envs/myenv/share/pysr/depot/packages/DynamicPolynomials/juS7t/src/mult.jl:1 declares type variable C but does not use it.
WARNING: method definition for multiplyexistingvar at /home/me/anaconda3/envs/myenv/share/pysr/depot/packages/DynamicPolynomials/juS7t/src/mult.jl:6 declares type variable C but does not use it.
Started!
Traceback (most recent call last):
File "14_test_pysr_srbenchmark.py", line 71, in <module>
model.fit(Input, Output)
File "/home/me/anaconda3/envs/myenv/lib/python3.7/site-packages/pysr/sr.py", line 1750, in fit
self._run(X, y, mutated_params, weights=weights, seed=seed)
File "/home/me/anaconda3/envs/myenv/lib/python3.7/site-packages/pysr/sr.py", line 1620, in _run
addprocs_function=cluster_manager,
RuntimeError: <PyCall.jlwrap (in a Julia function called from Python)
JULIA: TaskFailedException
Stacktrace:
[1] wait
@ ./task.jl:345 [inlined]
[2] fetch
@ ./task.jl:360 [inlined]
[3] _EquationSearch(::SymbolicRegression.CoreModule.ProgramConstantsModule.SRThreaded, datasets::Vector{SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}; niterations::Int64, options::Options{Tuple{typeof(+), typeof(*), typeof(-), typeof(/)}, Tuple{typeof(cos), typeof(exp), typeof(safe_log), typeof(sin)}, Nothing, Nothing, typeof(loss), Int64}, numprocs::Int64, procs::Nothing, runtests::Bool, saved_state::Nothing, addprocs_function::Nothing)
@ SymbolicRegression ~/anaconda3/envs/myenv/share/pysr/depot/packages/SymbolicRegression/RziqW/src/SymbolicRegression.jl:649
[4] EquationSearch(datasets::Vector{SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}; niterations::Int64, options::Options{Tuple{typeof(+), typeof(*), typeof(-), typeof(/)}, Tuple{typeof(cos), typeof(exp), typeof(safe_log), typeof(sin)}, Nothing, Nothing, typeof(loss), Int64}, numprocs::Int64, procs::Nothing, multithreading::Bool, runtests::Bool, saved_state::Nothing, addprocs_function::Nothing)
@ SymbolicRegression ~/anaconda3/envs/myenv/share/pysr/depot/packages/SymbolicRegression/RziqW/src/SymbolicRegression.jl:346
[5] EquationSearch(X::Matrix{Float32}, y::Matrix{Float32}; niterations::Int64, weights::Nothing, varMap::Vector{String}, options::Options{Tuple{typeof(+), typeof(*), typeof(-), typeof(/)}, Tuple{typeof(cos), typeof(exp), typeof(safe_log), typeof(sin)}, Nothing, Nothing, typeof(loss), Int64}, numprocs::Int64, procs::Nothing, multithreading::Bool, runtests::Bool, saved_state::Nothing, addprocs_function::Nothing)
@ SymbolicRegression ~/anaconda3/envs/myenv/share/pysr/depot/packages/SymbolicRegression/RziqW/src/SymbolicRegression.jl:295
[6] #EquationSearch#21
@ ~/anaconda3/envs/myenv/share/pysr/depot/packages/SymbolicRegression/RziqW/src/SymbolicRegression.jl:320 [inlined]
[7] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Any, NTuple{8, Symbol}, NamedTuple{(:weights, :niterations, :varMap, :options, :numprocs, :multithreading, :saved_state, :addprocs_function), Tuple{Nothing, Int64, Vector{String}, Options{Tuple{typeof(+), typeof(*), typeof(-), typeof(/)}, Tuple{typeof(cos), typeof(exp), typeof(safe_log), typeof(sin)}, Nothing, Nothing, typeof(loss), Int64}, Int64, Bool, Nothing, Nothing}}})
@ Base ./essentials.jl:731
[8] _pyjlwrap_call(f::Function, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct})
@ PyCall ~/anaconda3/envs/myenv/share/pysr/depot/packages/PyCall/ygXW2/src/callback.jl:32
[9] pyjlwrap_call(self_::Ptr{PyCall.PyObject_struct}, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct})
@ PyCall ~/anaconda3/envs/myenv/share/pysr/depot/packages/PyCall/ygXW2/src/callback.jl:44
nested task error: TaskFailedException
Stacktrace:
[1] wait
@ ./task.jl:345 [inlined]
[2] fetch
@ ./task.jl:360 [inlined]
[3] (::SymbolicRegression.var"#46#77"{Vector{Vector{Task}}, Int64, Int64})()
@ SymbolicRegression ./task.jl:484
nested task error: UndefVarError: T not defined
Stacktrace:
[1] best_of_sample(pop::Population{Float32}, running_search_statistics::SymbolicRegression.AdaptiveParsimonyModule.RunningSearchStatistics, options::Options{Tuple{typeof(+), typeof(*), typeof(-), typeof(/)}, Tuple{typeof(cos), typeof(exp), typeof(safe_log), typeof(sin)}, Nothing, Nothing, typeof(loss), Int64})
@ SymbolicRegression.PopulationModule ~/anaconda3/envs/myenv/share/pysr/depot/packages/SymbolicRegression/RziqW/src/Population.jl:89
[2] reg_evol_cycle(dataset::SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}, pop::Population{Float32}, temperature::Float32, curmaxsize::Int64, running_search_statistics::SymbolicRegression.AdaptiveParsimonyModule.RunningSearchStatistics, options::Options{Tuple{typeof(+), typeof(*), typeof(-), typeof(/)}, Tuple{typeof(cos), typeof(exp), typeof(safe_log), typeof(sin)}, Nothing, Nothing, typeof(loss), Int64}, record::Dict{String, Any})
@ SymbolicRegression.RegularizedEvolutionModule ~/anaconda3/envs/myenv/share/pysr/depot/packages/SymbolicRegression/RziqW/src/RegularizedEvolution.jl:0
[3] s_r_cycle(dataset::SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}, pop::Population{Float32}, ncycles::Int64, curmaxsize::Int64, running_search_statistics::SymbolicRegression.AdaptiveParsimonyModule.RunningSearchStatistics; verbosity::Int64, options::Options{Tuple{typeof(+), typeof(*), typeof(-), typeof(/)}, Tuple{typeof(cos), typeof(exp), typeof(safe_log), typeof(sin)}, Nothing, Nothing, typeof(loss), Int64}, record::Dict{String, Any})
@ SymbolicRegression.SingleIterationModule ~/anaconda3/envs/myenv/share/pysr/depot/packages/SymbolicRegression/RziqW/src/SingleIteration.jl:37
[4] macro expansion
@ ~/anaconda3/envs/myenv/share/pysr/depot/packages/SymbolicRegression/RziqW/src/SymbolicRegression.jl:573 [inlined]
[5] (::SymbolicRegression.var"#44#75"{SymbolicRegression.CoreModule.ProgramConstantsModule.SRThreaded, Options{Tuple{typeof(+), typeof(*), typeof(-), typeof(/)}, Tuple{typeof(cos), typeof(exp), typeof(safe_log), typeof(sin)}, Nothing, Nothing, typeof(loss), Int64}, Vector{Vector{Task}}, Int64, SymbolicRegression.AdaptiveParsimonyModule.RunningSearchStatistics, Int64, SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}, Int64})()
@ SymbolicRegression ./threadingconstructs.jl:258>
Here's the code snippet I used:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '6'
import time
import numpy as np
# import sympy as sp
# import torch
# import pandas as pd
from pysr import PySRRegressor
from utils.data import get_benchmark_data
np.random.seed(0)
X, Y, use_constant, expression, variables_name = get_benchmark_data(
    'benchmark.csv',
    'Nguyen-1',
    1000,
)
Input = X
Output = Y
print(X.shape)
print(Y.shape)
print(Input[:5])
print(Output[:5])
np.random.seed(0)
model = PySRRegressor(
    # random_state=0,
    # deterministic=True,
    # Make a PySR search give the same result every run.
    # To use this, you must turn off parallelism (with procs=0, multithreading=False),
    # and set random_state to a fixed seed. Default is False.
    # procs=0,
    # multithreading=False,
    niterations=1000,  # < Increase me for better results
    binary_operators=["+", "*", "-", "/"],
    # should_optimize_constants=use_constant,
    # complexity_of_constants=100,  # to prevent PySR finding scalars
    unary_operators=[
        "cos",
        "exp",
        "log",
        "sin",
        # "inv(x) = 1/x",
        # "neg(x) = -x",
        # ^ Custom operator (julia syntax)
    ],
    # extra_sympy_mappings={"inv": lambda x: 1 / x,
    #                       "neg": lambda x: -x},
    # ^ Define operator for SymPy as well
    loss="loss(prediction, target) = (prediction - target)^2",
    # ^ Custom loss function (julia syntax)
)
start_time = time.time()
np.random.seed(0)
model.fit(Input, Output)
end_time = time.time()
time_cost = end_time - start_time
print('time_cost',time_cost)
print(model)
Interestingly, when I commented out the line complexity_of_constants=100, the code ran without any errors. Do you have any insights into this issue?
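Since get_benchmark_data comes from my own utils module and is not included here, the following is a rough, self-contained sketch that might help reproduce the behaviour without it. It generates Nguyen-1 data directly and builds two sets of PySRRegressor arguments that differ only in complexity_of_constants. The helper names (nguyen1, make_nguyen1_data, build_kwargs, run_case) are made up for this sketch, and the sampling range [-1, 1] and the 20 samples are assumptions chosen to match the shapes and values printed above, not necessarily what my loader does.

```python
import random


def nguyen1(x):
    """Nguyen-1 target from the report: x^3 + x^2 + x."""
    return x**3 + x**2 + x


def make_nguyen1_data(n_samples=20, seed=0):
    """Return column-shaped X and y as nested lists.

    Assumption: inputs are drawn uniformly from [-1, 1]; the real
    get_benchmark_data helper may sample differently.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    X = [[x] for x in xs]
    y = [[nguyen1(x)] for x in xs]
    return X, y


# Settings copied from the script in this report.
BASE_KWARGS = dict(
    niterations=1000,
    binary_operators=["+", "*", "-", "/"],
    unary_operators=["cos", "exp", "log", "sin"],
    loss="loss(prediction, target) = (prediction - target)^2",
)


def build_kwargs(penalize_constants):
    """Return the base settings, optionally adding complexity_of_constants=100."""
    kwargs = dict(BASE_KWARGS)
    if penalize_constants:
        kwargs["complexity_of_constants"] = 100
    return kwargs


def run_case(penalize_constants):
    """Fit PySR once with the chosen settings (needs pysr and numpy installed)."""
    import numpy as np
    from pysr import PySRRegressor

    X, y = make_nguyen1_data()
    model = PySRRegressor(**build_kwargs(penalize_constants))
    model.fit(np.asarray(X), np.asarray(y))
    return model


# To compare the two cases, call run_case(False) and then run_case(True).
```

If the report above holds for this data as well, run_case(False) should finish normally and run_case(True) should raise the RuntimeError shown in the traceback, but I have only observed this with my own loader.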
Version
0.11.5
Operating System
Linux
Package Manager
None
Interface
Script (i.e., python my_script.py)
Relevant log output
No response
Extra Info
No response