MilesCranmer / PySR

High-Performance Symbolic Regression in Python and Julia
Apache License 2.0

BoundsError in custom eval_tree_array #337

Closed DenisSvirin closed 1 year ago

DenisSvirin commented 1 year ago

What happened?

I tried to run '_eval_tree_array' with an array whose dimensions differ from those of cX in 'eval_tree_array'.

If I understand correctly, I got the error shown in the log output below after the test runs.



Operating System



Jupyter Notebook

Relevant log output

BoundsError: attempt to access 1×11 Matrix{Float32} at index [4, 1:11]

Extra Info

my version of 'eval_tree_array' :

function eval_tree_array(
    tree::Node{T}, cX::AbstractMatrix{T}, operators::OperatorEnum; turbo::Bool=false
)::Tuple{AbstractVector{T},Bool} where {T<:Number}
    if turbo
        @assert T in (Float32, Float64)
    end
    r = T.( [6.332061275761631 5.051237472144821 2.7 4.2690748412273125 3.3068111527572905 5.4 4.676537180435969 6.037383539249433 5.727564927611035 3.8183$
    result, finished = _eval_tree_array(
        tree, r, operators, (turbo ? Val(true) : Val(false))
    )
    @return_on_false finished result
    @return_on_nonfinite_array result
    return result, finished
end

MilesCranmer commented 1 year ago

Could you share your full code, so I can reproduce it?

DenisSvirin commented 1 year ago

I've changed 'eval_tree_array' in the backend:

function eval_tree_array(
    tree::Node{T}, cX::AbstractMatrix{T}, operators::OperatorEnum; turbo::Bool=false
)::Tuple{AbstractVector{T},Bool} where {T<:Number}
    if turbo
        @assert T in (Float32, Float64)
    end
    # Hard-coded 1×11 input that is evaluated instead of the cX argument
    r = T.([6.332061275761631 5.051237472144821 2.7 4.2690748412273125 3.3068111527572905 5.4 4.676537180435969 6.037383539249433 5.727564927611035 3.818376618407357 1.9091883092036785])
    result, finished = _eval_tree_array(
        tree, r, operators, (turbo ? Val(true) : Val(false))
    )
    @return_on_false finished result
    @return_on_nonfinite_array result
    return result, finished
end

and to start the search I use Python:

import numpy as np
from pysr import PySRRegressor
r = np.array([ 2.7       ,  3.        ,  3.2       ,  3.4       ,  3.48621994,
        3.52253473,  3.55884952,  3.59516431,  3.6314791 ,  3.66779389,
        3.70410868,  3.74042347,  3.77673827,  3.81305306,  4.        ,
        4.5       ,  5.        ,  5.5       ,  5.73684211,  5.97368421,
        6.        ,  6.21052632,  6.44736842,  6.68421053,  6.92105263,
        7.15789474,  7.39473684,  7.63157895,  7.86842105,  8.        ,
        8.10526316,  8.34210526,  8.57894737,  8.81578947,  9.05263158,
        9.28947368,  9.52631579,  9.76315789, 10.        , 35.        ]) 
e = np.array([ 20.        ,  -2.3341443 , -10.378765  , -13.874788  ,
       -14.534859  , -14.705036  , -14.821358  , -14.8896    ,
       -14.934813  , -14.90294   , -14.857485  , -14.782236  ,
       -14.681216  , -14.557151  , -13.653582  , -10.439137  ,
        -7.4652775 ,  -5.1739723 ,  -4.26172661,  -3.58361976,
        -3.5250227 ,  -3.01341022,  -2.53392988,  -2.1307423 ,
        -1.79170813,  -1.50661955,  -1.26689299,  -1.06531065,
        -0.89580319,  -0.78888653,  -0.753267  ,  -0.63341053,
        -0.53262508,  -0.44787616,  -0.37661211,  -0.31668728,
        -0.26629742,  -0.22392536,  -0.18829536,   0.        ])
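model = PySRRegressor(
    binary_operators=["+", "*", "-", "^"],
    loss="loss(x, y) = (x - y)^2",
)
# The original post is cut off here; the call below is reconstructed from its
# surviving tail (", 1), e)") and is an assumption.
model.fit(r.reshape(-1, 1), e)
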
MilesCranmer commented 1 year ago

Okay I think I might understand the issue. Does the error come during precompilation for you? Or during the search itself?

Can you try changing the loss function, rather than the evaluation code?

See ^ this is how you should do custom objectives, rather than changing the evaluation function itself.
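
For illustration, a custom objective of the kind suggested here might look roughly like the sketch below. It assumes the `full_objective` parameter of `PySRRegressor` (the name used later in this thread; other PySR releases may name it differently) and follows the general shape of the custom-objective example in the PySR documentation. The names `FULL_OBJECTIVE`, `my_objective`, and `build_model` are placeholders, and the mean squared error is only a stand-in for whatever custom logic is actually needed.

```python
# Julia source for a custom objective, passed to PySR as a plain string.
# `my_objective` is a placeholder name, not something PySR requires.
FULL_OBJECTIVE = """
function my_objective(tree, dataset::Dataset{T,L}, options)::L where {T,L}
    # Evaluate the candidate expression on the data PySR was given.
    prediction, completed = eval_tree_array(tree, dataset.X, options)
    # Penalize expressions whose evaluation did not complete (e.g. NaN or Inf).
    if !completed
        return L(Inf)
    end
    # Custom logic goes here; this sketch just uses the mean squared error.
    return sum((prediction .- dataset.y) .^ 2) / dataset.n
end
"""


def build_model():
    """Build a regressor that uses the custom objective (requires PySR)."""
    # Imported lazily so the objective string can be inspected without PySR.
    from pysr import PySRRegressor

    return PySRRegressor(
        binary_operators=["+", "*", "-", "^"],
        full_objective=FULL_OBJECTIVE,
    )
```

With PySR installed, `build_model().fit(X, y)` would then run the search with this objective, and the library's own `eval_tree_array` stays untouched.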

DenisSvirin commented 1 year ago

Precompilation works fine; the error occurs during the search.

Oh, then I'll try it this way.

DenisSvirin commented 1 year ago

It seems that everything works fine this way.

MilesCranmer commented 1 year ago

So the bug is fixed with this change?

DenisSvirin commented 1 year ago

Yes, if I do it with full_objective.

MilesCranmer commented 1 year ago


It's probably because there are other pieces of code that rely on eval_tree_array. It is a generic function for evaluating an expression and shouldn't be overloaded. eval_loss is for these purposes (which is what full_objective overrides).
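
A rough way to picture the failure is the pure-Python toy below. It is not PySR or DynamicExpressions code: `eval_feature`, `eval_generic`, `eval_overridden`, and all data are hypothetical. It assumes that a feature is looked up as a row of a features-by-samples matrix, which is consistent with the `[4, 1:11]` index in the log above. Once the evaluator ignores the matrix it is given and uses a fixed one-row matrix, any other caller that asks for feature 4 goes out of bounds.

```python
def eval_feature(feature, X):
    """Return row `feature` (1-based, as in Julia) of a features-by-samples matrix."""
    if not 1 <= feature <= len(X):
        raise IndexError(
            f"attempt to access {len(X)}x{len(X[0])} matrix at row {feature}"
        )
    return X[feature - 1]


def eval_generic(feature, X):
    """Well-behaved evaluator: uses the matrix the caller passed in."""
    return eval_feature(feature, X)


# One feature, three samples, baked into the evaluator (hypothetical values).
HARDCODED = [[6.3, 5.1, 2.7]]


def eval_overridden(feature, X):
    """Mimics the patched function from the thread: ignores X entirely."""
    return eval_feature(feature, HARDCODED)


# Some other caller evaluates feature 4 on its own five-feature data.
X_other = [[float(10 * i + j) for j in range(3)] for i in range(1, 6)]

ok = eval_generic(4, X_other)  # works: row 4 exists in X_other

try:
    eval_overridden(4, X_other)
    failed = False
except IndexError as err:
    failed = True
    message = str(err)  # the fixed matrix has only one row
```

In this toy, the generic evaluator returns the fourth row of the caller's data, while the overridden one raises an out-of-bounds error regardless of what the caller passed in.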