MilesCranmer / PySR

High-Performance Symbolic Regression in Python and Julia
https://astroautomata.com/PySR
Apache License 2.0

BoundsError in custom eval_tree_array #337

Closed DenisSvirin closed 1 year ago

DenisSvirin commented 1 year ago

What happened?

I tried to run '_eval_tree_array' with an array whose dimensions differ from those of 'cX' in 'eval_tree_array'.

If I understand correctly, after the test runs I get the error shown in the log output below.

Version

0.18.0

Operating System

macOS

Interface

Jupyter Notebook

Relevant log output

BoundsError: attempt to access 1×11 Matrix{Float32} at index [4, 1:11]

Extra Info

My version of 'eval_tree_array':

function eval_tree_array(
    tree::Node{T}, cX::AbstractMatrix{T}, operators::OperatorEnum; turbo::Bool=false
)::Tuple{AbstractVector{T},Bool} where {T<:Number}
    if turbo
        @assert T in (Float32, Float64)
    end
    # hardcoded 1×11 matrix used in place of the cX argument
    r = T.([6.332061275761631 5.051237472144821 2.7 4.2690748412273125 3.3068111527572905 5.4 4.676537180435969 6.037383539249433 5.727564927611035 3.818376618407357 1.9091883092036785])
    result, finished = _eval_tree_array(
        tree, r, operators, (turbo ? Val(true) : Val(false))
    )

    @return_on_false finished result
    @return_on_nonfinite_array result  
    return result, finished
end
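
(For context: '[a b c]' in Julia builds a 1×11 Matrix, so the hardcoded 'r' has a single feature row. Any call that reaches this method with a tree referencing a feature beyond x1 will then index a row that doesn't exist, which is exactly what the BoundsError above reports. A minimal sketch of the same failure in plain Julia, independent of PySR:)

r = ones(Float32, 1, 11)  # 1 feature × 11 samples, like the hardcoded matrix
r[4, 1:11]                # BoundsError: row 4 does not exist in a 1×11 matrix
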
MilesCranmer commented 1 year ago

Could you share your full code, so I can reproduce it?

DenisSvirin commented 1 year ago

I've changed 'eval_tree_array' in the backend:

function eval_tree_array(
    tree::Node{T}, cX::AbstractMatrix{T}, operators::OperatorEnum; turbo::Bool=false
)::Tuple{AbstractVector{T},Bool} where {T<:Number}
    if turbo
        @assert T in (Float32, Float64)
    end
    # hardcoded 1×11 matrix used in place of the cX argument
    r = T.([6.332061275761631 5.051237472144821 2.7 4.2690748412273125 3.3068111527572905 5.4 4.676537180435969 6.037383539249433 5.727564927611035 3.818376618407357 1.9091883092036785])
    result, finished = _eval_tree_array(
        tree, r, operators, (turbo ? Val(true) : Val(false))
    )
    @return_on_false finished result
    @return_on_nonfinite_array result
    return result, finished
end

And to start it, I use this Python code:

import numpy as np
from pysr import PySRRegressor
r = np.array([ 2.7       ,  3.        ,  3.2       ,  3.4       ,  3.48621994,
        3.52253473,  3.55884952,  3.59516431,  3.6314791 ,  3.66779389,
        3.70410868,  3.74042347,  3.77673827,  3.81305306,  4.        ,
        4.5       ,  5.        ,  5.5       ,  5.73684211,  5.97368421,
        6.        ,  6.21052632,  6.44736842,  6.68421053,  6.92105263,
        7.15789474,  7.39473684,  7.63157895,  7.86842105,  8.        ,
        8.10526316,  8.34210526,  8.57894737,  8.81578947,  9.05263158,
        9.28947368,  9.52631579,  9.76315789, 10.        , 35.        ]) 
e = np.array([ 20.        ,  -2.3341443 , -10.378765  , -13.874788  ,
       -14.534859  , -14.705036  , -14.821358  , -14.8896    ,
       -14.934813  , -14.90294   , -14.857485  , -14.782236  ,
       -14.681216  , -14.557151  , -13.653582  , -10.439137  ,
        -7.4652775 ,  -5.1739723 ,  -4.26172661,  -3.58361976,
        -3.5250227 ,  -3.01341022,  -2.53392988,  -2.1307423 ,
        -1.79170813,  -1.50661955,  -1.26689299,  -1.06531065,
        -0.89580319,  -0.78888653,  -0.753267  ,  -0.63341053,
        -0.53262508,  -0.44787616,  -0.37661211,  -0.31668728,
        -0.26629742,  -0.22392536,  -0.18829536,   0.        ])
model = PySRRegressor(
    model_selection="best",
    niterations=40,
    binary_operators=["+", "*", "-", "^"],
    loss="loss(x, y) = (x - y)^2")
model.fit(r.reshape(-1, 1), e)
MilesCranmer commented 1 year ago

Okay I think I might understand the issue. Does the error come during precompilation for you? Or during the search itself?

Can you try changing the loss function, rather than the evaluation code?

See https://astroautomata.com/PySR/examples/#9-custom-objectives. That page shows how you should implement custom objectives, rather than changing the evaluation function itself.
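
For reference, the documented pattern looks roughly like this (a sketch based on that docs page; the Julia function is passed as a string through the 'full_objective' argument of PySRRegressor):

function eval_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
    # Evaluate the candidate expression on the dataset's own features
    prediction, flag = eval_tree_array(tree, dataset.X, options)
    # Evaluation can fail (e.g. produce non-finite values); return infinite loss
    !flag && return L(Inf)
    # Mean squared error against the targets
    return sum((prediction .- dataset.y) .^ 2) / dataset.n
end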

DenisSvirin commented 1 year ago

Precompilation works fine; the error occurs during the search.

Oh, then I'll try it this way.

DenisSvirin commented 1 year ago

It seems that everything works fine this way.

MilesCranmer commented 1 year ago

So the bug is fixed with this change?

DenisSvirin commented 1 year ago

Yes, if I do it with 'full_objective'.

MilesCranmer commented 1 year ago

Great!

It's probably because other pieces of code rely on 'eval_tree_array'. It is a generic function for evaluating an expression and shouldn't be overloaded; 'eval_loss' is the function meant for these purposes (and it is what 'full_objective' overrides).
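
For completeness, a minimal sketch of the original goal done the recommended way: the fixed evaluation grid lives inside a custom objective instead of being patched into 'eval_tree_array'. The name 'fixed_grid_objective' is hypothetical, and since the thread doesn't say how the grid predictions should enter the loss, this version simply rejects expressions that fail to evaluate on the grid and otherwise returns an ordinary mean squared error:

function fixed_grid_objective(tree, dataset::Dataset{T,L}, options)::L where {T,L}
    # The fixed 1×11 evaluation grid from the original report, now local
    # to the objective rather than hardcoded into the generic evaluator
    r = T.([6.332061275761631 5.051237472144821 2.7 4.2690748412273125 3.3068111527572905 5.4 4.676537180435969 6.037383539249433 5.727564927611035 3.818376618407357 1.9091883092036785])

    # Reject any candidate that fails to evaluate on the grid
    grid_pred, grid_ok = eval_tree_array(tree, r, options)
    !grid_ok && return L(Inf)

    # Ordinary mean-squared-error fit on the training data
    prediction, flag = eval_tree_array(tree, dataset.X, options)
    !flag && return L(Inf)
    return sum((prediction .- dataset.y) .^ 2) / dataset.n
end

This function would be passed from Python as a string, e.g. PySRRegressor(full_objective="..."), exactly as in the custom-objectives example linked above.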