Jgmedina95 opened this issue 2 years ago
Hi @Jgmedina95,
Thanks for raising this! This is because `@inbounds` is used throughout the evaluation for faster computation. I make an in-bounds assumption since trees created during the search should never reference features that are not available in the dataset. However, as you point out, I don't think it would hurt to check before running the evaluation, so maybe we should do that. (It could be that a user saves their search state, then re-runs with a different dataset, and is surprised when trees access undefined memory?)
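To illustrate the difference (a toy example, not the actual evaluation code):

```julia
X = rand(Float32, 3, 100)   # a dataset with 3 features

f = 7                       # a feature index that does not exist in X

checked(X, f, i)   = X[f, i]             # bounds-checked: throws a BoundsError
unchecked(X, f, i) = @inbounds X[f, i]   # check elided: may silently read invalid memory
```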
I just looked at the code, and it seems that `@inbounds` is actually not used in some places, such as:
https://github.com/MilesCranmer/SymbolicRegression.jl/blob/6075f13c3e8fbb8b16686a6f7c1157f0235174ee/src/EvaluateEquation.jl#L163
Whereas `@inbounds` is used in the functions that fuse operators together, like here:
https://github.com/MilesCranmer/SymbolicRegression.jl/blob/6075f13c3e8fbb8b16686a6f7c1157f0235174ee/src/EvaluateEquation.jl#L189
So for some trees this would raise an error, while for others it would not. I will think more about whether we should add a bounds check.
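Something along these lines is what I have in mind (a rough sketch, assuming the usual `degree`/`constant`/`feature`/`l`/`r` fields on `Node`; the function names are illustrative):

```julia
# Walk the tree and return the largest feature index it references.
function max_feature(tree)
    if tree.degree == 0
        return tree.constant ? 0 : tree.feature
    elseif tree.degree == 1
        return max_feature(tree.l)
    else
        return max(max_feature(tree.l), max_feature(tree.r))
    end
end

# One cheap check before evaluation, instead of per-element bounds checks:
function assert_features_in_bounds(tree, X::AbstractMatrix)
    m = max_feature(tree)
    m <= size(X, 1) || throw(ArgumentError(
        "tree references feature $m, but X only has $(size(X, 1)) features"))
    return nothing
end
```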
Best, Miles
Hi, I'm working on a process that uses the equations returned in the Pareto frontier. I was playing around with it and found the following: setting up a tree and evaluating it on a dataset with 6 features gives no problem, as expected.
But if I create a tree that uses more features than the original dataset has, I expected an error, yet `eval_tree_array` still gave an output with `completion == true`. I've realized this won't impact what I'm doing, since trees made by the program will never (right?) use more features than the dataset, but I suppose it's an interesting enough bug to share.
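A minimal sketch of the two cases (the `Node` constructors and `Options` arguments here are illustrative assumptions, not the exact code I ran):

```julia
using SymbolicRegression

options = Options(binary_operators=(+, *), unary_operators=(cos,))

X = randn(Float32, 6, 100)   # 6 features, 100 rows

# Tree that only uses existing features (assumed constructors: Node(; feature=i)
# for a variable node, Node(op, l, r) for a binary operator node):
ok_tree = Node(1, Node(; feature=1), Node(; feature=6))    # roughly x1 + x6
out, completed = eval_tree_array(ok_tree, X, options)
# completed == true, as expected

# Tree that references feature 9, which X does not have:
bad_tree = Node(1, Node(; feature=1), Node(; feature=9))   # roughly x1 + x9
out, completed = eval_tree_array(bad_tree, X, options)
# also "succeeds": completed == true, no error raised
```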