Closed nickotto closed 7 months ago
While trying to print some of the evaluated individuals columns, I noticed a simple bug in how str was defined for the graph individuals. It normally works by exporting a pipeline and then printing the string for that, but the export can fail if the hyperparameters are invalid. I just added a try-except block to catch those cases. (This will change again in the next update, so I didn't want to make a whole new PR for it.)
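For reference, the try-except fallback could look roughly like the sketch below. The class, its `export_pipeline` method, and the fallback text are hypothetical stand-ins for illustration, not the actual tpot2 code.

```python
class GraphIndividualSketch:
    """Hypothetical stand-in for a graph individual; not the tpot2 class."""

    def __init__(self, hyperparameters):
        self.hyperparameters = hyperparameters

    def export_pipeline(self):
        # Stand-in for the real export step: it fails on invalid hyperparameters.
        c = self.hyperparameters["C"]
        if c <= 0:
            raise ValueError("C must be positive")
        return f"Pipeline(C={c})"

    def __str__(self):
        try:
            # Normal path: the string comes from the exported pipeline.
            return str(self.export_pipeline())
        except Exception:
            # Fallback: describe the individual without needing a valid pipeline.
            return f"<unexportable individual: {self.hyperparameters!r}>"


print(GraphIndividualSketch({"C": 1.0}))
print(GraphIndividualSketch({"C": -1.0}))
```

Catching a broad `Exception` is a deliberate choice in this sketch, since printing an individual should never be the thing that crashes a run.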
There is an edge case where, if an individual is created but its evaluation is incomplete, the value in the "Eval Error" column is np.nan instead of None. This can happen if the global timeout (max_time_seconds) is triggered. We don't want to label those as "timeout", since that should be reserved for individuals that exceed max_eval_time_seconds.
But I'm not sure whether we can change the default missing value in pandas to None, and pandas doesn't let us add NaNs at the same time we add the error strings. I think we just leave it as is for now?
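If we do leave it as is, downstream code can still treat both missing markers the same way. A minimal sketch, assuming an object-dtype column shaped like "Eval Error"; the error labels here are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical "Eval Error" values: None for a clean evaluation, np.nan for an
# individual whose evaluation never finished, and strings for recorded errors.
eval_error = pd.Series([None, np.nan, "INVALID", "TIMEOUT"], dtype=object)

# Series.isna treats None and np.nan alike, so neither counts as a recorded error.
no_error_recorded = eval_error.isna()
print(no_error_recorded.tolist())  # [True, True, False, False]
```

Checking with `isna()` rather than `is None` means the np.nan edge case does not need special handling.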
What does this PR do?
Creates a new column, "Eval Error", in evaluated individuals. This keeps the score columns as floats, with no strings mixed in. All evaluation errors now go in the "Eval Error" column, which has object dtype. This should resolve the incompatibilities with pandas 2.0+. Updated for both the steady state and base estimator versions of tpot2.
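The idea, as a rough sketch with made-up data (the "score" column name and the error label are illustrative assumptions, not the exact tpot2 schema):

```python
import numpy as np
import pandas as pd

# Hypothetical evaluation results: (score, error label or None).
results = [(0.91, None), (np.nan, "INVALID"), (0.87, None)]

evaluated = pd.DataFrame({
    # Scores stay numeric; a failed evaluation is NaN rather than a string.
    "score": pd.Series([score for score, _ in results], dtype=float),
    # Error labels live in their own object-dtype column.
    "Eval Error": pd.Series([error for _, error in results], dtype=object),
})

print(evaluated.dtypes)
# Numeric operations work directly because the score column holds no strings.
print(evaluated["score"].mean())
```

Because the score column never receives a string, pandas has no reason to complain about writing a value of an incompatible dtype into a float column.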
Where should the reviewer start?
Simply run tpot2 with single and multiple objectives. The incompatibility warning we saw before with pandas 2.0+ should no longer appear.
How should this PR be tested?
Run across different Python versions.