Open TreeOfLearning opened 3 months ago
Hi @TreeOfLearning, thanks for creating an issue! I understand you can't provide the data, but please try to provide a synthetic data example that reproduces the issue. Otherwise it will be much harder for us to verify whether this issue still exists and how to fix it.
Please also provide the training logs of the models for reference.
Bug Report Checklist
Describe the bug
I am getting the error
ValueError: Input contains infinity or a value too large for dtype('float32')
when calling predict on a successfully trained TabularPredictor. I have triple-checked that the data I am providing is sanitised in such a way that there are no values above the limit of a float32, no infinity values, and no NaNs. Therefore, the problem must be within the predictor somewhere.
Based on the error, it looks like it could be an issue with the y scaler. Could it be that the scaler is being fit to the training data but then produces values too large for the predictor to actually use? If this is the case, I'd expect those values to be clipped to an acceptable range rather than the prediction simply failing.
For what it's worth, if I use a different subset of data, I do not have this issue, so clearly there are some values causing an issue.
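For anyone trying to narrow this down, the sanitisation check described above can be sketched roughly as follows. This is a minimal sketch, assuming the features live in a pandas DataFrame. The helper name find_float32_unsafe and the toy data are hypothetical and are not part of AutoGluon or of the original report.

```python
import numpy as np
import pandas as pd


def find_float32_unsafe(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per numeric column, count values that are NaN,
    infinite, or finite but larger in magnitude than float32 can hold."""
    # Work in float64 so that large values are not silently turned into inf.
    numeric = df.select_dtypes(include=[np.number]).astype(np.float64)
    f32_max = np.finfo(np.float32).max

    report = pd.DataFrame({
        "nan": numeric.isna().sum(),
        "inf": np.isinf(numeric).sum(),
        # Finite in float64, but would overflow when cast to float32.
        "over_float32": (np.isfinite(numeric) & (numeric.abs() > f32_max)).sum(),
    })
    # Keep only the columns that have at least one problematic value.
    return report[report.sum(axis=1) > 0]


# Toy data (made up for illustration): "a" has an inf and a NaN,
# "b" has a value beyond the float32 range, "c" is clean.
df = pd.DataFrame({
    "a": [1.0, np.inf, np.nan],
    "b": [1e39, 2.0, 3.0],
    "c": [1.0, 2.0, 3.0],
})
report = find_float32_unsafe(df)
print(report)
```

An empty report only shows that the raw inputs are float32-safe. It does not rule out an overflow produced by a transformation applied inside the predictor, which is the scenario suspected above.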
Expected behavior
I expect to be able to train on a subset of my data and then predict on the remaining data, and for that prediction to succeed.
To Reproduce
I can't provide the data as it is proprietary and sensitive. However, here is the code with which I am training the predictor and then calling predict:
Screenshots / Logs
Installed Versions