Autogluon 0.7.0 vs Autogluon 1.1.0 performance degradation on simple regression task

Bug Report Checklist

[x] I provided code that demonstrates a minimal reproducible example.
[x] I confirmed bug exists on the latest mainline of AutoGluon via source install.
[x] I confirmed bug exists on the latest stable version of AutoGluon.

Describe the bug When running the regression task multiple times with the equation y = 2*x+5, Autogluon 1.1.0 consistently performs worse compared to Autogluon 0.7.0.

Expected behavior I expected Autogluon 1.1.0 to perform at least as well as Autogluon 0.7.0 on this simple mathematical function regression task. (with a high R2_score)

To Reproduce I've prepared code snippets to reproduce the issue. Please find them below. Generate dataset + run AutoGluon Tabular:

import numpy as np
import pandas as pd
num_samples = 20000
main_x_min = 0
main_x_max = 100
few_x_min = 2000
few_x_max = 10000
few_samples_ratio = 0.0002  # 0.2% of samples above 2000

# Generate most x values between 0 and 100
main_x_samples = int(num_samples * (1 - few_samples_ratio))
x_main = np.random.uniform(main_x_min, main_x_max, main_x_samples)
x_main[-1] = -1.0
# Generate a few x values above 2000
few_x_samples = num_samples - main_x_samples
x_few = np.random.uniform(few_x_min, few_x_max, few_x_samples)

# Combine both ranges of x values
x = np.concatenate((x_main, x_few))

# Increase noise for x and generate noisy x values
x_noise = np.random.normal(1, 0.0, num_samples)  # smaller noise for x
x_noisy = x * x_noise

# Generate y values based on the function y = 2x + 5 with increased noise
y_noise = np.random.normal(1, 0.0, num_samples)  # smaller noise for y
y = (2 * x_noisy + 5) * y_noise
data = pd.DataFrame({'x': x_noisy, 'y': y})
data.to_csv('regression_dataset.csv', index=False)

from autogluon.tabular import TabularDataset, TabularPredictor
from autogluon.tabular import __version__
print("Autogluon tabular version:",__version__)

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
import pandas as pd
dataset = pd.read_csv('regression_dataset.csv')
training_dataset, non_training_dataset = train_test_split(dataset, test_size=0.3)
# Load training and non-training datasets
label = 'y'
train_data = TabularDataset(training_dataset)
test_data = TabularDataset(non_training_dataset)

# Fit the TabularPredictor on the training data
predictor = TabularPredictor(label=label,problem_type='regression').fit(train_data)

# Predictions on test data
predictions_test = predictor.predict(test_data)
test_labels = test_data[label]

# Calculate R2 score for test data
r2_test = r2_score(test_labels, predictions_test)
print("R2 Score test:", r2_test)

# Calculate MSE for test data
mse_test = mean_squared_error(test_labels, predictions_test)
print("MSE test:", mse_test)

# Predictions on train data
predictions_train = predictor.predict(train_data)
train_labels = train_data['y']

# Calculate R2 score for train data
r2_train = r2_score(train_labels, predictions_train)
print("R2 Score train:", r2_train)

# Calculate MSE for train data
mse_train = mean_squared_error(train_labels, predictions_train)
print("MSE train:", mse_train)

Screenshots / Logs AutoGluon 0.7.0's performance

AutoGluon 1.1.0's performance

Installed Versions

INSTALLED VERSIONS
------------------
date                   : 2024-06-08
time                   : 00:34:13.849737
python                 : 3.10.14.final.0
OS                     : Linux
OS-release             : 6.5.0-35-generic
Version                : #35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue May  7 09:00:52 UTC 2
machine                : x86_64
processor              : x86_64
num_cores              : 8
cpu_ram_mb             : 64198.94921875
cuda version           : 12.525.147.05
num_gpus               : 2
gpu_ram_mb             : [7790, 11163]
avail_disk_size_mb     : 34986

accelerate             : 0.21.0
autogluon              : None
autogluon.common       : 1.1.0
autogluon.core         : 1.1.0
autogluon.features     : 1.1.0
autogluon.multimodal   : 1.1.0
autogluon.tabular      : 1.1.0
boto3                  : 1.34.69
catboost               : 1.2.5
defusedxml             : 0.7.1
evaluate               : 0.4.2
fastai                 : 2.7.15
hyperopt               : 0.2.7
imodels                : None
jinja2                 : 3.1.3
jsonschema             : 4.21.1
lightgbm               : 4.1.0
lightning              : 2.1.4
matplotlib             : 3.7.1
networkx               : 3.3
nlpaug                 : 1.1.11
nltk                   : 3.8.1
nptyping               : 2.4.1
numpy                  : 1.24.4
nvidia-ml-py3          : 7.352.0
omegaconf              : 2.2.3
onnxruntime-gpu        : None
openmim                : 0.3.9
pandas                 : 2.0.0
pdf2image              : 1.17.0
Pillow                 : 10.2.0
psutil                 : 5.9.4
pytesseract            : 0.3.10
pytorch-metric-learning: 2.3.0
ray                    : 2.10.0
requests               : 2.28.2
scikit-image           : 0.20.0
scikit-learn           : 1.3.0
scikit-learn-intelex   : None
scipy                  : 1.9.1
seqeval                : 1.2.2
setuptools             : 60.2.0
skl2onnx               : None
tabpfn                 : None
tensorboard            : 2.16.2
text-unidecode         : 1.3
timm                   : 0.9.16
torch                  : 2.1.2+cpu
torchmetrics           : 1.2.1
torchvision            : 0.16.2+cpu
tqdm                   : 4.65.2
transformers           : 4.38.2
vowpalwabbit           : 8.10.1
xgboost                : 2.0.3

autogluon / autogluon

Autogluon 0.7.0 vs Autogluon 1.1.0 performance degradation on simple regression task #4255