I get an AttributeError when fitting an AutoGluon time series model in Databricks. The error occurs when calling .fit() on the TimeSeriesPredictor instance. I made no changes to the code after cloning the repo into Databricks.

Databricks cluster configuration

Policy: Single User
Databricks Runtime Version: 13.3 LTS (includes Apache Spark 3.4.1, Scala 2.12)
Use photon acceleration
Worker type: E8as_v4 (workers=1, 64 GB memory, 8 cores)
Driver type: E8as_v4 (64 GB memory, 8 cores)

Results from running the following commands:

from autogluon.core.utils import show_versions
show_versions()

INSTALLED VERSIONS
------------------
date                   : 2024-07-24
time                   : 16:46:40.799450
python                 : 3.10.12.final.0
OS                     : Linux
OS-release             : 5.15.0-1067-azure
Version                : #76~20.04.1-Ubuntu SMP Thu Jun 13 18:00:23 UTC 2024
machine                : x86_64
processor              : x86_64
num_cores              : 8
cpu_ram_mb             : 58770.0
cuda version           : None
num_gpus               : 0
gpu_ram_mb             : []
avail_disk_size_mb     : 213044

accelerate             : 0.21.0
autogluon              : 1.1.1
autogluon.common       : 1.1.1
autogluon.core         : 1.1.1
autogluon.features     : 1.1.1
autogluon.multimodal   : 1.1.1
autogluon.tabular      : 1.1.1
autogluon.timeseries   : 1.1.1
boto3                  : 1.24.28
catboost               : 1.2.5
defusedxml             : 0.7.1
evaluate               : 0.4.2
fastai                 : 2.7.15
gluonts                : 0.15.1
hyperopt               : 0.2.7
imodels                : None
jinja2                 : 3.1.4
joblib                 : 1.2.0
jsonschema             : 4.21.1
lightgbm               : 4.3.0
lightning              : 2.3.3
matplotlib             : 3.5.2
mlforecast             : 0.10.0
networkx               : 3.3
nlpaug                 : 1.1.11
nltk                   : 3.8.1
nptyping               : 2.4.1
numpy                  : 1.24.4
nvidia-ml-py3          : 7.352.0
omegaconf              : 2.2.3
onnxruntime-gpu        : None
openmim                : 0.3.9
optimum                : 1.18.1
optimum-intel          : None
orjson                 : 3.10.6
pandas                 : 2.2.2
pdf2image              : 1.17.0
Pillow                 : 10.4.0
psutil                 : 5.9.0
pytesseract            : 0.3.10
pytorch-lightning      : 2.3.3
pytorch-metric-learning: 2.3.0
ray                    : 2.10.0
requests               : 2.32.3
scikit-image           : 0.20.0
scikit-learn           : 1.4.0
scikit-learn-intelex   : None
scipy                  : 1.9.1
seqeval                : 1.2.2
setuptools             : 63.4.1
skl2onnx               : None
statsforecast          : 1.4.0
tabpfn                 : None
tensorboard            : 2.17.0
text-unidecode         : 1.3
timm                   : 0.9.16
torch                  : 2.3.1
torchmetrics           : 1.2.1
torchvision            : 0.18.1
tqdm                   : 4.66.4
transformers           : 4.39.3
utilsforecast          : 0.0.10
vowpalwabbit           : None
xgboost                : 2.0.3

Here's the output from the run:

Beginning AutoGluon training... Time limit = 800s
AutoGluon will save models to 'model'
=================== System Info ===================
AutoGluon Version: 1.1.1
Python Version: 3.10.12
Operating System: Linux
Platform Machine: x86_64
Platform Version: #76~20.04.1-Ubuntu SMP Thu Jun 13 18:00:23 UTC 2024
CPU Count: 8
GPU Count: 0
Memory Avail: 39.17 GB / 57.39 GB (68.2%)
Disk Space Avail: 208.05 GB / 250.92 GB (82.9%)

Setting presets to: medium_quality

Fitting with arguments: {'enable_ensemble': True, 'eval_metric': SMAPE, 'freq': 'D', 'hyperparameters': 'light', 'known_covariates_names': [], 'num_val_windows': 1, 'prediction_length': 30, 'quantile_levels': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9], 'random_seed': 123, 'refit_every_n_windows': 1, 'refit_full': False, 'skip_model_selection': False, 'target': 'target', 'time_limit': 800, 'verbosity': 2}

And here's the error message:

File , line 1
----> 1 sv_predictor.fit(
      2     train_data,
      3     time_limit=800,
      4     presets="medium_quality"
      5 )

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-bb2e512f-4367-4c0c-875d-9a07e849fbca/lib/python3.10/site-packages/autogluon/core/utils/decorators.py:31, in unpack.<locals>._unpack_inner.<locals>._call(*args, **kwargs)
     28 @functools.wraps(f)
     29 def _call(*args, **kwargs):
     30     gargs, gkwargs = g(*other_args, *args, **kwargs)
---> 31     return f(*gargs, **gkwargs)

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-bb2e512f-4367-4c0c-875d-9a07e849fbca/lib/python3.10/site-packages/autogluon/timeseries/predictor.py:701, in TimeSeriesPredictor.fit(self, train_data, tuning_data, time_limit, presets, hyperparameters, hyperparameter_tune_kwargs, excluded_model_types, num_val_windows, val_step_size, refit_every_n_windows, refit_full, enable_ensemble, skip_model_selection, random_seed, verbosity)
    698 logger.info("\nFitting with arguments:")
    699 logger.info(f"{pprint.pformat({k: v for k, v in fit_args.items() if v is not None})}\n")
--> 701 train_data = self._check_and_prepare_data_frame(train_data, name="train_data")
    702 logger.info(f"Provided train_data has {self._get_dataset_stats(train_data)}")
    704 if val_step_size is None:

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-bb2e512f-4367-4c0c-875d-9a07e849fbca/lib/python3.10/site-packages/autogluon/timeseries/predictor.py:314, in TimeSeriesPredictor._check_and_prepare_data_frame(self, data, name)
    312     logger.info(f"Inferred time series frequency: '{df.freq}'")
    313 else:
--> 314     if df.freq != self.freq:
    315         logger.warning(f"{name} with frequency '{df.freq}' has been resampled to frequency '{self.freq}'.")
    316         df = df.convert_frequency(freq=self.freq)

File /databricks/python/lib/python3.10/site-packages/pandas/core/generic.py:5575, in NDFrame.__getattr__(self, name)
   5568 if (
   5569     name not in self._internal_names_set
   5570     and name not in self._metadata
   5571     and name not in self._accessors
   5572     and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5573 ):
   5574     return self[name]
-> 5575 return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'freq'
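
One detail in the traceback that may be worth checking: the AutoGluon frames are loaded from the notebook-scoped environment under /local_disk0/.ephemeral_nfs/envs/, while the pandas frame is loaded from /databricks/python/lib/. The show_versions() output above reports pandas 2.2.2, but that figure may come from installed package metadata rather than from the module that is actually imported. Below is a minimal diagnostic sketch, not a confirmed root cause, that compares the two. The helper name describe_module is hypothetical and uses only the standard library.

```python
import importlib
import importlib.metadata


def describe_module(name):
    """Compare the module that actually gets imported with installed metadata.

    Hypothetical helper for illustration only; not part of AutoGluon or Databricks.
    """
    # Version according to installed package metadata (what pip recorded).
    try:
        dist_version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        dist_version = None

    # Version and file path of the module Python actually imports at runtime.
    try:
        module = importlib.import_module(name)
    except ImportError:
        return {"name": name, "path": None,
                "runtime_version": None, "dist_version": dist_version}

    return {
        "name": name,
        "path": getattr(module, "__file__", None),
        "runtime_version": getattr(module, "__version__", None),
        "dist_version": dist_version,
    }


if __name__ == "__main__":
    # If "path" points at /databricks/python/lib/... or the two versions differ,
    # the notebook is likely importing the cluster's preinstalled pandas.
    print(describe_module("pandas"))
```

If the runtime version and the metadata version disagree, restarting the Python process after installing packages would be the first thing I'd try, on the assumption that pandas was already imported before the install.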

mpolinowski / automl-gluon-tabular-data

AttributeError when running in Databricks #1