amazon-science / chronos-forecasting

Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
https://arxiv.org/abs/2403.07815
Apache License 2.0

[BUG] ValueError: malformed node or string when fine tuning example #151

Closed williamrodz closed 4 weeks ago

williamrodz commented 1 month ago

Hello! I was attempting to follow the example fine-tuning exercise. However, when I run the train.py script, I quickly get a "malformed node or string" error.

Is a CUDA-compatible GPU required for training? Or is this something else in my setup?

Describe the bug

This is the error stack:

File "/my-path/chronos-forecasting/scripts/training/train.py", line 692, in <module>
    app()
  File "/opt/homebrew/lib/python3.12/site-packages/typer/main.py", line 326, in __call__
    raise e
  File "/opt/homebrew/lib/python3.12/site-packages/typer/main.py", line 309, in __call__
    return get_command(self)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/typer/core.py", line 661, in main
    return _main(
           ^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/typer/core.py", line 193, in _main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/typer/main.py", line 692, in wrapper
    return callback(**use_params)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.12/site-packages/typer_config/decorators.py", line 92, in wrapped
    return cmd(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/williama/Library/Mobile Documents/com~apple~CloudDocs/William's Cloud/Oxford/Dissertation/chronos-forecasting/scripts/training/train.py", line 563, in main
    training_data_paths = ast.literal_eval(training_data_paths)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 112, in literal_eval
    return _convert(node_or_string)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 104, in _convert
    left = _convert_signed_num(node.left)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 85, in _convert_signed_num
    return _convert_num(node)
           ^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 76, in _convert_num
    _raise_malformed_node(node)
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/ast.py", line 73, in _raise_malformed_node
    raise ValueError(msg + f': {node!r}')
ValueError: malformed node or string on line 1: <ast.BinOp object at 0x30a1455d0>
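For context, `ast.literal_eval` only accepts Python literal syntax. A plausible (hedged) reading of the traceback is that the list of paths reaches `literal_eval` without quotes around the strings, so the parser treats characters in the path as arithmetic operators and produces an `ast.BinOp`, which `literal_eval` rejects. A minimal reproduction of the same error class, using a hypothetical hyphenated filename:

```python
import ast

# An unquoted token containing a hyphen parses as subtraction
# (an ast.BinOp node), which ast.literal_eval rejects with
# "malformed node or string" -- the same error as in the traceback.
try:
    ast.literal_eval("my-data.arrow")
except ValueError as e:
    print(e)

# Properly quoted, the same path round-trips as a string literal:
print(ast.literal_eval("['my-data.arrow']"))  # ['my-data.arrow']
```

This suggests checking how the path strings are quoted by the time they reach `train.py`, rather than anything CUDA-related.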

And this is my config, training/configs/chronos-t5-small.yaml:

training_data_paths:
- "./data_for_ft.arrow"
probability:
- 1.0
context_length: 512
prediction_length: 64
min_past: 60
max_steps: 200_000
save_steps: 100_000
log_steps: 500
per_device_train_batch_size: 32
learning_rate: 0.001
optim: adamw_torch_fused
num_samples: 20
shuffle_buffer_length: 100_000
gradient_accumulation_steps: 1
model_id: google/t5-efficient-small
model_type: seq2seq
random_init: true
tie_embeddings: true
output_dir: ./output/
tf32: true
torch_compile: true
tokenizer_class: "MeanScaleUniformBins"
tokenizer_kwargs:
  low_limit: -15.0
  high_limit: 15.0
n_tokens: 4096
lr_scheduler_type: linear
warmup_ratio: 0.0
dataloader_num_workers: 1
max_missing_prop: 0.9
use_eos_token: true

Expected behavior I expected training to complete successfully.

To reproduce

  1. Install the chronos package with the training extra.
  2. Generate sample time series and convert them to Arrow format per the example (https://github.com/amazon-science/chronos-forecasting/tree/main/scripts)
  3. Modify training/configs/chronos-t5-small.yaml to add the data_for_ft.arrow file path with 1.0 probability
  4. Run the training command pointing to the config:
    python training/train.py training/configs/chronos-t5-small.yaml

Environment description

  * Operating system: macOS 14.5
  * Python version: 3.12
  * CUDA version: None (running on a Mac M1 Pro CPU)
  * PyTorch version: 2.3.0
  * HuggingFace transformers version: 4.40.2
  * HuggingFace accelerate version: 0.30.1

abdulfatir commented 1 month ago

Yes, you'd typically need a CUDA-compatible GPU for training and fine-tuning. You may be able to modify the scripts to support CPU fine-tuning, but it would likely be quite slow.
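As a hedged first step toward a CPU attempt: `tf32` and `torch_compile` in the posted config are oriented toward NVIDIA GPUs, so disabling them is likely necessary (though possibly not sufficient, since the script may have other device assumptions) for a CPU run:

```yaml
# Assumption: these two options target NVIDIA GPUs; turn them off
# in training/configs/chronos-t5-small.yaml when running on CPU.
tf32: false
torch_compile: false
```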

abdulfatir commented 4 weeks ago

@williamrodz Closing this due to inactivity. Please feel free to re-open if you have questions.