amazon-science / chronos-forecasting

Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
https://arxiv.org/abs/2403.07815
Apache License 2.0

HuggingFace interface is broken #78

Closed lampretl closed 1 month ago

lampretl commented 1 month ago

Not sure if this is the right place to open an issue, but the HuggingFace interface for using Amazon's Chronos models for time-series forecasting seems to be broken. Using the official code to import a model:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("time-series-forecasting", model="amazon/chronos-t5-large")

KeyError                                  Traceback (most recent call last)
Cell In[3], line 4
      1 # Use a pipeline as a high-level helper
      2 from transformers import pipeline
----> 4 pipe = pipeline("time-series-forecasting", model="amazon/chronos-t5-large")

File ~/anaconda3/envs/BS5E_data_science/lib/python3.9/site-packages/transformers/pipelines/__init__.py:859, in pipeline(task, model, config, tokenizer, feature_extractor, image_processor, framework, revision, use_fast, token, device, device_map, torch_dtype, trust_remote_code, model_kwargs, pipeline_class, **kwargs)
    852     pipeline_class = get_class_from_dynamic_module(
    853         class_ref,
    854         model,
    855         code_revision=code_revision,
    856         **hub_kwargs,
    857     )
    858 else:
--> 859     normalized_task, targeted_task, task_options = check_task(task)
    860     if pipeline_class is None:
    861         pipeline_class = targeted_task["impl"]

File ~/anaconda3/envs/BS5E_data_science/lib/python3.9/site-packages/transformers/pipelines/__init__.py:543, in check_task(task)
    498 def check_task(task: str) -> Tuple[str, Dict, Any]:
    499     """
    500     Checks an incoming task string, to validate it's correct and return the default Pipeline and Model classes, and
    501     default models if they exist.
   (...)
    541
    542     """
--> 543     return PIPELINE_REGISTRY.check_task(task)

File ~/anaconda3/envs/BS5E_data_science/lib/python3.9/site-packages/transformers/pipelines/base.py:1281, in PipelineRegistry.check_task(self, task)
   1278         return task, targeted_task, (tokens[1], tokens[3])
   1279     raise KeyError(f"Invalid translation task {task}, use 'translation_XX_to_YY' format")
-> 1281 raise KeyError(
   1282     f"Unknown task {task}, available tasks are {self.get_supported_tasks() + ['translation_XX_to_YY']}"
   1283 )

KeyError: "Unknown task time-series-forecasting, available tasks are ['audio-classification', 'automatic-speech-recognition', 'conversational', 'depth-estimation', 'document-question-answering', 'feature-extraction', 'fill-mask', 'image-classification', 'image-feature-extraction', 'image-segmentation', 'image-to-image', 'image-to-text', 'mask-generation', 'ner', 'object-detection', 'question-answering', 'sentiment-analysis', 'summarization', 'table-question-answering', 'text-classification', 'text-generation', 'text-to-audio', 'text-to-speech', 'text2text-generation', 'token-classification', 'translation', 'video-classification', 'visual-question-answering', 'vqa', 'zero-shot-audio-classification', 'zero-shot-classification', 'zero-shot-image-classification', 'zero-shot-object-detection', 'translation_XX_to_YY']"

# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("amazon/chronos-t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("amazon/chronos-t5-large")

OSError                                   Traceback (most recent call last)
Cell In[2], line 4
      1 # Load model directly
      2 from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
----> 4 tokenizer = AutoTokenizer.from_pretrained("amazon/chronos-t5-large")
      5 model = AutoModelForSeq2SeqLM.from_pretrained("amazon/chronos-t5-large")

File ~/anaconda3/envs/BS5E_data_science/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py:855, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    853 tokenizer_class_py, tokenizer_class_fast = TOKENIZER_MAPPING[type(config)]
    854 if tokenizer_class_fast and (use_fast or tokenizer_class_py is None):
--> 855     return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
    856 else:
    857     if tokenizer_class_py is not None:

File ~/anaconda3/envs/BS5E_data_science/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:2070, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, trust_remote_code, *init_inputs, **kwargs)
   2064     logger.info(
   2065         f"Can't load following files from cache: {unresolved_files} and cannot check if these "
   2066         "files are necessary for the tokenizer to operate."
   2067     )
   2069 if all(full_file_name is None for full_file_name in resolved_vocab_files.values()):
-> 2070     raise EnvironmentError(
   2071         f"Can't load tokenizer for '{pretrained_model_name_or_path}'. If you were trying to load it from "
   2072         "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "
   2073         f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory "
   2074         f"containing all relevant files for a {cls.__name__} tokenizer."
   2075     )
   2077 for file_id, file_path in vocab_files.items():
   2078     if file_id not in resolved_vocab_files:

OSError: Can't load tokenizer for 'amazon/chronos-t5-large'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'amazon/chronos-t5-large' is the correct path to a directory containing all relevant files for a T5TokenizerFast tokenizer.

I tried running this on my local computer with Linux Mint as well as on AWS SageMaker instances; all attempts failed.

abdulfatir commented 1 month ago

Hi @lampretl! Chronos doesn't have an interface in HuggingFace transformers yet. You will need to install the chronos package from this repo and then use ChronosPipeline. Please check the README in this repo for usage.
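
For readers landing on this thread, the suggestion above could look roughly like the sketch below. It assumes the chronos package has been installed from this repo (for example with `pip install git+https://github.com/amazon-science/chronos-forecasting.git`) and that `ChronosPipeline.from_pretrained` and `ChronosPipeline.predict` behave as the README described at the time; argument names may differ in later releases. The helper names (`load_chronos`, `sample_forecast`, `median_forecast`, `main`) and the toy history values are made up for illustration and are not part of the package.

```python
from statistics import median


def load_chronos(model_name="amazon/chronos-t5-large", device_map="cpu"):
    """Load a Chronos checkpoint through the repo's ChronosPipeline.

    This replaces transformers.pipeline("time-series-forecasting", ...),
    which does not exist. Imports are local so this module can be read
    and imported without torch/chronos installed.
    """
    import torch
    from chronos import ChronosPipeline  # provided by the chronos package

    return ChronosPipeline.from_pretrained(
        model_name,
        device_map=device_map,      # assumption: "cuda" if a GPU is available
        torch_dtype=torch.float32,  # assumption: the README example used bfloat16 on GPU
    )


def sample_forecast(pipeline, history, prediction_length=12):
    """Return sampled future paths as a list of lists (one list per sample)."""
    import torch

    context = torch.tensor(history, dtype=torch.float32)  # 1D series
    # Assumed output shape: [num_series, num_samples, prediction_length]
    samples = pipeline.predict(context, prediction_length=prediction_length)
    return samples[0].tolist()


def median_forecast(sample_paths):
    """Collapse sampled paths into a point forecast: the median at each step."""
    return [median(step_values) for step_values in zip(*sample_paths)]


def main():
    # Hypothetical toy history; replace with your own series.
    history = [112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0, 148.0]
    pipeline = load_chronos()  # downloads the checkpoint on first use
    paths = sample_forecast(pipeline, history, prediction_length=4)
    print(median_forecast(paths))
```

Calling `main()` runs the whole flow end to end. The large checkpoint named in this issue is a sizeable download, so passing a smaller Chronos checkpoint name to `load_chronos` may be more practical for a first try.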