
Empowering-Time-Series-Analysis-with-LLMs

This is the official repository for "Empowering Time Series Analysis with Large Language Models: A Survey" (to appear in the IJCAI-24 Survey Track).

This repository is actively maintained by Yushan Jiang and Zijie Pan from the UConn DSIS Group, led by Dr. Dongjin Song. As this research topic has recently gained significant popularity, with new articles emerging daily, we will update our repository and survey regularly.

If you find any papers we have missed, feel free to create a pull request, open an issue, or email Yushan & Zijie.

Please consider citing our survey paper if you find it helpful :), and feel free to share this repository with others!

Motivation and Contribution:

The rapid development of LLMs in natural language processing has unveiled unprecedented capabilities in sequential modeling and pattern recognition. It is natural to ask: How can LLMs be effectively leveraged to advance general-purpose time series analysis?

Our survey aims to answer this question through a thorough overview of the existing literature, as shown in Figure 1. We argue that LLMs can serve as a flexible and highly competent component in time series modeling:

1) The flexibility lies in the wide spectrum of available LLMs and the variety of ways they can be configured for time series analysis; we categorize the existing methods into five groups based on their methodology.

2) Regarding their competence, LLMs can be tailored to a wide range of real-world applications with domain-specific context; we discuss their application tasks and domains.

3) We discuss and highlight future directions that advance time series analysis with LLMs.


Figure 1: The Framework of Our Survey

Figure 2: Categorization of Component Design for Fine-tuning Time Series LLMs


Taxonomy of Time Series LLMs

Taxonomy via Methodology

To adopt LLMs for time series analysis, three primary methods are employed: directly querying LLMs, fine-tuning LLMs with tailored designs, and incorporating LLMs into time series models as a means of feature enhancement (integration). Specifically, three key components can be leveraged to fine-tune LLMs, as shown in Figure 2: the input time series is first tokenized into embeddings with a proper tokenization technique, and prompts can be adopted to further enhance the time series representation. As such, LLMs can better comprehend prompt-enhanced time series embeddings and be fine-tuned for downstream tasks with sophisticated strategies. A minimal sketch of this fine-tuning pipeline is given below.
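The sketch below illustrates the fine-tuning pipeline under simple assumptions: non-overlapping patches serve as time series tokens, learnable soft-prompt vectors act as parameterized prompts, the backbone is a frozen pre-trained GPT-2 from Hugging Face `transformers`, and a linear head produces the forecast. The class and parameter names (`PatchTSForecaster`, `patch_len`, `n_prompts`, `horizon`) are illustrative and do not correspond to any specific surveyed method.

```python
# A minimal sketch of the fine-tuning pipeline: patch tokenization,
# parameterized prompts, and a frozen LLM backbone with a trainable head.
import torch
import torch.nn as nn
from transformers import GPT2Model


class PatchTSForecaster(nn.Module):
    def __init__(self, patch_len=16, n_prompts=8, horizon=24, d_model=768):
        super().__init__()
        self.patch_len = patch_len
        # Tokenization: each non-overlapping patch of the series becomes one token embedding.
        self.tokenizer = nn.Linear(patch_len, d_model)
        # Parameterized (soft) prompts prepended to the time series tokens.
        self.prompts = nn.Parameter(torch.randn(n_prompts, d_model) * 0.02)
        # Pre-trained LLM backbone; kept fully frozen in this sketch.
        self.llm = GPT2Model.from_pretrained("gpt2")
        for p in self.llm.parameters():
            p.requires_grad = False
        # Trainable head mapping the last hidden state to the forecast horizon.
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x):
        # x: (batch, seq_len) univariate series; seq_len must be divisible by patch_len.
        b, t = x.shape
        patches = x.reshape(b, t // self.patch_len, self.patch_len)
        tokens = self.tokenizer(patches)                       # (b, n_patches, d_model)
        prompts = self.prompts.unsqueeze(0).expand(b, -1, -1)  # (b, n_prompts, d_model)
        inputs = torch.cat([prompts, tokens], dim=1)
        hidden = self.llm(inputs_embeds=inputs).last_hidden_state
        return self.head(hidden[:, -1])                        # (b, horizon)


model = PatchTSForecaster()
y_hat = model(torch.randn(4, 96))  # forecast 24 steps from 96 observed steps
print(y_hat.shape)                 # torch.Size([4, 24])
```

Many of the surveyed methods are variants of this recipe, differing in how much of the backbone is updated, e.g., unfreezing only layer normalization parameters or adding lightweight adapters rather than keeping the LLM fully frozen.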

Taxonomy via Task and Domain

Table 1: Taxonomy of Time Series LLMs - Methodology, Task and Domain

For the data type, TS denotes general time series, ST denotes spatial-temporal time series, and the prefix M- indicates multi-modal inputs. Q denotes directly querying the whole LLM for output, T denotes the design of time series tokenization, P indicates the design of textual or parameterized time series prompts, FT indicates whether the parameters of the LLM are updated (fine-tuned), and I indicates whether the LLM is integrated as part of the final model for downstream tasks. Code availability was assessed on May 20th, 2024.
| Method | Data | Domain | Task | Q | T | P | FT | I | LLM | Code |
|---|---|---|---|---|---|---|---|---|---|---|
| Time-LLM (ICLR 2024) | M-TS | General | Forecasting | | | | | | LLaMA, GPT-2 | |
| OFA (NeurIPS 2023) | TS | General | Forecasting, Classification, Imputation, Anomaly Detection | | | | | | GPT-2 | |
| TEMPO (ICLR 2024) | TS | General | Forecasting | | | | | | GPT-2 | |
| TEST (ICLR 2024) | M-TS | General | Forecasting, Classification | | | | | | BERT, GPT-2, ChatGLM, LLaMA2 | |
| LLM4TS, 2023 | TS | General | Forecasting | | | | | | GPT-2 | |
| PromptCast (IEEE TKDE 2023) | TS | General | Forecasting | | | | | | Bart, BERT, etc. | |
| LLMTIME (NeurIPS 2023) | TS | General | Forecasting | | | | | | GPT-3, LLaMA-2 | |
| UniTime (WWW 2024) | M-TS | General | Forecasting | | | | | | GPT-2 | |
| aLLM4TS (ICML 2024) | TS | General | Multiple | | | | | | GPT-2 | ✘ |
| GPT4MTS (AAAI 2024) | M-TS | General | Forecasting | | | | | | GPT-2 | ✘ |
| Chronos | TS | General | Forecasting | | | | | | GPT-2, T5 | |
| LAMP (NeurIPS 2023) | TS | General | Event Prediction | | | | | | GPT-3 & 3.5, LLaMA-2 | |
| Gunjal et al., 2023 | TS | General | Event Prediction | | | | | | GPT-3.5, Flan-T5, etc. | |
| Yu et al., 2023 | M-TS | Finance | Forecasting | | | | | | GPT-4, Open-LLaMA | |
| Lopez-Lira et al., 2023 | M-TS | Finance | Forecasting | | | | | | ChatGPT | |
| Xie et al., 2023 | M-TS | Finance | Classification | | | | | | ChatGPT | |
| Chen et al., 2023 (RobustFin@KDD 2023) | M-TS | Finance | Classification | | | | | | ChatGPT | - |
| METS, 2023 | M-TS | Healthcare | Classification | | | | | | ClinicalBERT | |
| Jiang et al., 2023 (Nature Comput. Sci.) | M-TS | Healthcare | Classification | | | | | | NYUTron (BERT) | |
| Liu et al., 2023 | M-TS | Healthcare | Forecasting, Classification | | | | | | PaLM | |
| AuxMobLCast (SIGSPATIAL 2022) | ST | Mobility | Forecasting | | | | | | BERT, RoBERTa, GPT-2, XLNet | |
| LLM-Mob, 2023 | ST | Mobility | Forecasting | | | | | | GPT-3.5 | |
| ST-LLM, 2024 | ST | Traffic | Forecasting | | | | | | LLaMA, GPT-2 | |
| GATGPT, 2023 | ST | Traffic | Imputation | | | | | | GPT-2 | |
| LA-GCN, 2023 | M-ST | Vision | Classification | | | | | | BERT | |

To be updated:

Citation

@misc{jiang2024empowering,
      title={Empowering Time Series Analysis with Large Language Models: A Survey}, 
      author={Yushan Jiang and Zijie Pan and Xikun Zhang and Sahil Garg and Anderson Schneider and Yuriy Nevmyvaka and Dongjin Song},
      year={2024},
      eprint={2402.03182},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}