
Empowering-Time-Series-Analysis-with-LLMs

This is the official repository for "Empowering Time Series Analysis with Large Language Models: A Survey" (to appear in the IJCAI-24 Survey Track).

This repository is actively maintained by Yushan Jiang and Zijie Pan from the UConn DSIS Group, led by Dr. Dongjin Song. As this research topic has recently gained significant popularity, with new articles emerging daily, we will update our repository and survey regularly.

If you find any papers we have missed, feel free to create a pull request, open an issue, or email Yushan & Zijie.

Please consider citing our survey paper if you find it helpful :), and feel free to share this repository with others!

Motivation and Contribution:

The rapid development of LLMs in natural language processing has unveiled unprecedented capabilities in sequential modeling and pattern recognition. It is natural to ask: How can LLMs be effectively leveraged to advance general-purpose time series analysis?

Our survey aims to answer this question through a thorough overview of the existing literature, as shown in Figure 1. We argue that LLMs can serve as a flexible and highly competent component in time series modeling:

1) The flexibility lies in the wide spectrum of available LLMs and the variety of ways they can be configured for time series analysis; we categorize the existing methods into five groups based on their methodology.

2) Regarding their competence, LLMs can be tailored to a wide range of real-world applications with domain-specific context; we discuss their application tasks and domains.

3) We discuss and highlight future directions that advance time series analysis with LLMs.


Figure 1: The Framework of Our Survey

Figure 2: Categorization of Component Design for Fine-tuning Time Series LLMs


Taxonomy of Time Series LLMs

Taxonomy via Methodology

To adopt LLMs for time series analysis, three primary methods are employed: directly querying LLMs, fine-tuning LLMs with tailored designs, and incorporating LLMs into time series models as a means of feature enhancement (integration). Specifically, three key components can be leveraged to fine-tune LLMs, as shown in Figure 2: the input time series is first tokenized into embeddings with a proper tokenization technique, and prompts can be adopted to further enhance the time series representation. As such, LLMs can better comprehend prompt-enhanced time series embeddings and be fine-tuned for downstream tasks with sophisticated strategies. A minimal sketch of this fine-tuning pipeline is given below.
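The sketch below illustrates the fine-tuning pipeline under simple assumptions: non-overlapping patches serve as time series tokens, learnable soft-prompt vectors act as parameterized prompts, the backbone is a frozen pre-trained GPT-2 from Hugging Face `transformers`, and a linear head produces the forecast. The class and parameter names (`PatchTSForecaster`, `patch_len`, `n_prompts`, `horizon`) are illustrative and do not correspond to any specific surveyed method.

```python
# A minimal sketch of the fine-tuning pipeline: patch tokenization,
# parameterized prompts, and a frozen LLM backbone with a trainable head.
import torch
import torch.nn as nn
from transformers import GPT2Model


class PatchTSForecaster(nn.Module):
    def __init__(self, patch_len=16, n_prompts=8, horizon=24, d_model=768):
        super().__init__()
        self.patch_len = patch_len
        # Tokenization: each non-overlapping patch of the series becomes one token embedding.
        self.tokenizer = nn.Linear(patch_len, d_model)
        # Parameterized (soft) prompts prepended to the time series tokens.
        self.prompts = nn.Parameter(torch.randn(n_prompts, d_model) * 0.02)
        # Pre-trained LLM backbone; kept fully frozen in this sketch.
        self.llm = GPT2Model.from_pretrained("gpt2")
        for p in self.llm.parameters():
            p.requires_grad = False
        # Trainable head mapping the last hidden state to the forecast horizon.
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x):
        # x: (batch, seq_len) univariate series; seq_len must be divisible by patch_len.
        b, t = x.shape
        patches = x.reshape(b, t // self.patch_len, self.patch_len)
        tokens = self.tokenizer(patches)                       # (b, n_patches, d_model)
        prompts = self.prompts.unsqueeze(0).expand(b, -1, -1)  # (b, n_prompts, d_model)
        inputs = torch.cat([prompts, tokens], dim=1)
        hidden = self.llm(inputs_embeds=inputs).last_hidden_state
        return self.head(hidden[:, -1])                        # (b, horizon)


model = PatchTSForecaster()
y_hat = model(torch.randn(4, 96))  # forecast 24 steps from 96 observed steps
print(y_hat.shape)                 # torch.Size([4, 24])
```

Many of the surveyed methods are variants of this recipe, differing in how much of the backbone is updated, e.g., unfreezing only layer normalization parameters or adding lightweight adapters rather than keeping the LLM fully frozen.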

Taxonomy via Task and Domain

Table 1: Taxonomy of Time Series LLMs - Methodology, Task and Domain

For the data type, TS denotes general time series, ST denotes spatial-temporal time series, and the prefix M- indicates multi-modal inputs. Q denotes directly querying the whole LLM for output, T denotes the design of time series tokenization, P indicates the design of textual or parameterized time series prompts, FT indicates whether the parameters of the LLM are updated (fine-tuned), and I indicates whether the LLM is integrated as part of the final model for downstream tasks. Code availability was assessed on May 20th, 2024.
| Method | Data | Domain | Task | Q | T | P | FT | I | LLM | Code |
|---|---|---|---|---|---|---|---|---|---|---|
| Time-LLM (ICLR 2024) | M-TS | General | Forecasting | | | | | | LLaMA, GPT-2 | |
| OFA (NeurIPS 2023) | TS | General | Forecasting, Classification, Imputation, Anomaly Detection | | | | | | GPT-2 | |
| TEMPO (ICLR 2024) | TS | General | Forecasting | | | | | | GPT-2 | |
| TEST (ICLR 2024) | M-TS | General | Forecasting, Classification | | | | | | BERT, GPT-2, ChatGLM, LLaMA2 | |
| LLM4TS, 2023 | TS | General | Forecasting | | | | | | GPT-2 | |
| PromptCast (IEEE TKDE 2023) | TS | General | Forecasting | | | | | | Bart, BERT, etc. | |
| LLMTIME (NeurIPS 2023) | TS | General | Forecasting | | | | | | GPT-3, LLaMA-2 | |
| UniTime (WWW 2024) | M-TS | General | Forecasting | | | | | | GPT-2 | |
| aLLM4TS (ICML 2024) | TS | General | Multiple | | | | | | GPT-2 | ✘ |
| GPT4MTS (AAAI 2024) | M-TS | General | Forecasting | | | | | | GPT-2 | ✘ |
| Chronos | TS | General | Forecasting | | | | | | GPT-2, T5 | |
| LAMP (NeurIPS 2023) | TS | General | Event Prediction | | | | | | GPT-3 & 3.5, LLaMA-2 | |
| Gunjal et al., 2023 | TS | General | Event Prediction | | | | | | GPT-3.5, Flan-T5, etc. | |
| Yu et al., 2023 | M-TS | Finance | Forecasting | | | | | | GPT-4, Open-LLaMA | |
| Lopez-Lira et al., 2023 | M-TS | Finance | Forecasting | | | | | | ChatGPT | |
| Xie et al., 2023 | M-TS | Finance | Classification | | | | | | ChatGPT | |
| Chen et al., 2023 (RobustFin@KDD 2023) | M-TS | Finance | Classification | | | | | | ChatGPT | - |
| METS, 2023 | M-TS | Healthcare | Classification | | | | | | ClinicalBERT | |
| Jiang et al., 2023 (Nature Comput. Sci.) | M-TS | Healthcare | Classification | | | | | | NYUTron (BERT) | |
| Liu et al., 2023 | M-TS | Healthcare | Forecasting, Classification | | | | | | PaLM | |
| AuxMobLCast (SIGSPATIAL 2022) | ST | Mobility | Forecasting | | | | | | BERT, RoBERTa, GPT-2, XLNet | |
| LLM-Mob, 2023 | ST | Mobility | Forecasting | | | | | | GPT-3.5 | |
| ST-LLM, 2024 | ST | Traffic | Forecasting | | | | | | LLaMA, GPT-2 | |
| GATGPT, 2023 | ST | Traffic | Imputation | | | | | | GPT-2 | |
| LA-GCN, 2023 | M-ST | Vision | Classification | | | | | | BERT | |

To be updated:

Citation

@misc{jiang2024empowering,
      title={Empowering Time Series Analysis with Large Language Models: A Survey}, 
      author={Yushan Jiang and Zijie Pan and Xikun Zhang and Sahil Garg and Anderson Schneider and Yuriy Nevmyvaka and Dongjin Song},
      year={2024},
      eprint={2402.03182},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}