InwoongLee / TS-LSTM

Skeleton-based Action Recognition using TS-LSTM model

3D Human Behavior Understanding using Temporal Sliding LSTM Networks

This is a TensorFlow implementation of the Ensemble TS-LSTM v1, v2 and v3 models from the paper Ensemble Deep Learning for Skeleton-based Action Recognition using Temporal Sliding LSTM networks and the paper 3D Human Behavior Understanding using Generalized TS-LSTM Networks. A video for the former paper is available on Naver D2 or YouTube.

Model architecture

This is also a TensorFlow implementation of the Generalized Temporal Sliding LSTM (TS-LSTM) models from the paper 3D Human Behavior Understanding using Generalized TS-LSTM Networks. The generalized TS-LSTM networks consist of multiple TS-LSTM modules and are controlled by hyper-parameters such as the LSTM window sizes, temporal strides and motion feature offsets of the TS-LSTM modules.
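To make the three hyper-parameters concrete, the following is a minimal sketch in plain Python, not the repository's TensorFlow code. It assumes that a motion feature is the difference between a frame and the frame `offset` steps earlier, and that each LSTM sees windows of `window` frames taken every `stride` frames. The function names `motion_features` and `sliding_windows` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Frame = List[float]  # one skeleton frame flattened to a list of coordinates


def motion_features(frames: Sequence[Frame], offset: int) -> List[Frame]:
    """Assumed definition: frame[t] - frame[t - offset] for every valid t."""
    if offset < 1:
        raise ValueError("offset must be >= 1")
    return [
        [cur - prev for cur, prev in zip(frames[t], frames[t - offset])]
        for t in range(offset, len(frames))
    ]


def sliding_windows(
    seq: Sequence[Frame], window: int, stride: int
) -> List[List[Frame]]:
    """Cut `seq` into windows of `window` frames, starting every `stride` frames.

    Trailing frames that do not fill a whole window are dropped.
    """
    if window < 1 or stride < 1:
        raise ValueError("window and stride must be >= 1")
    last_start = len(seq) - window
    return [list(seq[s:s + window]) for s in range(0, last_start + 1, stride)]


if __name__ == "__main__":
    # Toy sequence: 10 frames with 2 coordinates each.
    frames = [[float(t), 2.0 * t] for t in range(10)]
    motion = motion_features(frames, offset=2)
    windows = sliding_windows(motion, window=4, stride=2)
    print(len(motion), len(windows))
```

In this reading, a larger offset captures slower motion, while the window size and stride determine how much temporal context each module sees and how much consecutive windows overlap. The exact definitions used by the models are given in the papers.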

Model architecture

Requirements (Ubuntu except for Matlab)

Dataset

NTU RGB+D Action Recognition Dataset

We found a few problems with the skeleton data in the NTU RGB+D dataset.

To use normal skeleton data for correct action recognition, we refined the dataset by removing trash skeletons and determining the indexes of the primary and secondary actors.

We kept only normal skeleton sequences and provide the resulting actor information (Actions_01-49.txt, Actions_50-60.txt) with frame numbers. The skeleton sequences listed in 'samples_with_missing_skeletons.txt' are also removed by this process. We also provide Matlab code that extracts csv files from the txt files provided by ROSE Lab. The scripts (make_csv_action_0149.m, make_csv_action_5060.m) cover two cases: Actions 1-49 and Actions 50-60.
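The exclusion step described above can be sketched in Python as follows. This is an illustration only and is separate from the Matlab scripts in this repository. It assumes that 'samples_with_missing_skeletons.txt' lists one sample name per line and that sample files are named after the sample, with an optional extension. The helper names `parse_excluded` and `keep_normal` are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Set


def parse_excluded(text: str) -> Set[str]:
    """Parse the exclusion list, assuming one sample name per line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def keep_normal(sample_names: Iterable[str], excluded: Set[str]) -> List[str]:
    """Drop every sample whose name (without extension) is in `excluded`."""
    return [name for name in sample_names if Path(name).stem not in excluded]


if __name__ == "__main__":
    # In practice the text would come from the exclusion file, e.g.
    # Path("samples_with_missing_skeletons.txt").read_text()
    excluded = parse_excluded("S001C002P005R002A008\nS001C002P006R001A008\n")
    samples = [
        "S001C001P001R001A001.skeleton",
        "S001C002P005R002A008.skeleton",
    ]
    print(keep_normal(samples, excluded))
```

Matching on the name without its extension keeps the filter usable for both the original txt files and the extracted csv files, provided both are named after the sample.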

Citation

To cite our code or paper, please use these bibtex records:

@InProceedings{Lee_2017_ICCV,
  author    = {Lee, Inwoong and Kim, Doyoung and Kang, Seoungyoon and Lee, Sanghoon},
  title     = {Ensemble Deep Learning for Skeleton-Based Action Recognition Using Temporal Sliding LSTM Networks},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  month     = {Oct},
  year      = {2017}
}
@article{DBLP:journals/tmm/LeeKL21,
  author    = {Inwoong Lee and
               Doyoung Kim and
               Sanghoon Lee},
  title     = {3-D Human Behavior Understanding Using Generalized {TS-LSTM} Networks},
  journal   = {{IEEE} Trans. Multim.},
  volume    = {23},
  pages     = {415--428},
  year      = {2021},
  url       = {https://doi.org/10.1109/TMM.2020.2978637},
  doi       = {10.1109/TMM.2020.2978637},
  timestamp = {Tue, 02 Mar 2021 11:25:28 +0100},
  biburl    = {https://dblp.org/rec/journals/tmm/LeeKL21.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}