This is the PyTorch reimplementation of DilatedSLR (IJCAI'18). For technical details, please refer to: Dilated Convolutional Network with Iterative Optimization for Continuous Sign Language Recognition [Paper]
The results of this implementation may differ slightly from those reported in the paper, which were obtained with the TensorFlow implementation.
If it helps your research, please consider citing the following paper in your publications:
@inproceedings{pu2018dilated,
title={Dilated Convolutional Network with Iterative Optimization for Continuous Sign Language Recognition},
author={Pu, Junfu and Zhou, Wengang and Li, Houqiang},
booktitle={International Joint Conference on Artificial Intelligence (IJCAI)},
year={2018}
}
All required Python packages are listed in requirements.txt. You can install them with
pip install -r requirements.txt
Please install ctcdecode following the official instructions. If the installation fails because of network connection problems, consider installing it with
git clone --recursive https://github.com/Jevin754/ctcdecode.git
cd ctcdecode
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple .
The official evaluation script provided by RWTH-PHOENIX-Weather requires the SCTK toolkit to compute the evaluation metric. Install the toolkit with the following commands:
git clone --recursive https://github.com/usnistgov/SCTK.git
cd SCTK
make config && make all && make check && make install && make doc
# add the SCTK bin directory (which contains sclite) to the PATH environment variable
export PATH=$PATH:SCTK_PATH/bin
For more details about SCTK, please refer to the SCTK repository.
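Before running the evaluation, it can help to confirm that the `sclite` binary is actually reachable through `PATH`. The following is a minimal sketch using only the Python standard library; the helper name `find_tool` is hypothetical and not part of this repository.

```python
import shutil


def find_tool(name="sclite"):
    """Return the absolute path of `name` if it is found on PATH, else None.

    `find_tool` is a hypothetical helper for illustration only.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path is None:
        print("sclite was not found on PATH; check the SCTK installation")
    else:
        print(f"sclite found at {path}")
```

If the lookup returns `None`, the directory that holds the `sclite` executable is most likely missing from `PATH`.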
Configuring the environment is complicated, so running with Docker is recommended. We provide a Dockerfile and a pre-built image to help you get started.
To build the Docker image from source, run the following commands:
git clone --recursive https://github.com/ustc-slr/DilatedSLR.git
cd DilatedSLR
docker build --no-cache -f ./Dockerfile -t ustcslr .
To run with the pre-built Docker image, install Docker and start a new container from the image via
docker run --runtime=nvidia --rm -it ustcslr/ustcslr:latest
All experiments are performed on RWTH-PHOENIX-Weather-2014 (Multisigner). The image/video features are extracted with C3D. We have released the C3D features. Please download the features with the following link:
Extract features via
tar -zxvf c3d_res_phoenix_body_iter5_120k.tar.gz
The training and testing functions are both included in main.py. The parameters are as follows:
python main.py -h
usage: main.py [-h] [-t TASK] [-g GPU] [-dw DATA_WORKER] [-fd FEATURE_DIM]
[-corp_dir CORPUS_DIR] [-corp_tr CORPUS_TRAIN]
[-corp_te CORPUS_TEST] [-corp_de CORPUS_DEV] [-vp VIDEO_PATH]
[-op OPTIMIZER] [-lr LEARNING_RATE] [-wd WEIGHT_DECAY]
[-mt MOMENTUM] [-nepoch NUM_EPOCH] [-us UPDATE_STEP]
[-upm UPDATE_PARAM] [-db DEBUG] [-lg_d LOG_DIR]
[-bs BATCH_SIZE] [-ckpt CHECK_POINT] [-bwd BEAM_WIDTH]
[-vbs VALID_BATCH_SIZE] [-evalset {test,dev}]
optional arguments:
-h, --help show this help message and exit
-t TASK, --task TASK
-g GPU, --gpu GPU
-dw DATA_WORKER, --data_worker DATA_WORKER
-fd FEATURE_DIM, --feature_dim FEATURE_DIM
-corp_dir CORPUS_DIR, --corpus_dir CORPUS_DIR
-corp_tr CORPUS_TRAIN, --corpus_train CORPUS_TRAIN
-corp_te CORPUS_TEST, --corpus_test CORPUS_TEST
-corp_de CORPUS_DEV, --corpus_dev CORPUS_DEV
-vp VIDEO_PATH, --video_path VIDEO_PATH
-op OPTIMIZER, --optimizer OPTIMIZER
-lr LEARNING_RATE, --learning_rate LEARNING_RATE
-wd WEIGHT_DECAY, --weight_decay WEIGHT_DECAY
-mt MOMENTUM, --momentum MOMENTUM
-nepoch NUM_EPOCH, --num_epoch NUM_EPOCH
-us UPDATE_STEP, --update_step UPDATE_STEP
-upm UPDATE_PARAM, --update_param UPDATE_PARAM
-db DEBUG, --DEBUG DEBUG
-lg_d LOG_DIR, --log_dir LOG_DIR
-bs BATCH_SIZE, --batch_size BATCH_SIZE
-ckpt CHECK_POINT, --check_point CHECK_POINT
-bwd BEAM_WIDTH, --beam_width BEAM_WIDTH
-vbs VALID_BATCH_SIZE, --valid_batch_size VALID_BATCH_SIZE
-evalset {test,dev}, --eval_set {test,dev}
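The help text above has the shape produced by Python's `argparse`. As an illustration of how a few of these flags could be declared, here is a minimal sketch. The flag names and the `{test,dev}` choices are taken from the usage message, while every type and default value below is an assumption and may not match what main.py actually uses.

```python
import argparse


def build_parser():
    """Sketch of a parser covering a subset of the flags listed above.

    Types and defaults are assumptions for illustration only.
    """
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-t", "--task", type=str, default="train")        # assumed default
    parser.add_argument("-g", "--gpu", type=int, default=0)               # assumed type/default
    parser.add_argument("-bs", "--batch_size", type=int, default=20)      # assumed default
    parser.add_argument("-lr", "--learning_rate", type=float, default=1e-4)
    parser.add_argument("-vp", "--video_path", type=str, default=None)
    parser.add_argument("-ckpt", "--check_point", type=str, default=None)
    parser.add_argument("-evalset", "--eval_set", choices=["test", "dev"], default="test")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv for demonstration.
    args = build_parser().parse_args(["--task", "test", "-bs", "1", "--eval_set", "dev"])
    print(args.task, args.batch_size, args.eval_set)
```

Both the short form (`-bs 1`) and the long form (`--batch_size 1`) map to the same attribute, `args.batch_size`.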
To train on RWTH-PHOENIX-Weather-2014 with C3D features, run
python main.py --task train \
    --batch_size 20 \
    --log_dir ./log/reimp \
    --learning_rate 1e-4 \
    --data_worker 8 \
    --video_path C3D_FEATURE_DIR \
    --gpu GPU_ID
Configure your own settings by modifying the parameters if necessary.
Download the pretrained model weights with the following link:
To evaluate the model on the RWTH-PHOENIX-Weather-2014 development set (Dev) or test set (Test), set --eval_set to dev or test and run
python main.py --task test \
    --batch_size 1 \
    --check_point MODEL_WEIGHTS_FILE \
    --eval_set test \
    --data_worker 8 \
    --video_path C3D_FEATURE_DIR \
    --gpu GPU_ID
DilatedSLR | Dev WER (%) | Dev Del (%) | Dev Ins (%) | Test WER (%) | Test Del (%) | Test Ins (%)
---|---|---|---|---|---|---
w/o post-processing | 37.4 | 8.6 | 4.9 | 37.1 | 8.5 | 4.3
w/ post-processing | 32.2 | 11.0 | 4.4 | 31.9 | 11.0 | 3.7
The official evaluation script merges some sign words that have similar meanings but different labels. "Post-processing" refers to the results obtained with this merging.
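To make the WER, Del, and Ins columns concrete, the sketch below computes the word error rate from an edit-distance alignment and counts substitutions, deletions, and insertions. It is only an illustration, not a replacement for the official sclite-based script: tie-breaking between equal-cost alignments may differ, so the Del/Ins breakdown can differ from sclite's. The `merge_labels` helper and the labels `SIGN_A`, `SIGN_A_VARIANT`, and `SIGN_B` are hypothetical and only mimic the idea of merging similar sign words.

```python
def wer_counts(reference, hypothesis):
    """Return (wer_percent, substitutions, deletions, insertions).

    Both arguments are lists of gloss labels (strings).
    """
    n, m = len(reference), len(hypothesis)
    # dp[i][j] = (errors, subs, dels, ins) for reference[:i] vs hypothesis[:j]
    dp = [[None] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, m + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis word inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            e, s, d, k = dp[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diag = (e, s, d, k)          # match, no cost
            else:
                diag = (e + 1, s + 1, d, k)  # substitution
            e, s, d, k = dp[i - 1][j]
            deletion = (e + 1, s, d + 1, k)
            e, s, d, k = dp[i][j - 1]
            insertion = (e + 1, s, d, k + 1)
            dp[i][j] = min(diag, deletion, insertion, key=lambda t: t[0])
    errors, subs, dels, ins = dp[n][m]
    return 100.0 * errors / max(n, 1), subs, dels, ins


def merge_labels(glosses, mapping):
    """Map each gloss through `mapping` (hypothetical post-processing step)."""
    return [mapping.get(g, g) for g in glosses]


if __name__ == "__main__":
    ref = ["SIGN_A", "SIGN_B"]
    hyp = ["SIGN_A_VARIANT", "SIGN_B"]
    mapping = {"SIGN_A_VARIANT": "SIGN_A"}  # made-up merge rule
    print(wer_counts(ref, hyp))
    print(wer_counts(merge_labels(ref, mapping), merge_labels(hyp, mapping)))
```

In this toy case the variant label counts as a substitution before merging and as a match afterwards, which is the kind of effect the post-processing row in the table reflects.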
If you have any questions, please feel free to contact pjh@mail.ustc.edu.cn.