solicucu / D3G

10 stars 1 forks source link

[ICCV2023] D3G:Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation

d3g

Datasets

Please download the video features to directory dataset as follows.

dataset
├── Charades_STA
│   ├── vgg_rgb_features.hdf5
│   ├── glance_charades_train.json
│   ├── charades_test.json
├── ActivityNet
│   ├── sub_activitynet_v1-3.c3d.hdf5
│   ├── glance_train.json
│   ├── val.json
│   ├── test.json
├── TACoS
│   ├── tall_c3d_features.hdf5
│   ├── glance_train.json
│   ├── val.json
│   ├── test.json

Main Results

Charades-STA Dataset

Method Rank1@0.5 Rank1@0.7 Rank5@0.5 Rank5@0.7
ViGA 36.56 16.10 48.90 25.86
D3G 41.64 19.60 79.25 49.30

ActivityNet Captions Dataset

Method Rank1@0.3 Rank1@0.5 Rank1@0.7 Rank5@0.3 Rank5@0.5 Rank5@0.7
ViGA 59.78 35.39 16.25 72.19 53.19 32.69
D3G 58.25 36.68 18.54 87.84 74.21 52.47

TACoS Dataset

Method Rank1@0.3 Rank1@0.5 Rank1@0.7 Rank5@0.3 Rank5@0.5 Rank5@0.7
ViGA 20.82 9.52 3.10 27.92 15.35 6.10
D3G 26.99 12.62 4.77 54.71 31.59 12.10

Training & Inference

cd scipts 
### charades 
sh charades_train.sh # train 
sh charades_test.sh # test 

### activitynet
sh anet_train.sh # train 
sh anet_test.sh # test 

### tacos 
sh tacos_train.sh  # train 
sh tacos_test.sh # test