Chiaraplizz / ST-TR

Spatial Temporal Transformer Network for Skeleton-Based Activity Recognition

Can't get the correct result of the kinetics dataset #10

Closed: Serendipiy2021 closed this issue 3 years ago

Serendipiy2021 commented 3 years ago

Hi Chiara, after reading your paper, I think this is a very meaningful study. I was trying to reproduce the results but only got 0.24% top-1 and 1.22% top-5. I used your pre-trained model (kinetics_spatial.pt) to test, with channel: 3 and double_channel: false set in kinetics-skeleton/train.yaml, and the code runs without errors. However, when I run the temporal transformer stream, the network outputs become smaller and smaller as batch_idx increases, and once batch_idx >= 76 the outputs are all 'nan', so I end up with an unsatisfactory result. The attachment contains my configuration file and main.py (main+train.zip). Can you give advice on how to solve this problem? Thanks a lot, best
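A minimal sketch like the one below can pinpoint the first batch whose outputs contain NaN; the names model, test_loader, and device are placeholders, not the actual variables in this repository's main.py:

import torch

def find_first_nan_batch(model, test_loader, device="cuda:0"):
    # Run the model in eval mode and report the first batch whose
    # outputs contain NaN values.
    model.eval()
    with torch.no_grad():
        for batch_idx, (data, label) in enumerate(test_loader):
            output = model(data.float().to(device))
            if torch.isnan(output).any():
                print(f"NaN outputs first appear at batch_idx={batch_idx}")
                return batch_idx
    print("No NaN outputs found")
    return None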

Chiaraplizz commented 3 years ago

Hi, thanks for your interest in our work. How many GPUs are you using? I found this happens when using only one GPU. I updated the code just 5 minutes ago to fix this issue; could you pull the latest version and try again? Let me know if this solves it.

Chiara
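
For context, a common way to make the same code path work on both a single GPU and multiple GPUs is to wrap the model in DataParallel only when more than one device is requested. The sketch below illustrates that generic pattern; it is not necessarily the exact change made in the ST-TR repository, and model and device_ids are placeholders.

import torch.nn as nn

def place_model(model, device_ids):
    # Move the model to the first requested GPU, and only wrap it in
    # DataParallel when more than one device is requested, so the
    # single-GPU path skips the wrapper entirely.
    output_device = device_ids[0]
    model = model.cuda(output_device)
    if len(device_ids) > 1:
        model = nn.DataParallel(model, device_ids=device_ids,
                                output_device=output_device)
    return model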

ejeon6 commented 3 years ago

Hello Chiaraplizz, I am trying to run inference with only S-TR (the spatial transformer stream) on the Kinetics dataset in a single-GPU setting, but I am facing the same issue Serendipiy2021 described. Could you fix the bug and share the train.yaml for Kinetics? Thank you in advance.

Chiaraplizz commented 3 years ago

Hi!

Is your config like this?

# feeder
feeder: st_gcn.feeder.Feeder_kinetics
train_feeder_args:
  random_choose: True
  random_move: True
  window_size: 150
  data_path: ./kinetics_data/train_data_joint.npy
  label_path: ./kinetics_data/train_label.pkl
test_feeder_args:
  data_path: ./kinetics_data/val_data_joint.npy
  label_path: ./kinetics_data/val_label.pkl

# model
model: st_gcn.net.ST_GCN
model_args:
  num_class: 400
  channel: 3
  window_size: 150
  num_person: 2
  num_point: 18
  dropout: 0
  graph: st_gcn.graph.Kinetics
  graph_args:
    labeling_mode: 'spatial'
  mask_learning: True
  use_data_bn: True
  attention: True
  only_attention: True
  tcn_attention: False
  data_normalization: True
  skip_conn: True
  weight_matrix: 2
  only_temporal_attention: False
  bn_flag: True
  attention_3: False
  kernel_temporal: 9
  more_channels: False
  double_channel: False
  drop_connect: True
  concat_original: True
  all_layers: False
  adjacency: False
  agcn: False
  dv: 0.25
  dk: 0.25
  Nh: 8
  n: 4
  dim_block1: 10
  dim_block2: 30
  dim_block3: 75
  relative: False
  visualization: False

  #optical_flow: True

# optim
weight_decay: 0.0001
base_lr: 0.1
step: [45, 55]

# training
device: [0, 1, 2, 3]
batch_size: 64
test_batch_size: 8
num_epoch: 65
nesterov: True
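
A config like the one above is typically consumed by dynamically importing the class named under model: and constructing it with model_args. The sketch below follows that ST-GCN-style pattern and also prints the settings discussed in this thread; the config path is an example, and the exact loading code in this repository may differ.

import importlib
import yaml

# Example path; point this at your actual kinetics-skeleton/train.yaml.
with open("kinetics-skeleton/train.yaml") as f:
    cfg = yaml.safe_load(f)

# Resolve the class named under `model:` (e.g. st_gcn.net.ST_GCN) and
# build it from `model_args`.
module_name, class_name = cfg["model"].rsplit(".", 1)
ModelClass = getattr(importlib.import_module(module_name), class_name)
model = ModelClass(**cfg["model_args"])

# Print the settings discussed in this thread to catch mismatches with
# the pre-trained checkpoint early.
for key in ("channel", "double_channel", "only_temporal_attention"):
    print(key, cfg["model_args"].get(key))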

ejeon6 commented 3 years ago

Hi. By referring to your comments, I solved the problem: it was caused by the "only_temporal_attention" and "double_channel" settings. Thank you!