Chiaraplizz / ST-TR

Spatial Temporal Transformer Network for Skeleton-Based Activity Recognition
MIT License

Reproducibility Issue of S-TR and T-TR on NTU-60 x-sub #30

Open · ZhouYuxuanYX opened this issue 3 years ago

ZhouYuxuanYX commented 3 years ago

I have followed the instructions on GitHub and used the following configurations for S-TR and T-TR respectively, but I only got 83% top-1 accuracy for S-TR and 58% top-1 accuracy for T-TR (much lower than the numbers listed in your paper, especially for T-TR).

I don't know whether dim_block1-3 should instead be set to 64, 128, 256. However, even with the larger block dims, T-TR still performs much worse than S-TR. I am looking forward to hearing from you.

For S-TR:

```yaml
# feeder
feeder: st_gcn.feeder.Feeder
feeder_augmented: st_gcn.feeder.FeederAugmented
train_feeder_args:
  data_path: /ssd2/zhouyuxuan/Dataset/NTU-RGB-D/xsub/train_data.npy
  label_path: /ssd2/zhouyuxuan/Dataset/NTU-RGB-D/xsub/train_label.pkl
  random_choose: False
  random_shift: False
  random_move: False
  window_size: -1
  normalization: False
  mirroring: False
test_feeder_args:
  data_path: /ssd2/zhouyuxuan/Dataset/NTU-RGB-D/xsub/val_data.npy
  label_path: /ssd2/zhouyuxuan/Dataset/NTU-RGB-D/xsub/val_label.pkl

# model
model: st_gcn.net.ST_GCN
training: True
model_args:
  num_class: 60
  channel: 3
  window_size: 300
  num_point: 25
  num_person: 2
  mask_learning: True
  use_data_bn: True
  attention: True
  only_attention: True
  tcn_attention: False
  data_normalization: True
  skip_conn: True
  weight_matrix: 2
  only_temporal_attention: True
  bn_flag: True
  attention_3: False
  kernel_temporal: 9
  more_channels: False
  double_channel: False
  drop_connect: True
  concat_original: True
  all_layers: False
  adjacency: False
  agcn: False
  dv: 0.25
  dk: 0.25
  Nh: 8
  n: 4
  dim_block1: 10
  dim_block2: 30
  dim_block3: 75
  relative: False
  graph: st_gcn.graph.NTU_RGB_D
  visualization: False
  graph_args:
    labeling_mode: 'spatial'
optical_flow: True

# optim (scheduler -- 0: old one, 1: new one)
scheduler: 1
weight_decay: 0.0001
base_lr: 0.1
step: [60, 90]

# training
device: [0, 1, 2, 3]
batch_size: 32
test_batch_size: 64
num_epoch: 120
nesterov: True
```
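The optim settings above amount to a step-decay schedule: the learning rate starts at `base_lr` and is multiplied by 0.1 at each milestone in `step` (the usual MultiStepLR semantics). The helper below is an illustrative pure-Python sketch of that rule, not code from the repository:

```python
def lr_at_epoch(epoch, base_lr=0.1, milestones=(60, 90), gamma=0.1):
    """Step decay: multiply base_lr by gamma once for each milestone passed."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed

for e in (0, 59, 60, 90, 119):
    print(e, lr_at_epoch(e))
```

So with `step: [60, 90]` the run trains at 0.1 for 60 epochs, 0.01 until epoch 90, and 0.001 for the rest of the 120 epochs.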

For T-TR:

```yaml
# feeder
feeder: st_gcn.feeder.Feeder
feeder_augmented: st_gcn.feeder.FeederAugmented
train_feeder_args:
  data_path: /ssd2/zhouyuxuan/Dataset/NTU-RGB-D/xsub/train_data.npy
  label_path: /ssd2/zhouyuxuan/Dataset/NTU-RGB-D/xsub/train_label.pkl
  random_choose: False
  random_shift: False
  random_move: False
  window_size: -1
  normalization: False
  mirroring: False
test_feeder_args:
  data_path: /ssd2/zhouyuxuan/Dataset/NTU-RGB-D/xsub/val_data.npy
  label_path: /ssd2/zhouyuxuan/Dataset/NTU-RGB-D/xsub/val_label.pkl

# model
model: st_gcn.net.ST_GCN
training: True
model_args:
  num_class: 60
  channel: 3
  window_size: 300
  num_point: 25
  num_person: 2
  mask_learning: True
  use_data_bn: True
  attention: False
  only_attention: True
  tcn_attention: True
  data_normalization: True
  skip_conn: True
  weight_matrix: 2
  only_temporal_attention: True
  bn_flag: True
  attention_3: False
  kernel_temporal: 9
  more_channels: False
  double_channel: False
  drop_connect: True
  concat_original: True
  all_layers: False
  adjacency: False
  agcn: False
  dv: 0.25
  dk: 0.25
  Nh: 8
  n: 4
  dim_block1: 10
  dim_block2: 30
  dim_block3: 75
  relative: False
  graph: st_gcn.graph.NTU_RGB_D
  visualization: False
  graph_args:
    labeling_mode: 'spatial'
optical_flow: True

# optim (scheduler -- 0: old one, 1: new one)
scheduler: 1
weight_decay: 0.0001
base_lr: 0.1
step: [60, 90]

# training
device: [0, 1, 2, 3]
batch_size: 32
test_batch_size: 64
num_epoch: 120
nesterov: True
```

Chiaraplizz commented 3 years ago

Hi Zhou,

Did you try to load the pre-trained models to check that everything matches?

Chiara

ZhouYuxuanYX commented 3 years ago


Hi Chiara,

Thank you for your quick reply. Could you just send me the exact configuration files you used for your tests?

I found that changing the block dims to 64, 128 and 256 helps a little, and setting only_temporal_attention to False also helps a lot. However, the result still does not match the number reported for T-TR.

X-Sub top-1 accuracy (w/o bones):

| Method | Config provided | Large block dim | only_temp_attn False |
| --- | --- | --- | --- |
| S-TR | 83 | - | - |
| T-TR | 58 | 67 | 77 |

I didn't load the pre-trained models because I want a correct configuration to reproduce the results, in order to try something new on top of your proposed model. So I don't think it makes much sense for me to use the pretrained model directly without knowing the correct configuration.

Following your hint, I tried to load the pretrained model just now and it did yield size-mismatch errors. It seems that the model in my configuration is still too small, but it is not straightforward for me to figure out the correct sizes for the different blocks. I think it is important to provide the correct configuration; otherwise we cannot even use the pretrained model.
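A quick way to see where a checkpoint and a freshly built model disagree is to diff the shapes of their state dicts before loading. This is a minimal, hypothetical sketch: the plain dicts of tuples stand in for `model.state_dict()` and the loaded checkpoint, and the key name is only illustrative:

```python
def shape_mismatches(model_sd, ckpt_sd):
    """Return {key: (model_shape, ckpt_shape)} for shared keys whose shapes differ."""
    common = model_sd.keys() & ckpt_sd.keys()
    return {k: (model_sd[k], ckpt_sd[k])
            for k in sorted(common) if model_sd[k] != ckpt_sd[k]}

# Illustrative shapes only (tuples stand in for tensor.shape):
model_sd = {"backbone.3.tcn1.qkv_conv.weight": (192, 256, 1, 2)}
ckpt_sd = {"backbone.3.tcn1.qkv_conv.weight": (384, 256, 1, 2)}
print(shape_mismatches(model_sd, ckpt_sd))
```

Listing all mismatches at once, instead of reading one load error at a time, makes it easier to spot which config flag scales which layers.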

Best regards

ZhouYuxuanYX commented 3 years ago

I tried for 1 hour and could not figure out a correct configuration myself. I don't think your hint helps much.

Chiaraplizz commented 3 years ago

Dear Zhou,

This is the configuration I used to obtain 87.3 on T-TR with X-Sub:

```yaml
# feeder
feeder: st_gcn.feeder.Feeder
feeder_augmented: st_gcn.feeder.FeederAugmented
train_feeder_args:
  data_path: ./Output_skeletons_without_missing_skeletons/xsub/train_data_joint_bones.npy
  label_path: ./Output_skeletons_without_missing_skeletons/xsub/train_label_filtered.pkl
  random_choose: False
  random_shift: False
  random_move: False
  window_size: -1
  normalization: False
  mirroring: False
test_feeder_args:
  data_path: ./Output_skeletons_without_missing_skeletons/xsub/val_data_joint_bones.npy
  label_path: ./Output_skeletons_without_missing_skeletons/xsub/val_label_filtered.pkl

# model
model: st_gcn.net.ST_GCN
training: True
model_args:
  num_class: 60
  channel: 6
  window_size: 300
  num_point: 25
  num_person: 2
  mask_learning: True
  use_data_bn: True
  attention: False
  only_attention: True
  tcn_attention: True
  data_normalization: True
  skip_conn: True
  weight_matrix: 2
  only_temporal_attention: True
  bn_flag: True
  attention_3: False
  kernel_temporal: 9
  more_channels: False
  double_channel: True
  drop_connect: True
  concat_original: True
  all_layers: False
  adjacency: False
  agcn: False
  dv: 0.25
  dk: 0.25
  Nh: 8
  n: 4
  dim_block1: 10
  dim_block2: 30
  dim_block3: 75
  relative: False
  graph: st_gcn.graph.NTU_RGB_D
  visualization: False
  graph_args:
    labeling_mode: 'spatial'
optical_flow: True

# optim (scheduler -- 0: old one, 1: new one)
scheduler: 1
weight_decay: 0.0001
base_lr: 0.1
step: [60, 90]

# training
device: [0, 1, 2, 3]
batch_size: 32
test_batch_size: 8
num_epoch: 120
nesterov: True
```

Hope this helps.

Chiara

ZhouYuxuanYX commented 3 years ago


Hi Chiara,

Thank you for your reply.

I was indeed asking for the configuration for training T-TR w/o bones. Except for double_channel and channel (which change when bones are used), this is exactly the setup I used in my initial experiment, which only reaches 58% accuracy.

In addition, if I use this configuration to retrain from the checkpoint ntu60_xsub_bones_temporal.pt, it raises the following error:

```
Original Traceback (most recent call last):
  File "/home/zhouyuxuan.zyx/software/anaconda3/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/zhouyuxuan.zyx/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/ssd2/zhouyuxuan/Repositories/ST-TR/codes/st_gcn/net/st_gcn.py", line 266, in forward
    x = self.data_bn(x)
  File "/home/zhouyuxuan.zyx/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zhouyuxuan.zyx/software/anaconda3/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 167, in forward
    return F.batch_norm(
  File "/home/zhouyuxuan.zyx/software/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2281, in batch_norm
    return torch.batch_norm(
RuntimeError: running_mean should contain 150 elements not 300
```

So I doubt that this configuration can go from 58% to 87% just by adding extra bones to the input...

Could you please provide me the correct configuration for T-TR w/o bones? It should reach 86% top-1 accuracy without using bones.

Many thanks!

ZhouYuxuanYX commented 3 years ago

I also found that even in the provided ntu60_xsub_bones_temporal.pt, weights of the TCNs are included, but their sizes do not match your provided configuration file. I obtained the following errors by setting only_temporal_attention=False in your provided configuration and retraining from ntu60_xsub_bones_temporal.pt:

```
RuntimeError: Error(s) in loading state_dict for Model:
  size mismatch for backbone.3.tcn1.qkv_conv.weight: copying a param with shape torch.Size([384, 256, 1, 2]) from checkpoint, the shape in current model is torch.Size([192, 256, 1, 2]).
  size mismatch for backbone.3.tcn1.qkv_conv.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([192]).
  size mismatch for backbone.3.tcn1.attn_out.weight: copying a param with shape torch.Size([256, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1]).
  size mismatch for backbone.3.tcn1.attn_out.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
  size mismatch for backbone.4.tcn1.qkv_conv.weight: copying a param with shape torch.Size([384, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([192, 256, 1, 1]).
  size mismatch for backbone.4.tcn1.qkv_conv.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([192]).
  size mismatch for backbone.4.tcn1.attn_out.weight: copying a param with shape torch.Size([256, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1]).
  size mismatch for backbone.4.tcn1.attn_out.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
  size mismatch for backbone.5.tcn1.qkv_conv.weight: copying a param with shape torch.Size([384, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([192, 256, 1, 1]).
  size mismatch for backbone.5.tcn1.qkv_conv.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([192]).
  size mismatch for backbone.5.tcn1.attn_out.weight: copying a param with shape torch.Size([256, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1]).
  size mismatch for backbone.5.tcn1.attn_out.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([64]).
  size mismatch for backbone.6.tcn1.qkv_conv.weight: copying a param with shape torch.Size([768, 512, 1, 2]) from checkpoint, the shape in current model is torch.Size([384, 512, 1, 2]).
  size mismatch for backbone.6.tcn1.qkv_conv.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([384]).
  size mismatch for backbone.6.tcn1.attn_out.weight: copying a param with shape torch.Size([512, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 1, 1]).
  size mismatch for backbone.6.tcn1.attn_out.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
  size mismatch for backbone.7.tcn1.qkv_conv.weight: copying a param with shape torch.Size([768, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 512, 1, 1]).
  size mismatch for backbone.7.tcn1.qkv_conv.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([384]).
  size mismatch for backbone.7.tcn1.attn_out.weight: copying a param with shape torch.Size([512, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 1, 1]).
  size mismatch for backbone.7.tcn1.attn_out.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
  size mismatch for backbone.8.tcn1.qkv_conv.weight: copying a param with shape torch.Size([768, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([384, 512, 1, 1]).
  size mismatch for backbone.8.tcn1.qkv_conv.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([384]).
  size mismatch for backbone.8.tcn1.attn_out.weight: copying a param with shape torch.Size([512, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 128, 1, 1]).
  size mismatch for backbone.8.tcn1.attn_out.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
```

I hope you can double-check the provided configuration file and let me know when you find the correct one. Thank you!

Chiaraplizz commented 3 years ago


This error arises because you set only_temporal_attention=False; it should instead be set to True.

Chiara

Chiaraplizz commented 3 years ago


I should check whether the configuration I uploaded is correct; I will do it in the next few days because I need access to an old server. In the meantime, could you please check whether the same error happens when you load the T-TR x-view configuration?

Kind regards,
Chiara

nadaselham commented 3 years ago

```
Traceback (most recent call last):
  File "main.py", line 982, in <module>
    processor.start()
  File "main.py", line 894, in start
    self.train(epoch, save_model=save_model)
  File "main.py", line 549, in train
    output = self.model(data, label, name)
  File "/home/nada/anaconda3/envs/nada2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nada/anaconda3/envs/nada2/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/nada/anaconda3/envs/nada2/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/nada/anaconda3/envs/nada2/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
    raise output
  File "/home/nada/anaconda3/envs/nada2/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
    output = module(*input, **kwargs)
  File "/home/nada/anaconda3/envs/nada2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nada/Graph/STTR_selection/net/st_gcn.py", line 268, in forward
    x = self.data_bn(x)
  File "/home/nada/anaconda3/envs/nada2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nada/anaconda3/envs/nada2/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 83, in forward
    exponential_average_factor, self.eps)
  File "/home/nada/anaconda3/envs/nada2/lib/python3.7/site-packages/torch/nn/functional.py", line 1697, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: running_mean should contain 150 elements not 300
```

Dear Chiara, I faced this problem while running your code with this configuration on x-view. I hope you can check the provided files and let me know where the problem is. Thank you!

mexiQQ commented 2 years ago


The model `channel` should be 3 for joints-only data and 6 for joint+bone data.
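The 150-vs-300 numbers in the error are consistent with the data-layer BatchNorm being sized to channel * num_point * num_person. A quick sanity check of the arithmetic (a hypothetical helper for illustration, not code from the repository):

```python
def data_bn_features(channel, num_point=25, num_person=2):
    """Feature count of a BatchNorm applied over the flattened (C, V, M) axes."""
    return channel * num_point * num_person

print(data_bn_features(3))  # -> 150 (joints only)
print(data_bn_features(6))  # -> 300 (joints + bones)
```

So a checkpoint trained with channel: 6 carries a 300-element running_mean, and a model built with channel: 3 expects only 150, which is exactly the mismatch reported above.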

nadaselham commented 2 years ago

Thank you very much, I have solved the issue. Appreciate it
