ahmed-nady / Multimodal-Action-Recognition

Can I get the complete code? #1

Open Asia10086 opened 2 months ago

Asia10086 commented 2 months ago

Due to my limited ability, I cannot reproduce the results.

jstumpin commented 1 week ago

Due to my limited ability, I cannot reproduce the results.

+1

ahmed-nady commented 6 days ago

Thanks for your interest in our multimodal action recognition work. Of course, I will share the code here soon (within 2 weeks).

Also, here are some hints that may help you. For the multimodal network, I performed the following:

a) Preprocess the video clip and its corresponding skeletons so that they are spatially aligned, by cropping the clip's frames to the minimum bounding box that contains all 2D poses of the subject.
b) Implement a dataset class for the multimodal network that feeds the dataloader with RGB frames and the skeleton heatmap volume.
c) Implement the multimodal network with a spatial-temporal attention block.
d) Use distributed training in PyTorch (DDP) to train on multiple GPUs or multiple nodes.
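Until the code is released, step (a) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the repository's actual preprocessing: the function names (`clip_bounding_box`, `crop_frame`, `shift_pose`), the `margin` parameter, and the nested-list frame layout are all hypothetical. It computes one box over every 2D keypoint in the clip, crops each frame to it, and shifts the keypoints into the cropped coordinate system so that frames and skeletons stay aligned.

```python
import math
from typing import List, Sequence, Tuple

Keypoint = Tuple[float, float]    # (x, y) in pixel coordinates
Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1); x1 and y1 are exclusive


def clip_bounding_box(poses: Sequence[Sequence[Keypoint]],
                      width: int, height: int, margin: int = 0) -> Box:
    """Smallest box containing every keypoint of every frame, clamped to the image.

    `margin` is a hypothetical padding (in pixels) added on each side.
    """
    xs = [x for pose in poses for x, _ in pose]
    ys = [y for pose in poses for _, y in pose]
    x0 = max(math.floor(min(xs)) - margin, 0)
    y0 = max(math.floor(min(ys)) - margin, 0)
    x1 = min(math.ceil(max(xs)) + 1 + margin, width)
    y1 = min(math.ceil(max(ys)) + 1 + margin, height)
    return x0, y0, x1, y1


def crop_frame(frame: Sequence[Sequence], box: Box) -> List[Sequence]:
    """Crop one frame, stored as rows of pixels, to the box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in frame[y0:y1]]


def shift_pose(pose: Sequence[Keypoint], box: Box) -> List[Keypoint]:
    """Express keypoints relative to the top-left corner of the crop."""
    x0, y0, _, _ = box
    return [(x - x0, y - y0) for x, y in pose]


if __name__ == "__main__":
    # Toy clip: two 6x8 frames whose "pixels" are their own (row, col) indices.
    frames = [[[(r, c) for c in range(8)] for r in range(6)] for _ in range(2)]
    poses = [[(2.0, 1.0), (4.0, 3.0)], [(3.0, 2.0), (5.0, 4.0)]]

    box = clip_bounding_box(poses, width=8, height=6)
    cropped = [crop_frame(f, box) for f in frames]
    aligned = [shift_pose(p, box) for p in poses]
    print(box)  # (2, 1, 6, 5)
```

If the clip is held in a NumPy array of shape (T, H, W, C), the same crop is the slice `frames[:, y0:y1, x0:x1]`. Using a single box for the whole clip, rather than one per frame, keeps the crop size constant over time, which is what the heatmap volume in step (b) would need to match.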