ouusan / some-papers


Utilizing Attention Mechanism #24

Open ouusan opened 6 days ago

ouusan commented 6 days ago

1. PARE: Part Attention Regressor for 3D Human Body Estimation (2021). Part branch: J part attention masks and 1 background mask, plus a feature branch. code: https://github.com/mkocabas/PARE
2. Mesh Graphormer (2021). Based on 5/7; transformer + GCNN = graphormer (Graph Residual Block). code: https://github.com/microsoft/MeshGraphormer
3. Capturing humans in motion: temporal-attentive 3d human pose and shape estimation from monocular video (2022). MoCA map (NSSM as the prior + attention map) to extend the non-local operation; HAFI module (uses past and future frames to refine the temporal feature). code: https://github.com/MPS-Net/MPS-Net_release
4. Psvt: End-to-end multi-person 3d pose and shape estimation with progressive video transformers (2023). Splits the pose decoder and the shape decoder; pose-guided shape attention. code: No
5. Cross-attention of disentangled modalities for 3d human mesh recovery with transformers (2022). Progressive dimensionality reduction architecture. Camera feature (separate) is not fed to the decoder. Learns non-local joint-vertex relations and local vertex-vertex relations by mask(??), using Joint Tokens and Vertex Tokens. code: https://github.com/postech-ami/FastMETRO
6. 3d human mesh reconstruction by learning to sample joint adaptive tokens for transformers (2022) (attached in email). code: No
7. End-to-end human pose and mesh reconstruction with transformers (2021) (same as 5.). Performs position encoding by adding a template human mesh to the image feature vector (concat joints and vertices); progressive dimensionality reduction architecture. code: https://github.com/microsoft/MeshTransformer
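To make the position-encoding note in item 7 concrete, here is a minimal sketch of how I read it: each joint and vertex of a template mesh gets its 3D coordinates concatenated with the same global image feature vector, giving one input token per joint/vertex. The function name `build_query_tokens`, the toy sizes, and the coordinates-first ordering are my own assumptions for illustration, not taken from the MeshTransformer code.

```python
def build_query_tokens(image_feature, template_joints, template_vertices):
    """Return one token per template joint/vertex: [x, y, z] + image_feature.

    Hypothetical helper: joints come first, then vertices, and the
    coordinates are placed in front of the feature (both are assumptions).
    """
    tokens = []
    for xyz in list(template_joints) + list(template_vertices):
        # Same image feature for every token; only the 3D position differs.
        tokens.append(list(xyz) + list(image_feature))
    return tokens


# Toy sizes only: a 4-d "image feature", 2 joints, 3 vertices.
image_feature = [0.1, 0.2, 0.3, 0.4]
template_joints = [[0.0, 0.5, 0.0], [0.0, -0.5, 0.0]]
template_vertices = [[0.1, 0.0, 0.0], [-0.1, 0.0, 0.0], [0.0, 0.0, 0.1]]

tokens = build_query_tokens(image_feature, template_joints, template_vertices)
print(len(tokens), len(tokens[0]))
```

With real inputs the image feature would come from a CNN backbone and the template from a body-model mesh; the token sequence is then what the transformer encoder consumes.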

ouusan commented 11 hours ago

0. additional: MLP-Mixer: An all-MLP Architecture for Vision https://arxiv.org/pdf/2105.01601 code: https://github.com/google-research/vision_transformer
1. how to make an occlusion sensitivity mesh(??)
2. related: 2-5 Pose2mesh: Graph convolutional network for 3d human pose and mesh recovery from a 2d human pose (2020); 2-20 Convolutional mesh regression for single-image human shape reconstruction (2019); 2-22 End-to-end human pose and mesh reconstruction with transformers (7. above)
3. non-local operation: 3-38 Non-local neural networks, MoCA map

  1. mask attentions using the adjacency matrix obtained from the human triangle mesh of SMPL(???)
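
A minimal sketch of what masking attention with a mesh adjacency matrix could look like, to help answer the (???) above. It assumes the mask simply blocks attention between vertices that do not share an edge in the triangle mesh (self-loops kept). The 4-vertex mesh, the token values, and the helper names `adjacency_from_faces` and `masked_attention` are made up for illustration; this is not the SMPL topology and not code from any of the repos listed.

```python
import math


def adjacency_from_faces(num_vertices, faces):
    """Boolean vertex-vertex adjacency (with self-loops) from triangle faces."""
    adj = [[i == j for j in range(num_vertices)] for i in range(num_vertices)]
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            adj[u][v] = True
            adj[v][u] = True
    return adj


def masked_attention(queries, keys, values, allowed):
    """Scaled dot-product attention; allowed[i][j] == False blocks i -> j."""
    dim = len(queries[0])
    outputs, weights = [], []
    for i, q in enumerate(queries):
        # Blocked pairs get a score of -inf, so their softmax weight is 0.
        scores = [
            sum(qd * kd for qd, kd in zip(q, k)) / math.sqrt(dim)
            if allowed[i][j] else -math.inf
            for j, k in enumerate(keys)
        ]
        top = max(scores)  # finite, because every vertex may attend to itself
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        outputs.append([
            sum(w * v[d] for w, v in zip(row, values))
            for d in range(len(values[0]))
        ])
    return outputs, weights


# Toy mesh: two triangles sharing the edge (1, 2); vertices 0 and 3 are not adjacent.
faces = [(0, 1, 2), (1, 2, 3)]
adj = adjacency_from_faces(4, faces)
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
out, attn = masked_attention(tokens, tokens, tokens, adj)
print(attn[0])
```

In this toy case vertex 0 puts zero weight on vertex 3 and vice versa, which is the "local vertex-vertex relations" effect; joint tokens would be left unmasked if they should still attend to everything.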