ouusan / some-papers


Boosting Efficiency #28

Open ouusan opened 1 month ago

ouusan commented 1 month ago

1. Revitalizing Optimization for 3D Human Pose and Shape Estimation: A Sparse Constrained Formulation (2021). Code: no.
2. Body Meshes as Points (2021). Framed as a two-class classification task (whether a grid cell includes a person or not) plus an SMPL regression task (ResNet backbone + FPN neck + SMPL head). Code: https://github.com/jfzhang95/BMP
3. FeatER: An Efficient Network for Human Reconstruction via Feature Map-Based TransformER (2022). Preserves the inherent structure of the 2D feature-map representation (a standard transformer flattens it before attention). Pipeline: coarse feature maps --> FeatER --> refined feature maps (--> 2D pose head) --> feature-map masking (challenging due to occlusion) --> masked feature maps --> FeatER --> reconstructed feature maps --> 2D-to-3D lifting module ((n, h, w) to (n, h, w, d)) --> 3D feature maps --> SMPL regressor following HybrIK (3D + mesh loss). Code: https://github.com/zczcwh/POTTER/tree/main/human_mesh_recovery
4. POTTER: Pooling Attention Transformer for Efficient Human Mesh Recovery (2023) (same author as 3). Pooling attention reduces the memory and computational burden without sacrificing performance, plus a new architecture. Basic stream: high resolution, local --> low resolution, global. HR stream: high resolution, local --> high resolution, local and global. Code: https://github.com/zczcwh/POTTER/tree/main/human_mesh_recovery
5. TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer (2023). Geometry Token Reduction (GTR): queries only the body tokens, whose number equals the number of joints, via light transformers 1/2/3 (transformers 1 and 2 query cam_token, img_feat, and joint tokens; transformer 3 queries joints_feat, cam_feat, and vertex tokens); see details in https://github.com/Frank-ZY-Dou/TORE/blob/main/build/lib/metro/modeling/bert/modeling_metro.py#L137. Image Token Pruning (ITP): learns a mapping matrix (a token heatmap) that produces a clustering over the original tokens: https://github.com/Frank-ZY-Dou/TORE/blob/main/build/lib/metro/modeling/bert/modeling_metro.py#L118-L128. Code: https://github.com/Frank-ZY-Dou/TORE
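The ITP idea in paper 5 can be sketched as follows. This is a minimal NumPy illustration, not the repo's implementation: the "token heatmap" here is a random matrix standing in for the learned layer that predicts it, and the cluster count is an arbitrary choice.

```python
import numpy as np

def image_token_pruning(tokens, num_clusters, rng):
    """Sketch of Image Token Pruning (ITP): a mapping matrix (token
    heatmap) produces a soft clustering over the original N image
    tokens, reducing them to K cluster tokens.

    tokens: (N, D) array of image tokens.
    """
    n, d = tokens.shape
    # In TORE this heatmap is predicted by a learned layer; a random
    # matrix is used here purely for illustration.
    heatmap = rng.standard_normal((num_clusters, n))
    # Softmax over the token axis: each cluster token becomes a convex
    # combination of the original tokens.
    weights = np.exp(heatmap - heatmap.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ tokens  # (K, D) clustered tokens

rng = np.random.default_rng(0)
tokens = rng.standard_normal((196, 64))  # e.g. 14x14 patch tokens
pruned = image_token_pruning(tokens, num_clusters=49, rng=rng)
print(pruned.shape)  # (49, 64): 4x fewer tokens enter the transformer
```

The attention cost downstream then scales with K instead of N, which is where the efficiency gain comes from.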

ouusan commented 1 month ago
  1. Two-class classification and SMPL regression process: https://github.com/jfzhang95/BMP/blob/main/mmdetection/mmdet/models/smpl_heads/bmp_head.py#L193 — pretrained on a large dataset for 200 epochs, then fine-tuned on the specific dataset for 50 epochs with a smaller learning rate.
  2. HybrIK detail: https://github.com/Jeff-sjtu/HybrIK/blob/main/hybrik/models/layers/smpl/lbs.py#L291
  3. PoolAttention: https://github.com/zczcwh/POTTER/blob/main/human_mesh_recovery/hybrik/models/pool/poolattnformer_HR.py#L118
  4. FastMETRO_Body_Network: https://github.com/Frank-ZY-Dou/TORE/blob/main/tore/modeling_fm/bert/modeling_metro.py#L98
ouusan commented 1 month ago

3. Related works: Efficient Methods for HPE and HMR, computational complexity: https://github.com/ouusan/some-papers/issues/17 — the SMPL regressor follows HybrIK (3-18): https://arxiv.org/pdf/2011.14672, code: https://github.com/Jeff-sjtu/HybrIK

  1. (4-41) MetaFormer Is Actually What You Need for Vision: https://arxiv.org/pdf/2111.11418, code: https://github.com/sail-sg/poolformer. We noticed that the pooling token mixer (https://github.com/sail-sg/poolformer/blob/main/models/poolformer.py#L115-L126) returns self.pool(x) - x. The input is subtracted because the MetaFormer block already wraps the token mixer in a residual connection: the subtraction cancels the identity the residual adds back, so the mixer branch plus the residual reduces to plain average pooling while still fitting the residual-block structure that helps preserve information and keeps learning efficient.
  5. Related works: Token Reduction for Transformers. Inspired by recent advances in token pruning [16, 59, 56, 47], the authors propose Image Token Pruning (ITP).
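The `self.pool(x) - x` token mixer can be reproduced in a few lines. This is a NumPy sketch of the idea (the repo uses `nn.AvgPool2d` with `count_include_pad=False`; here same-size output is approximated with edge padding), not the PoolFormer code itself:

```python
import numpy as np

def pool_mixer(x, pool_size=3):
    """Sketch of PoolFormer's token mixer: average pooling minus the
    input, mirroring `self.pool(x) - x`.

    x: (H, W, C) feature map; output has the same shape.
    """
    h, w, c = x.shape
    pad = pool_size // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    pooled = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            # Local average over a pool_size x pool_size window.
            pooled[i, j] = xp[i:i + pool_size, j:j + pool_size].mean(axis=(0, 1))
    # Subtract the input: combined with the block's residual connection
    # (out = x + mixer(x)), the branch contributes pool(x) - x + x = pool(x).
    return pooled - x

# On a constant input, pooling is the identity, so the mixer outputs zeros.
out = pool_mixer(np.ones((4, 5, 2)))
```

Note that on a constant feature map the mixer output is exactly zero, which makes the cancellation with the residual connection easy to verify.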
ouusan commented 14 hours ago
  1. FeatER: https://github.com/zczcwh/POTTER/blob/main/human_mesh_recovery/hybrik/models/hmvit/hmvit_block.py#L100 — vit_mask: https://github.com/zczcwh/POTTER/blob/main/human_mesh_recovery/hybrik/models/HeaterWithCam.py#L72
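The feature-map masking stage that feeds FeatER's reconstruction branch can be sketched like this. This is a simplified assumption for illustration: a shared random spatial mask zeroing locations of the (N, H, W) feature maps to simulate occlusion; the exact masking scheme in the linked `vit_mask` code may differ.

```python
import numpy as np

def mask_feature_maps(feat, mask_ratio, rng):
    """Randomly zero spatial locations of 2D feature maps (simulating
    occlusion) before they are passed back through the transformer for
    reconstruction.

    feat: (N, H, W) stack of feature maps.
    Returns the masked maps and the boolean keep-mask.
    """
    n, h, w = feat.shape
    # One spatial mask shared across all N channels, as if a region of
    # the image were occluded.
    keep = rng.random((h, w)) >= mask_ratio
    return feat * keep[None, :, :], keep

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4)) + 10.0  # offset so zeros only come from masking
masked, keep = mask_feature_maps(feat, mask_ratio=0.5, rng=rng)
```

Training the transformer to reconstruct `feat` from `masked` is what makes the refinement stage robust to occluded body parts.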