Dai-Wenxun closed this issue 1 year ago
Thanks for your suggestion. At the beginning of our implementation, we focused on making SMPL-X convenient to use and paid less attention to efficiency. We have since decided to implement a smpl-kit that makes SMPL easier and more efficient to use. If you are interested in this project, we can contribute to this repo together.
The bottleneck is in `SMPLX_Util.get_body_vertices_sequence`, since it loads the SMPL-X pretrained weights on every call. For example, there are 1319 examples of the action `walk`, and with `k == 10` the weights are loaded 13190 times, which makes the I/O time extremely long. My suggestion is to instantiate a single SMPL-X model with `batch_size=max_motion_len` and select the unmasked SMPL parameters after the model's inference.

Here is my code to test the time cost of three modes:
When `test_mode = 'cpu'`:
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 100/100, 1.7 task/s, elapsed: 58s, ETA: 0s

When `test_mode = 'cuda'`:
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 100/100, 1.7 task/s, elapsed: 60s, ETA: 0s

When `test_mode = 'cuda_static'`:
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 100/100, 40.2 task/s, elapsed: 2s, ETA: 0s

That is about 24x faster by throughput (40.2 vs. 1.7 task/s) when `seq_len=60`, half of `max_motion_len`.
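To make the suggestion concrete, here is a minimal sketch of the load-once pattern. It is not the benchmark script behind the numbers above. `FakeBodyModel`, `vertices_reload`, `vertices_static`, and `PARAM_DIM` are hypothetical names, and the fake model only counts how often it is constructed instead of reading real SMPL-X weights. `MAX_MOTION_LEN = 120` is inferred from `seq_len=60` being half of `max_motion_len`.

```python
MAX_MOTION_LEN = 120  # inferred: seq_len=60 is half of max_motion_len
PARAM_DIM = 3         # hypothetical per-frame parameter size


class FakeBodyModel:
    """Hypothetical stand-in for a body model with a fixed batch size."""

    load_count = 0  # how many times the "weights" were loaded

    def __init__(self, batch_size):
        # Stands in for the expensive read of pretrained weights from disk.
        FakeBodyModel.load_count += 1
        self.batch_size = batch_size

    def forward(self, params):
        # One output per frame; a real model would return mesh vertices.
        assert len(params) == self.batch_size
        return [sum(frame) for frame in params]


def vertices_reload(params):
    """Slow pattern: construct a new model for every sequence."""
    model = FakeBodyModel(batch_size=len(params))
    return model.forward(params)


_STATIC_MODEL = None


def vertices_static(params):
    """Fast pattern: reuse one model sized for the longest sequence."""
    global _STATIC_MODEL
    if _STATIC_MODEL is None:
        _STATIC_MODEL = FakeBodyModel(batch_size=MAX_MOTION_LEN)
    seq_len = len(params)
    # Pad with zero frames up to the fixed batch size.
    padding = [[0.0] * PARAM_DIM] * (MAX_MOTION_LEN - seq_len)
    out = _STATIC_MODEL.forward(list(params) + padding)
    return out[:seq_len]  # keep only the unmasked (real) frames


# Run each pattern on ten sequences and count the model loads.
seq = [[float(i), 1.0, 2.0] for i in range(60)]

for _ in range(10):
    vertices_reload(seq)
loads_reload = FakeBodyModel.load_count

for _ in range(10):
    vertices_static(seq)
loads_static = FakeBodyModel.load_count - loads_reload

print(f"reload pattern: {loads_reload} loads, static pattern: {loads_static} load")
```

The trade-off is that every call runs a full `MAX_MOTION_LEN` batch even for short sequences, so some compute is wasted on padded frames in exchange for skipping the repeated weight loading.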