NLA-ASL gen_pose.py: 'assert pose_results.shape == (batch_data['vlens'][0], 133, 3)' Assertion Error

fransisca25 commented 10 months ago

I encountered an assertion error:

Traceback (most recent call last):
  File "gen_pose.py", line 306, in <module>
    main()
  File "gen_pose.py", line 279, in main
    assert pose_results.shape == (batch_data['vlens'][0], 133, 3)
AssertionError

I have already verified that batch_data['vlens'] is a list. The video lengths vary, and I expected padding to be applied when a video has fewer than 64 frames and 64 frames to be selected when it has more, so that the length matches pose_results.shape.

Here is an example of the sample data where I attempted to print the shape:

BATCH DATA VLENS: [122, 98, 75, 61]
POSE RESULTS SHAPE:  (64, 133, 3)

The expected shape (batch_data['vlens'][0], 133, 3), which is (122, 133, 3) here, does not match pose_results.shape, which is (64, 133, 3), and this triggers the assertion error.

Could you please guide me on where I should investigate to resolve this issue?

2000ZRL commented 9 months ago

When generating pose keypoints, use the original video length (don't pad) so that keypoints are extracted for every frame.
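
To illustrate why the assertion fails, here is a minimal sketch using NumPy. It is not the code from gen_pose.py: `fake_pose_estimator`, `sample_or_pad`, and `shape_matches` are hypothetical stand-ins, and the 64-frame limit and the 133 keypoints are taken from the shapes printed above. It only shows that a clip forced to 64 frames cannot satisfy a check against the original length, while the unmodified video can.

```python
import numpy as np

NUM_KEYPOINTS = 133  # keypoints per frame, taken from the assertion in the issue
MAX_FRAMES = 64      # clip length seen in the printed pose_results shape


def fake_pose_estimator(frames):
    """Hypothetical stand-in for the pose model: one (133, 3) array per frame."""
    return np.zeros((len(frames), NUM_KEYPOINTS, 3), dtype=np.float32)


def sample_or_pad(frames, num_frames=MAX_FRAMES):
    """Hypothetical loader step that forces every clip to num_frames."""
    if len(frames) >= num_frames:
        # Pick num_frames evenly spaced frame indices.
        idx = np.linspace(0, len(frames) - 1, num_frames).astype(int)
        return frames[idx]
    # Repeat the last frame until the clip reaches num_frames.
    pad = np.repeat(frames[-1:], num_frames - len(frames), axis=0)
    return np.concatenate([frames, pad], axis=0)


def shape_matches(frames, vlen):
    """Same comparison as the assertion: one keypoint set per original frame."""
    return fake_pose_estimator(frames).shape == (vlen, NUM_KEYPOINTS, 3)


vlen = 122  # first entry of the printed vlens
video = np.zeros((vlen, 8, 8, 3), dtype=np.uint8)  # dummy frames

print(shape_matches(sample_or_pad(video), vlen))  # sampled to 64 frames
print(shape_matches(video, vlen))                 # original length, no padding
```

The first check fails because the clip has 64 frames while vlens still reports 122; the second passes because every original frame is passed to the estimator.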

fransisca25 commented 9 months ago

Thanks a lot!

FangyunWei / SLRT

NLA-ASL gen_pose.py: 'assert pose_results.shape == (batch_data['vlens'][0], 133, 3)' Assertion Error #37