Thanks for the positive feedback!
Our approach takes 3D joint positions as input for ground reaction force estimation and foot contact detection; footskate cleaning additionally requires joint angles (which can be obtained from 3D joint positions through retargeting to any skeleton present in our database).
To handle video or images, our method must be adapted, e.g. by first obtaining 3D joint positions from the video or images (e.g. with AlphaPose).
Thanks for your prompt reply.
For ground reaction force estimation and foot contact detection, must the 3D joint positions be retargeted to a skeleton present in your database?
Actually, I obtain 3D joint positions with an SMPL regressor. The difference between the standard SMPL skeleton and the database's skeleton is as follows.
Fig. 1 is the standard SMPL skeleton.
Fig. 2 is your database skeleton.
You said "that can be obtained from 3D joint positions through retargeting to any skeleton present in our database".
In other words, should I retarget the SMPL skeleton (Fig. 1) to the database skeleton (Fig. 2)? If yes, do you have any suggestions on how to retarget?
Yes, retargeting to a skeleton of our database will yield better results with the pretrained network, since during training it has only seen the skeleton depicted in your Fig. 2. I actually made a few attempts on motions from AMASS to see how our method generalises, and the contact detection results were visually pretty good (see below).
Retargeting in that very specific case is not very complicated since in the end the network only uses joint positions. I did a quick numerical optimization of joint angles (e.g. parametrized with quaternions) such that those joint angles, applied to a skeleton from our database, yield joint positions (computed via forward kinematics) close to the input joint positions (e.g. from the SMPL skeleton).
Here are a few key points on how I was doing it (see the sketch below):
- joint angles are parametrized with quaternions and optimized numerically;
- only distances between matching pairs of joints are minimized;
- the number of iterations is kept small to stop the optimization before retargeting artifacts appear.
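As an illustration, a minimal self-contained sketch of such a position-matching optimization could look like the following. This is not the actual 'retarget_to_underpressure' code: it uses a toy planar chain skeleton and scalar angles instead of quaternions to stay short.

import torch

# Toy forward kinematics for a planar chain: each bone extends from its parent,
# rotated by the accumulated joint angles.
def forward_kinematics(angles, bone_lengths):
    # angles: (T, J) radians; bone_lengths: (J,)
    cum = torch.cumsum(angles, dim=-1)  # accumulated rotation per joint
    offsets = torch.stack((torch.cos(cum), torch.sin(cum)), dim=-1) * bone_lengths[:, None]
    return torch.cumsum(offsets, dim=-2)  # (T, J, 2) joint positions

def naive_retarget(target_positions, bone_lengths, niters=150, lr=0.05):
    # Optimize joint angles so that FK on the target skeleton matches the input
    # positions; the limited iteration count stops before artifacts appear.
    angles = torch.zeros(target_positions.shape[:2], requires_grad=True)
    optimizer = torch.optim.Adam([angles], lr=lr)
    for _ in range(niters):
        optimizer.zero_grad()
        loss = (forward_kinematics(angles, bone_lengths) - target_positions).pow(2).mean()
        loss.backward()
        optimizer.step()
    return angles.detach()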
Here are a few examples of foot contact detection on samples from AMASS (the original skeleton is drawn, but the retargeted motion is used for detection):
Thanks for your detailed and patient guidance. I would like to use your approach, whose generalization is quite good, but I lack the basic retargeting skills. Would you mind releasing the code that retargets the SMPL skeleton to your database's skeleton? If that is inconvenient, could you send it by email: qinbr@outlook.com
Sure, I added my prototype code to run contact detection on examples from AMASS in 'demo.py'. Look at the functions 'retarget_to_underpressure' and 'contacts_detection_from_amass', as well as the 'if __name__ == "__main__"' part at the end of the file.
You will have to provide the 3D joint positions as a single tensor of shape ... x T x J x 3 where T is the number of frames and J the number of joints, the framerate of the input joint positions, and a skeleton for the retargeting (the first of our database by default).
Another important thing is the correspondence of joints between the input and the UnderPressure skeleton topology, which is defined at line 83 ('AMASS_JOINT_NAMES = ...'). 'retarget_to_underpressure' will operate according to the joint names present both in 'AMASS_JOINT_NAMES' and in the UnderPressure joint names (see 'TOPOLOGY' in 'data.py'): distances between matching joints only will be minimized. You will probably have to edit 'AMASS_JOINT_NAMES' according to your skeleton topology.
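For instance, preparing the input file could look like this minimal sketch (the 24 joints, 30 fps framerate and file name are placeholder assumptions, not values required by demo.py):

import torch

# Hypothetical input preparation: T = 120 frames of J = 24 estimated joints.
joint_positions = torch.randn(120, 24, 3)  # ... x T x J x 3 (extra leading batch dims allowed)
torch.save(joint_positions, "my_motion.pt")  # demo.py reads this back with torch.load
framerate = 30.0  # framerate of the input joint positions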
Sorry for the late reply; I didn't get an email notification. I will try it. Thanks again for your patience and detailed guidance.
@mourotlucas I have the same question and am now trying it. I built the dataset with the default dataset, but when I run 'python demo.py contacts_from_amass', it shows an error because the path is '*'. I also looked at the code; it loads the joints. Can you give me a demo command?
@qinb were you successful with SMPL? When I run 'python demo.py contacts_from_amass', do I need to set args.path to a local points file path containing the SMPL 24 points?
@qinb @mourotlucas Are the steps: 1) get the SMPL 24 3D points from the video; 2) get the angles and trajectory via 'contacts_from_amass' in demo.py; 3) get the result via 'cleanup' in demo.py with the args from step 2? Is that right?
@qinb were you successful with SMPL? Any help would be appreciated!
The argument '-path' shouldn't have the default value '*'; I removed it. The command should be 'python demo.py contacts_from_amass -path some_path' where 'some_path' points to a file containing 3D joint positions (saved with pytorch, i.e. torch.save, since demo.py uses torch.load at line 179).
As explained above, the 'contacts_detection_from_amass' method in 'demo.py' is prototype code to run contact detection on examples from AMASS, and 'AMASS_JOINT_NAMES' might have to be edited according to the joints provided.
A possible workflow to estimate joint positions from video and clean footskate as a post-processing step following our method could be the following:
1. estimate 3D joint positions from the video (e.g. with AlphaPose);
2. retarget the estimated 3D joint positions from the input skeleton topology (e.g. AMASS/SMPL) to the UnderPressure skeleton topology;
3. detect foot contacts with our pretrained network and clean footskate with our optimization-based cleanup.
@mourotlucas
In the function 'retarget_to_underpressure':
Q1: about the arguments: should we get the default 23-joint skeleton via TOPOLOGY from our AMASS_JOINT_NAMES (where len(AMASS_JOINT_NAMES) may be > 23), so that len(joint_positions) == 23, and then run the retargeting?
Q2: if I want to retarget from the default 23-joint skeleton to my own skeleton, do I need to change TOPOLOGY and joint_positions?
Q3: for niters, is more always better?
Can you share the test joint data file used for amass.mp4?
Q1: Since our deep neural network has been trained on our dataset (containing foot contact ground truth), it requires the UnderPressure skeleton topology as input. To adapt to other skeleton topologies (e.g. the AMASS skeleton topology), motion data must first be retargeted from the input skeleton topology to the UnderPressure skeleton topology, which is what the method 'retarget_to_underpressure' attempts to do. Given input joint positions 'joint_positions', this method optimizes joint angles such that the joint positions computed by forward kinematics with the given skeleton 'skeleton' best match the input joint positions. To do so, differences between matching pairs of joints are minimized. The argument 'joint_names' parametrizes (demo.py, lines 43 to 49) which input joints must be paired with which UnderPressure skeleton joints (which are defined in data.py, lines 19 to 43).
Q2: As explained in a previous comment (step 2), you need to retarget the estimated 3D joint positions from the AMASS skeleton topology to the UnderPressure skeleton topology so that foot contacts can then be detected using our deep network (see above). Depending on which joints are estimated, you will have to edit the 'joint_names' argument (set by default to AMASS_JOINT_NAMES in demo.py) such that 1) the order of the joint names matches the order of the joint positions in the argument 'joint_positions' and 2) estimated joints corresponding to UnderPressure skeleton joints are named accordingly (see above and data.py lines 19 to 43). Only joints with matching names in 'joint_names' and the UnderPressure skeleton joints will be used in the optimization loop of 'retarget_to_underpressure'.
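To illustrate the pairing rule with a small standalone snippet (the joint names below are placeholders, not the actual names from 'TOPOLOGY' in data.py):

# Only joints whose names appear in both lists contribute to the retargeting loss.
input_names = ["pelvis", "left_knee", "right_knee", "head", "left_hand"]  # estimator order
underpressure_names = ["pelvis", "left_knee", "right_knee", "head"]  # stand-in for TOPOLOGY names
pairs = [(input_names.index(n), underpressure_names.index(n))
         for n in input_names if n in underpressure_names]
# distances between these matched joint pairs only are minimized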
Q3: As observed empirically, no. Retargeting is a difficult task consisting of transferring motion from a source skeleton to a different target skeleton, and it is not well defined. The optimization done in 'retarget_to_underpressure', which minimizes joint position differences, is a naive implementation of retargeting and hence introduces artifacts if the joint position differences get too low. A limited number of iterations allows stopping the optimization before artifacts appear. There is no good value for this parameter in the general case, so you might have to tune it. We empirically found the default value set in demo.py (150 iterations) to be good for a few samples from AMASS.
Which amass.mp4? If you mean data samples from AMASS, unfortunately no. Please see #5
@mourotlucas I have gone through many steps in the code. When I run the 'cleanup' function in demo.py and prepare to show the result in Blender, I have a few questions about the result of: angles, skeleton, trajectory = cleaner(item["angles"], item["skeleton"], item["trajectory"])
Q1: are the angles in world space or in skeleton space? If in world space, how do I translate them to skeleton space with the parent-child hierarchy?
Q2: is the trajectory in world space? I notice that when showing the result, some calculation is done with the ground value, and the result is then shown from the trajectory and the ground value.
Q2.1: is the trajectory value the pelvis position, or the whole-body root position? I find it too large in the Z direction.
Q3: is the order of the angles the same as the default 23 skeleton joints?
Q4: in MocapAndvGRFsApp(), I notice it needs global_position and global_trajectory, and gets a new angle from the angles of two frames in Skeleton.update(). How is this done in detail? I can't find where the args come from in def update(self, frame: int, prev_frame: int, next_frame: int, dframe: float).
Q1: both the angles and the skeleton are expressed w.r.t. the global frame, which seems to be what you call world space. If I understand well, what you call skeleton space is expressing joint angles w.r.t. their parent joints. To translate, you must express each global joint orientation with respect to the global orientation of its parent (like expressing a position w.r.t. another, but with 3D rotations, which are not commutative, so be careful).
Q2: yes.
Q2.1: not exactly; in our implementation the trajectory is simply added to all joint positions after FK. The reason is that it allows having a single (static) skeleton for an entire sequence, stored separately from a temporally changing global position (= trajectory).
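For Q1, a minimal sketch of that global-to-local conversion, assuming unit quaternions in (w, x, y, z) convention and a hypothetical 'parents' array describing the hierarchy (root has parent -1), could be:

import torch

def quat_conjugate(q):
    # for unit quaternions, the conjugate is the inverse; (w, x, y, z) convention
    return torch.cat([q[..., :1], -q[..., 1:]], dim=-1)

def quat_mul(a, b):
    # Hamilton product of quaternion tensors of shape (..., 4)
    aw, av = a[..., :1], a[..., 1:]
    bw, bv = b[..., :1], b[..., 1:]
    w = aw * bw - (av * bv).sum(dim=-1, keepdim=True)
    v = aw * bv + bw * av + torch.cross(av, bv, dim=-1)
    return torch.cat([w, v], dim=-1)

def globals_to_locals(global_quats, parents):
    # global_quats: (T, J, 4); parents[j] is the parent index of joint j
    local_quats = global_quats.clone()
    for j, p in enumerate(parents):
        if p >= 0:
            # local_j = inverse(global_parent) * global_j
            local_quats[:, j] = quat_mul(quat_conjugate(global_quats[:, p]), global_quats[:, j])
    return local_quats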
ps: pose representations in character animation are a bit involved to work with, especially when going down into the details. If you are not already familiar with them, I suggest you take a look at section 2.1 of our survey paper https://arxiv.org/pdf/2110.06901.pdf, and if that is still too involved you might want to follow a course or textbook on character animation and/or representations of 3D rotations.
@mourotlucas https://github.com/InterDigitalInc/UnderPressure/issues/3#issuecomment-1372000266 About Q1, do you have any advice? Translating the result from global space to skeleton space would make it more useful in later steps.
Another thing to try: can we use 3D keypoints in camera coordinate space to do the cleanup?
If you are not comfortable with implementing it, check out existing animation frameworks, e.g. pynimation.
Our network has been trained in global world coordinate space, so there is no reason foot contact detection would work well in camera coordinate space. The same goes for our proposed cleanup approach, since it relies on our network at each optimization iteration.
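If the camera extrinsics happen to be known, the keypoints can first be mapped back to world coordinates before running the network; a minimal sketch, assuming a known rotation R and translation t such that p_cam = R @ p_world + t (these are not provided by our code):

import torch

def camera_to_world(points_cam, R, t):
    # points_cam: (..., 3); R: (3, 3); t: (3,)
    # p_world = R^T @ (p_cam - t), applied row-wise
    return (points_cam - t) @ R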
I have read the pynimation docs for several hours; the human.fbx base pose is a T-pose. This is my code to translate the angles from world space to skeleton space:
from pynimation.anim.animation import Animation
from pynimation.common import data, vec

angles, skeleton, trajectory = cleaner(item["angles"], skeleton, item["trajectory"])
index = [0, 9, 10, 11, 12, 21, 22, 17, 18, 19, 20, 13, 14, 15, 16, 1, 2, 3, 4, 5, 6, 7, 8]
fbx_file = data.getDataPath(r'data/animations/human.fbx')
animation = Animation.load(fbx_file)  # the default joint order after loading is changed; use index to fix it
# global quaternions
gs = animation.globals[:, index]
for angle, g in zip(angles, gs):  # iterate over frames
    # only change the body direction
    g.root.rotation.quat = vec.Vec4(angle[0])
animation.save('rootQuat.fbx')
However, when I open rootQuat.fbx, nothing has changed. Can you give me some advice?
Are there other animation frameworks?
@mourotlucas Maybe the steps are: 1) get the resulting angles, skeleton, trajectory from cleaner(item["angles"], skeleton, item["trajectory"]); 2) get the skeleton keypoints' global positions with anim.FK(angles, skeleton, trajectory, TOPOLOGY); 3) set the global positions in pynimation, then read back the local pose. Is that right?
Sorry but I don't know much about pynimation, I can't help here.
Thanks for sharing your excellent research. I wonder whether this method can run inference on custom 2D videos or images. If yes, please offer some instructions. @salvetjx @damienreyt @mourotlucas