hlcdyy / pan-motion-retargeting

Code for the paper "Pose-aware Attention Network for Flexible Motion Retargeting by Body Part" (TVCG 2023)
https://arxiv.org/abs/2306.08006
BSD 2-Clause "Simplified" License

Any example retarget SMPL to mixamo or metahuman? #2

Open lucasjinreal opened 1 year ago

lucasjinreal commented 1 year ago

Any example retarget SMPL to mixamo or metahuman?

hlcdyy commented 1 year ago

We currently don't have any examples available for converting SMPL to Mixamo or Metahuman, but you can use the ./data_preprocess/Mixamo/bvh_parser.py script to process AMASS data, and retarget the motion to Mixamo.
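For context, BVH stores the skeleton hierarchy as plain text, so the joint structure that bvh_parser.py consumes can be inspected with a few lines of generic Python (this is an illustrative sketch independent of the repo's script; bvh_joint_names is a made-up helper, not part of the codebase):

```python
def bvh_joint_names(text):
    # Collect joint names from the HIERARCHY section of a BVH file.
    # Relevant lines look like "ROOT Hips" or "JOINT Spine"; everything
    # else (OFFSET, CHANNELS, braces) is skipped.
    names = []
    for line in text.splitlines():
        tokens = line.strip().split()
        if len(tokens) == 2 and tokens[0] in ("ROOT", "JOINT"):
            names.append(tokens[1])
    return names
```

Comparing the name list from an AMASS-derived BVH against a Mixamo BVH is a quick way to check which joints will need a correspondence before retargeting.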

lucasjinreal commented 1 year ago

@hlcdyy If I want to retarget to the Metahuman skeleton, how can I gather the training data?

tshrjn commented 1 year ago

@hlcdyy The method you recommend has an issue: I'm unable to load the model weights, as the checkpoint seems to depend on the number of skeleton joints. The paper mentions this method can be used for unseen skeletons. How is that functionality achieved? Is it skeleton-agnostic, i.e. does it not need to know the topology beforehand?

I get the following error when loading the model:

  File "eval_single_pair.py", line 101, in <module>
    main()
  File "eval_single_pair.py", line 81, in main
    model.load(epoch=epoch)
  File "/path/pan-motion-retargeting/models/architecture_mixamo.py", line 304, in load
    model.load(os.path.join(self.model_save_dir, 'topology{}'.format(i)), epoch)
  File "/path/pan-motion-retargeting/models/Intergrated.py", line 201, in load
    self.load_network(self.auto_encoder, os.path.join(path, 'auto_encoder.pt'))
  File "/path/pan-motion-retargeting/models/Intergrated.py", line 219, in load_network
    network.load_state_dict(new_state_dict)
  File "/path/envs/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for MotionAE:
        size mismatch for enc.conv_residual.mask: copying a param with shape torch.Size([192, 92, 15]) from checkpoint, the shape in current model is torch.Size([192, 96, 15]).
        size mismatch for enc.conv_residual.weight: copying a param with shape torch.Size([192, 92, 15]) from checkpoint, the shape in current model is torch.Size([192, 96, 15]).
        size mismatch for dec.layers.1.1.mask: copying a param with shape torch.Size([92, 192, 15]) from checkpoint, the shape in current model is torch.Size([96, 192, 15]).
        size mismatch for dec.layers.1.1.weight: copying a param with shape torch.Size([92, 192, 15]) from checkpoint, the shape in current model is torch.Size([96, 192, 15]).
        size mismatch for dec.layers.1.1.bias: copying a param with shape torch.Size([92]) from checkpoint, the shape in current model is torch.Size([96]).
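For what it's worth, the mismatch can be isolated by loading only the parameters whose shapes agree with the current model. This is a generic PyTorch sketch, not code from this repo, and load_matching is a made-up helper name; it won't make a 92-channel checkpoint fit a 96-channel model, but it reports exactly which layers are joint-count-dependent:

```python
import torch


def load_matching(network, ckpt_path):
    """Load a checkpoint, keeping only parameters whose shapes match
    the current model; shape mismatches (e.g. 92 vs 96 channels from
    a different joint count) are skipped and returned for inspection."""
    state = torch.load(ckpt_path, map_location="cpu")
    own = network.state_dict()
    matched = {k: v for k, v in state.items()
               if k in own and own[k].shape == v.shape}
    skipped = [k for k in state if k not in matched]
    # strict=False tolerates the keys we deliberately left out.
    network.load_state_dict(matched, strict=False)
    return skipped
```

The returned list tells you which tensors would need either a retrained encoder/decoder or a joint remapping before loading.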
hlcdyy commented 1 year ago

@tshrjn In order to enable the encoders/decoders trained on the Mixamo skeletons to accept articulated motions from different structures, we employed a joint mapping similar to the NKN approach. This ensures that unseen skeleton motion representations align with the input format of the neural network architecture. For more details, you can refer to Section 6.2.3 of our paper. Alternatively, if you have some motion data for the skeleton that needs retargeting, you can build the corresponding encoder-decoder and retrain it.
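As a rough illustration of that mapping step (the joint names, target joint list, and identity-quaternion fallback below are assumptions for the sketch, not the repo's actual implementation), an unseen skeleton's per-frame rotations can be projected onto the fixed joint order the network expects via an explicit correspondence table:

```python
# Hypothetical NKN-style joint mapping: remap one frame of motion from
# an unseen skeleton onto the joint order a trained network consumes.
# TARGET_JOINTS and the identity-rotation filler are illustrative only.
TARGET_JOINTS = ["Hips", "Spine", "Neck", "Head",
                 "LeftArm", "RightArm", "LeftLeg", "RightLeg"]

IDENTITY_QUAT = (1.0, 0.0, 0.0, 0.0)  # (w, x, y, z): no rotation


def map_to_target(source_frame, correspondence):
    """source_frame: {source_joint_name: quaternion}
    correspondence: {target_joint_name: source_joint_name}
    Returns quaternions in TARGET_JOINTS order; target joints with no
    counterpart on the source skeleton fall back to the identity."""
    out = []
    for tgt in TARGET_JOINTS:
        src = correspondence.get(tgt)
        out.append(source_frame[src] if src in source_frame else IDENTITY_QUAT)
    return out
```

The correspondence table is exactly the part that must be specified per skeleton pair, which is why the method handles unseen skeletons without retraining but is not fully automatic.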

hlcdyy commented 1 year ago

@lucasjinreal I cannot provide specific advice on how to access Metahuman motion data (possibly through downloading assets from the official website). Once you have obtained motion data performed by the Metahuman skeleton, you will need to convert it to the BVH format first. Then, you can use bvh_parser.py to preprocess the data.

tshrjn commented 1 year ago

It sounds like you manually mapped from one skeleton structure to another. Is there a code reference for this in the repo?

Also, apologies if this is beyond the scope of this project, but do you happen to know any good ways to automatically map from one skeleton to another in a zero-shot fashion? Like the zero-shot pose-transfer work, but for skeletons?

hlcdyy commented 1 year ago

@tshrjn For zero-shot motion retargeting, I only know of some traditional IK-based approaches such as "Using an Intermediate Skeleton and Inverse Kinematics for Motion Retargeting". However, IK-based methods require manually specifying the joint correspondence. As far as I know, there is no automated, data-driven way to implement zero-shot skeleton mapping at the motion level.