DeepMotionEditing / deep-motion-editing

An end-to-end library for editing and rendering motion of 3D characters with deep learning [SIGGRAPH 2020]

BSD 2-Clause "Simplified" License

Video style transfer #144

Closed by Hellodan-77 3 years ago

Hellodan-77 commented 3 years ago

Hello! I have a question about "style_transfer". In the file utils/animation_2d_data.py, in from_openpose_json, there are these lines:

    trans_motion[:, 12, :] = (motion[:, 15, :] + motion[:, 16, :]) / 2.0
    trans_motion[:, 16, :] = motion[:, 35, :] # 25 + 10
    trans_motion[:, 20, :] = motion[:, 56, :] # 25 + 21 + 10

    trans_motion[:, 9, :] = (trans_motion[:, 0, :] + trans_motion[:, 10, :]) / 2
    trans_motion[:, 11, :] = (trans_motion[:, 10, :] + trans_motion[:, 12, :]) / 2

What does this part of the code do? What principle do you use to map the 2D joints extracted from the video onto the CMU skeleton?

HalfSummer11 commented 3 years ago

Since the skeleton that OpenPose outputs directly is different from the CMU skeleton, we use a heuristic for the mapping. There is no real "principle"; the mapping is based mostly on the visual effect of the output. We also try to keep it as simple as possible.
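The midpoint lines quoted in the question are one instance of such a heuristic: a target joint with no direct OpenPose counterpart is placed halfway between two joints that are already filled in. Below is a minimal sketch with made-up coordinates. The array shape, the `midpoint` helper, and the values are assumptions for illustration, not code from the repository.

```python
import numpy as np


def midpoint(a, b):
    """Hypothetical helper: element-wise average of two joint trajectories."""
    return (a + b) / 2.0


# Assumed layout: [frames, joints, xy]; one frame and 13 joints for brevity.
trans_motion = np.zeros((1, 13, 2))
trans_motion[:, 0, :] = [0.0, 0.0]   # made-up position for joint 0
trans_motion[:, 10, :] = [0.0, 4.0]  # made-up position for joint 10
trans_motion[:, 12, :] = [0.0, 6.0]  # made-up position for joint 12

# Same arithmetic as the quoted lines: 9 sits between 0 and 10, 11 between 10 and 12.
trans_motion[:, 9, :] = midpoint(trans_motion[:, 0, :], trans_motion[:, 10, :])
trans_motion[:, 11, :] = midpoint(trans_motion[:, 10, :], trans_motion[:, 12, :])

print(trans_motion[0, 9], trans_motion[0, 11])
```

With these values, joint 9 lands at (0, 2) and joint 11 at (0, 5), halfway along each segment.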

Hellodan-77 commented 3 years ago

In the code file, joint_map is defined as:

    joint_map = {
        0: 8, 1: 12, 2: 13, 3: 14, 4: 19, 5: 9, 6: 10, 7: 11, 8: 22,
        # 9 is somewhere between 0 & 10
        10: 1,
        # 11 is somewhere between 10 and 12
        12: 0,
        13: 5, 14: 6, 15: 7,  # 16 is a little bit further
        17: 2, 18: 3, 19: 4,  # 20 is a little bit further
    }

Take the entry 0: 8, for example. If each entry has the form x: y, what do x and y represent?

HalfSummer11 commented 3 years ago

Here x represents a joint index in the CMU skeleton, and y represents the corresponding joint index in the raw output from OpenPose. (see line 93)
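To make the x: y convention concrete, here is a rough sketch of how such a dictionary could be applied. It assumes the motion array is laid out as [frames, joints, xy], that the raw OpenPose array has 67 joints (25 body plus 21 per hand, as the `# 25 + 21 + 10` comment in the question suggests), and that the target skeleton has 21 joints. The input data is synthetic, and the loop is an illustration rather than a copy of the repository code.

```python
import numpy as np

# Mapping as quoted in the thread: CMU joint index (x) -> OpenPose joint index (y).
joint_map = {
    0: 8, 1: 12, 2: 13, 3: 14, 4: 19, 5: 9, 6: 10, 7: 11, 8: 22,
    10: 1,
    12: 0,
    13: 5, 14: 6, 15: 7,
    17: 2, 18: 3, 19: 4,
}

num_frames = 3             # hypothetical clip length
num_openpose_joints = 67   # assumption: 25 body + 21 + 21 hand keypoints
num_cmu_joints = 21        # assumption: target indices 0..20

# Synthetic input: OpenPose joint j has coordinates (j, 10 * j) in every frame,
# so the source of each copied joint is easy to read off the result.
motion = np.zeros((num_frames, num_openpose_joints, 2))
motion[:, :, 0] = np.arange(num_openpose_joints)
motion[:, :, 1] = 10 * np.arange(num_openpose_joints)

# Copy each OpenPose joint y into slot x of the target skeleton.
trans_motion = np.zeros((num_frames, num_cmu_joints, 2))
for x, y in joint_map.items():
    trans_motion[:, x, :] = motion[:, y, :]

print(trans_motion[0, 0])   # slot 0 now holds OpenPose joint 8
print(trans_motion[0, 10])  # slot 10 now holds OpenPose joint 1
```

Slots that do not appear as keys (9, 11, 16, 20) stay at zero after this loop, which is why the quoted code fills them separately.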

Hellodan-77 commented 3 years ago

Hello, I have spent a long time reading the code from line 93 to line 109 that you mentioned, but I still find it hard to follow. Could you explain it in detail? For example:

    trans_motion[:, 12, :] = (motion[:, 15, :] + motion[:, 16, :]) / 2.0
    trans_motion[:, 16, :] = motion[:, 35, :] # 25 + 10
    trans_motion[:, 20, :] = motion[:, 56, :] # 25 + 21 + 10

    trans_motion[:, 9, :] = (trans_motion[:, 0, :] + trans_motion[:, 10, :]) / 2
    trans_motion[:, 11, :] = (trans_motion[:, 10, :] + trans_motion[:, 12, :]) / 2

    motion = trans_motion
    motion[:, :, 1] = -motion[:, :, 1] # upside-down
    motion[:, :, :] -= motion[0:1, 0:1, :] # start from zero

I don't know what each line means. For example, what does motion[:, :, 1] = -motion[:, :, 1] do? And what does motion[:, :, :] -= motion[0:1, 0:1, :] do? Why is it written this way? Thank you very much!
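The thread as captured here does not include a reply to this question, so the following is only a reading based on NumPy semantics and the inline comments (`# upside-down`, `# start from zero`). Assuming the array is laid out as [frames, joints, xy], the first line negates the y coordinate of every joint in every frame, presumably because image coordinates grow downward while the skeleton is expected to point upward. The second line subtracts the position of joint 0 in frame 0 from every joint in every frame, so the motion starts at the origin. A small sketch with made-up pixel values:

```python
import numpy as np

# Hypothetical clip: 2 frames, 3 joints, (x, y) in image pixels (y grows downward).
motion = np.array([
    [[100.0, 200.0], [100.0, 150.0], [120.0, 250.0]],
    [[110.0, 210.0], [110.0, 160.0], [130.0, 260.0]],
])

# Index 1 on the last axis selects y; negating it flips the figure vertically.
motion[:, :, 1] = -motion[:, :, 1]

# motion[0:1, 0:1, :] has shape (1, 1, 2): joint 0 of frame 0, with the
# singleton axes kept so that it broadcasts against the full (2, 3, 2) array.
motion[:, :, :] -= motion[0:1, 0:1, :]

print(motion[0, 0])  # joint 0, frame 0 is now the origin
print(motion[0, 1])  # joint 1 was 50 px above joint 0 in the image, now y = +50
print(motion[1, 0])  # later frames keep their offset relative to the start
```

Writing `0:1` instead of `0` keeps both leading axes, which is what lets a single (x, y) offset broadcast over all frames and joints in one statement.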