eth-ait / aitviewer

A set of tools to visualize and interact with sequences of 3D data.
MIT License

Some problems with load_AMASS #35

Closed: huzijun1996 closed this issue 1 year ago

huzijun1996 commented 1 year ago

After running load_AMASS.py, load_DIP_TC.py, and load_DIP_IMU.py, I noticed a problem: unlike in load_DIP_TC.py and load_DIP_IMU.py, where the character models face me, the model in load_AMASS.py initially stands sideways to me. How can I fix this, so that the AMASS data is displayed facing me, like the rendered mp4 files on the AMASS website?

Also, after first deploying the motions from the DIP-IMU dataset in Unity, loading the motions from the AMASS dataset into Unity results in a flipped character. What should I do to align the AMASS global frame with DIP's? And if I use the AMASS dataset for training and the DIP-IMU dataset for testing, how should I preprocess the AMASS data?

huzijun1996 commented 1 year ago

Because of the problems I encountered in the Unity and aitviewer visualizations, I would like to know how you preprocess the AMASS dataset in the Deep Inertial Poser (DIP) project. If the DIP and AMASS datasets are not in the same global frame, I am afraid it will affect the training and testing of the neural network.

kaufManu commented 1 year ago

Hi,

In the DIP model everything is root-relative; as long as you follow this convention both during training and testing, there shouldn't be a problem.

For the visualization of DIP-IMU/AMASS in aitviewer and Unity (a short sketch of the conversions follows after this list):

* The data in DIP-IMU should be in a right-handed coordinate system where y is up (the same as the SMPL body coordinate system and the aitviewer's origin).

* In AMASS it appears that z is up (at least for the sequences that I looked at; I'm not sure this holds for the entire dataset). When you load an AMASS sequence with the `SMPLSequence.from_amass` method, the sequence is automatically rotated such that y is up, so that it appears upright in the aitviewer coordinate system. This is achieved with the rotation `self.rotation = np.matmul(np.array([[1, 0, 0], [0, 0, 1], [0, -1, 0]]), self.rotation)` in the initializer of a `SMPLSequence`.

* AFAIR Unity uses a left-handed (!) coordinate system where y is up, which means that in order to display SMPL sequences in Unity you have to convert to a left-handed coordinate system, and you have to account for this in Unity itself. In my own experience with Unity, it is best to keep the data in the original coordinate system (whatever that is) for all computations, and to account for Unity's left-handedness (flipping the x-axis) only as the very last step, for visualization. You can see how the DIP project does this, e.g. [here](https://github.com/eth-ait/dip18/blob/eadd5f60db8975d499651d34268ba97b5282dccc/live_demo/SMPLPoseUpdater.cs#L82).
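As a rough illustration of the two conversions above, here is a minimal sketch (an editor's illustration, not code from aitviewer or DIP; the function names are made up):

```python
import numpy as np

# Rotation that maps a z-up right-handed frame to a y-up right-handed frame
# (the same matrix that SMPLSequence applies to AMASS data).
Z_UP_TO_Y_UP = np.array([[1, 0, 0],
                         [0, 0, 1],
                         [0, -1, 0]], dtype=np.float64)

def amass_to_aitviewer(vertices):
    """Rotate (N, 3) z-up AMASS vertices so that y points up."""
    return vertices @ Z_UP_TO_Y_UP.T

def to_unity(vertices):
    """Convert y-up right-handed vertices to Unity's left-handed convention
    by flipping the x-axis; do this only as a final step, for display."""
    flipped = vertices.copy()
    flipped[:, 0] *= -1
    return flipped
```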

Closing this issue - feel free to re-open it if you have more questions.

huzijun1996 commented 1 year ago


Thank you for your answer. I preprocessed ACCAD/Male2General_c3d/A6- Box lift_poses.npz from the AMASS dataset according to your rotation method. I found that after first deploying the motions from the DIP-IMU dataset in Unity and then loading the processed motions from the AMASS dataset into Unity (video 1), the result is flipped compared to the official render provided by AMASS (video 2). However, if I modify the rotation, it becomes correct (video 3); this is achieved with the rotation `self.rotation = np.matmul(np.array([[-1, 0, 0], [0, 0, 1], [0, 1, 0]]), self.rotation)` in the initializer of a `SMPLSequence`. Can you tell me whether my rotation matrix differs from the one you propose because of the difference between left-handed and right-handed coordinate systems? Do these two different rotation variants affect training and testing?

https://github.com/eth-ait/aitviewer/assets/56211942/be27609c-76e6-4d68-a70c-f13e183dfa20

https://github.com/eth-ait/aitviewer/assets/56211942/d1f9181d-be12-439a-bb98-f575224d65fd

https://github.com/eth-ait/aitviewer/assets/56211942/d1f5bb48-be2a-495c-a738-3c195ab67944
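For reference, a quick numeric check (an editor's sketch, not from the thread) of how the modified matrix relates to the original one:

```python
import numpy as np

R_orig = np.array([[1, 0, 0], [0, 0, 1], [0, -1, 0]])  # z-up -> y-up
R_mod = np.array([[-1, 0, 0], [0, 0, 1], [0, 1, 0]])   # the modified rotation

# Factor D such that R_mod = D @ R_orig:
D = R_mod @ R_orig.T
print(D)                     # diag(-1, 1, -1): x- and z-axes flipped,
                             # i.e. a 180-degree turn about the y (up) axis.
print(np.linalg.det(R_mod))  # ~1.0: R_mod is still a proper rotation.
```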

kaufManu commented 1 year ago

I'm not sure I understand, or maybe my assumptions are wrong. What is the difference between wrong0.mp4 and right.mp4? Is `self.rotation` already applied in wrong0.mp4? That would make sense, because `self.rotation` rotates the sequence so that y is up, and Unity uses y up. I would expect that your right.mp4 is obtained after additionally flipping the x-axis, because Unity is left-handed.

Regarding training and testing of DIP: DIP normalizes its inputs to be relative to the root IMU, in which case the different conventions shouldn't affect training/testing. However, if your model deviates from that, you will have to account for it.
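To make "relative to the root IMU" concrete, here is a minimal sketch of this kind of normalization (an editor's illustration of the general idea, not the actual DIP preprocessing code; the function name is made up):

```python
import numpy as np

def normalize_to_root(sensor_rots, root_rot):
    """Express global IMU orientations relative to the root (pelvis) IMU.

    sensor_rots: (N, 3, 3) global rotation matrices of the body-worn sensors.
    root_rot:    (3, 3) global rotation matrix of the root sensor.
    Returns (N, 3, 3) root-relative rotations R_root^T @ R_sensor, which are
    invariant to any global rotation applied to all sensors at once.
    """
    return np.einsum('ij,njk->nik', root_rot.T, sensor_rots)
```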

huzijun1996 commented 1 year ago


Yes, your understanding is exactly correct: `self.rotation` has already been applied in wrong0.mp4; otherwise the direct display would look like original.mp4. right.mp4 has undergone an extra flip, i.e., the extra matrix is multiplied on top of what is shown in wrong0.mp4. May I ask what you mean by 'conventions'? Do you mean that the different coordinate systems do not affect the training and testing of the DIP and AMASS datasets, only their viewing? Regardless of the coordinate system used in wrong0.mp4, original.mp4, and right.mp4, can they all be trained and tested properly?

https://github.com/eth-ait/aitviewer/assets/56211942/b70c09aa-0d7a-4589-93fb-c56660e26169

kaufManu commented 1 year ago

Ah okay, then everything makes sense.

> May I ask what you mean by 'conventions'? Do you mean that the different coordinate systems do not affect the training and testing of the DIP and AMASS datasets, only their viewing? Regardless of the coordinate system used in wrong0.mp4, original.mp4, and right.mp4, can they all be trained and tested properly?

I should be more precise, sorry:

* As the inputs to DIP are relative to the root IMU, this is independent of coordinate system conventions on the outputs (i.e., SMPL). But the root IMU should be consistent across all training and test data.

* For the outputs, I would pick one coordinate system convention and use only that one during training. It probably makes sense to pick the one that you use for synthetic data generation, i.e., AMASS.

* Then at test time, the predicted SMPL poses will be in whatever coordinate system you picked during training.

* All of this is independent of the visualization: depending on which convention you pick, the transformations to make the visualization look "correct" might be slightly different. If you pick the AMASS coordinate system, then what you described above should work to visualize both training and test data.

As a side note: when we apply `self.rotation` in the aitviewer to a SMPL sequence, we apply it to the vertices of SMPL (`self.rotation` is factored into the model matrix, which is pushed to a shader as part of the model-view-projection matrix). Importantly, applying `self.rotation` to the vertices is not the same as applying it to the root orientation of SMPL (i.e., the 0-th joint angle in the SMPL parameters; see issue #5 for a clarification). A short sketch of this difference follows below.
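To make the side note concrete, here is a minimal sketch of the two options (an editor's illustration with made-up names, not aitviewer internals):

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

# The y-up rotation used for AMASS sequences.
R_VIEW = np.array([[1, 0, 0], [0, 0, 1], [0, -1, 0]], dtype=np.float64)

def rotate_vertices(vertices):
    """Option 1: rotate the posed (N, 3) vertices, as the viewer's model
    matrix does. The SMPL parameters themselves stay untouched."""
    return vertices @ R_VIEW.T

def rotate_root_orientation(root_aa):
    """Option 2: bake the rotation into the (3,) axis-angle root orientation
    (the 0-th SMPL joint). Note this does not move the root translation,
    which would have to be rotated separately."""
    root_mat = R.from_rotvec(root_aa).as_matrix()
    return R.from_matrix(R_VIEW @ root_mat).as_rotvec()
```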

huzijun1996 commented 1 year ago


Thank you for your answer. I have a few more questions in response:

  1. Does this mean that, because the inputs are relative to the root IMU, training and testing with the DIP and AMASS datasets can each stay in their own coordinate conventions, independent of the final visualization?
  2. Does that mean applying `self.rotation` to AMASS doesn't matter much for training, and I just need to fix the axes when visualizing?