simonalexanderson / ListenDenoiseAction

Code to reproduce the results for our SIGGRAPH 2023 paper "Listen Denoise Action"

How could we preprocess the raw dataset from ".wav" & ".bvh" files to ".pkl"? #17

Open DarLikeStudy opened 3 months ago

DarLikeStudy commented 3 months ago

Thanks for your work! It is an outstanding piece of work for dance motion generation. I have read your "LDA" paper over and over again, and I would really like to know how you preprocess the raw data with madmom and pymo. I am confused about the preprocessing of the music features (spectral flux, chroma, beat and beat activation) and of the motion features via the sklearn.pipeline route, going from .bvh through MocapParameterizer and so on. Could you share the preprocessing scripts for the dataset? Sorry for any trouble caused.
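For reference, this is how far I got on the audio side using madmom's stock processors. It is only a rough guess; I do not know whether these frame rates or settings match yours:

```python
import numpy as np
from madmom.audio.spectrogram import Spectrogram
from madmom.audio.chroma import DeepChromaProcessor
from madmom.features.beats import RNNBeatProcessor, DBNBeatTrackingProcessor
from madmom.features.onsets import spectral_flux

wav = 'example.wav'  # placeholder path

# Beat activation: frame-wise beat probability (100 fps by default)
beat_act = RNNBeatProcessor()(wav)

# Beat times decoded from the activation with a DBN tracker
beat_times = DBNBeatTrackingProcessor(fps=100)(beat_act)

# Turn beat times into a frame-aligned binary "beat" feature
n_frames = len(beat_act)
beats = np.zeros(n_frames)
beats[np.round(beat_times * 100).astype(int).clip(0, n_frames - 1)] = 1.0

# Chroma features (DeepChromaProcessor runs at 10 fps by default)
chroma = DeepChromaProcessor()(wav)

# Spectral flux computed from a magnitude spectrogram
flux = spectral_flux(Spectrogram(wav))

# These all live at different frame rates, so presumably they have to be
# resampled/aligned to the motion frame rate (e.g. 30 fps) before training.
```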

AbelDoc commented 3 months ago

+1. Although I'm not 100% sure, by reading the code and digging into the .sav file containing the data-processing pipeline used to train the dance model, I came up with some information that could be useful to you.

They use sklearn.pipeline to process their data; you can find an example in one of their other repos, which I used: https://github.com/simonalexanderson/StyleGestures/blob/master/data_processing/prepare_gesture_datasets.py

From reading the .sav file, I found that they use the following pipeline for the motion data:

```python
Pipeline([
    ('jtsel', JointSelector(
        ['Spine', 'Spine1', 'Neck', 'Head',
         'RightShoulder', 'RightArm', 'RightForeArm', 'RightHand',
         'LeftShoulder', 'LeftArm', 'LeftForeArm', 'LeftHand',
         'RightUpLeg', 'RightLeg', 'RightFoot',
         'LeftUpLeg', 'LeftLeg', 'LeftFoot'],
        include_root=True)),
    ('root', RootTransformer('pos_rot_deltas',
                             position_smoothing=...,
                             rotation_smoothing=...)),
    ('feats', MocapParameterizer('expmap', ref_pose=...)),
    ('cnst', ConstantsRemover()),
    ('cnt', FeatureCounter()),
    ('npy', Numpyfier()),
])
```

Fields with '...' indicate values I was not able to recover. They definitely differ from the defaults in the code, however, otherwise the field would not be stored in the .sav file.

The output is a numpy array, so I think you just need to pickle it into a .pkl file and you should be good to go.

For the audio features I cannot help; for my own application I will just go with MFCC features. But again, from reading the code, you need to save your features into a pandas DataFrame whose columns are the feature names (you can find examples of the column names in the .txt files in the data folder) and whose values are your features, then save this DataFrame into a .pkl file and it should also be good.

I will try to train the network using this information, and I will edit this message if it goes well.
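In case it is useful, here is roughly how I wired it up myself, following the StyleGestures script linked above. File names, paths and the cut-down joint list are mine, not theirs:

```python
import pickle
import joblib as jl
from sklearn.pipeline import Pipeline
from pymo.parsers import BVHParser
from pymo.preprocessing import (JointSelector, RootTransformer,
                                MocapParameterizer, ConstantsRemover,
                                FeatureCounter, Numpyfier)

bvh_files = ['dance_001.bvh', 'dance_002.bvh']   # your own clips

parser = BVHParser()
clips = [parser.parse(f) for f in bvh_files]

# Cut-down version of the pipeline above, with defaults where I could
# not recover the real values; substitute the full joint list for real use.
data_pipe = Pipeline([
    ('jtsel', JointSelector(['Spine', 'Spine1', 'Neck', 'Head'],
                            include_root=True)),
    ('root', RootTransformer('pos_rot_deltas')),
    ('feats', MocapParameterizer('expmap')),
    ('cnst', ConstantsRemover()),
    ('cnt', FeatureCounter()),
    ('npy', Numpyfier()),
])

out = data_pipe.fit_transform(clips)    # one (frames, n_feats) array per clip

# Save the fitted pipeline: synthesis needs it to invert features back to BVH
jl.dump(data_pipe, 'data_pipe.sav')

# Pickle each processed clip into its own .pkl
for feats, f in zip(out, bvh_files):
    with open(f.replace('.bvh', '.pkl'), 'wb') as fh:
        pickle.dump(feats, fh)
```

And for the audio side, the DataFrame-to-.pkl step I described would look something like this. The column names here are illustrative only; the real ones have to come from the .txt files in the data folder:

```python
import numpy as np
import pandas as pd

audio_feats = np.random.randn(300, 2)                   # placeholder (frames, n_feats)
feature_names = ['spectralflux_0', 'beatactivation_0']  # illustrative names only
df = pd.DataFrame(data=audio_feats, columns=feature_names)
df.to_pickle('dance_001.audio.pkl')                     # file naming is my own convention
```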

Hope it helps !

DarLikeStudy commented 3 months ago


Thanks for the reply! I have already tried to use sklearn.pipeline to preprocess the dataset. For LDA, I tried the `data_pipe.expmap30fps.sav` on the original motorica_dance dataset, following the `prepare_gesture_datasets.py` approach. If I use my own JointSelector, how should I name the columns? I am confused about how the motion-feature columns of the numpy array get their names, such as 'Hips_Yposition', 'reference_dXposition', 'reference_dZposition', 'reference_dYrotation' and so on.

AbelDoc commented 3 months ago

The JointSelector takes joint names directly, so Hips, Spine, Spine1, ... Those other names are what is stored in the DataFrame: every column is named X_Y, where X is one of the joint names you gave and Y is the channel, and the JointSelector keeps all columns whose X matches.
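To make that concrete, here is my guess at what the processed columns end up looking like. The 'reference' columns come from RootTransformer, and the alpha/beta/gamma channel names are from my reading of pymo's expmap code, so treat them as approximate:

```python
# Root columns emitted by RootTransformer('pos_rot_deltas'):
root_cols = [
    'reference_dXposition',   # frame-to-frame root translation delta, X
    'reference_dZposition',   # frame-to-frame root translation delta, Z
    'reference_dYrotation',   # root rotation delta about the vertical axis
    'Hips_Yposition',         # root height
]

# Per-joint rotations after MocapParameterizer('expmap'),
# still following the <JointName>_<channel> pattern:
spine_cols = ['Spine_alpha', 'Spine_beta', 'Spine_gamma']
```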

I was successful in launching a training run that did seem to converge well; however, the inference process was completely off and unstable, with no proper results. Unfortunately I don't have more time to spend on it, so I will have to exclude this model from my research :/

DarLikeStudy commented 3 months ago

I tried the .sav files to preprocess the data. During model training, epoch 0's synthesis step errors out: `synthesize DataLoader 0: 0%| | 0/235 [00:00<?, ?it/s]`, then `AttributeError: 'Numpyfier' object has no attribute 'org_mocap_'`. My .bvh data doesn't have the same joint names as motorica_dance, so I can't use the original .sav from the dataset. It seems to fail when decoding the data from the .pkl files back to .bvh.

AbelDoc commented 3 months ago

You cannot use their .sav file if you do not have the same joint names; you need to create your own pipeline with the Pipeline object I described above. It also means that you need to create your own .pkl files. The .sav file they provide only works with those precise .pkl files, and thus only works with the Motorica dance dataset, unfortunately.
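For what it's worth, the AttributeError you saw fits this picture: at synthesis time the code loads the .sav and calls inverse_transform on it to turn the generated features back into BVH, and as far as I can tell from pymo, Numpyfier only gets its org_mocap_ attribute when the pipeline is fitted. A rough sketch of that inverse step, with my own file names:

```python
import joblib as jl
import numpy as np
from pymo.writers import BVHWriter

# Load the pipeline that was fitted on *your* data and saved during
# preprocessing; the authors' .sav will not fit other joint names.
data_pipe = jl.load('data_pipe.sav')

# Placeholder: a generated clip from the model, shape (frames, n_feats)
generated = np.load('generated_clip.npy')

# inverse_transform runs the pipeline backwards, ending in MocapData
mocap_clips = data_pipe.inverse_transform([generated])

writer = BVHWriter()
with open('generated.bvh', 'w') as f:
    writer.write(mocap_clips[0], f)
```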

If you want to try to use their network on other data, you have to: create a pipeline, process the data, save the pipeline as .sav, save the data as .pkl, and then use their network.

But even with that this network seems really hard to train from scratch :/

DarLikeStudy commented 3 months ago

Yes, I understand. I found the problem and created my own Pipeline to process my own motion data using my own JointSelector, but the error still occurred during training. I will try to edit the .bvh files before preprocessing the dataset. I will also consider other dance datasets and projects.