Hello. You can use PyMo to convert BVH files to 3D joint position data and then build the dataset in LMDB format. Here is the make_lmdb script for the TED dataset, for your reference; you will need to write your own version.
mean_pose is the mean of all 3D poses in the training set. mean_dir_vec is the mean of all directional vectors, that is, the mean of the outputs of convert_pose_seq_to_dir_vec.
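As a rough illustration of the two means described above, here is a minimal NumPy sketch. It assumes the joint positions are already available as arrays of shape (frames, joints, 3). `BONE_PAIRS`, `pose_seq_to_dir_vec`, and `compute_means` are hypothetical names for a toy 4-joint skeleton, and the unit-length normalization and the flattening of the results are assumptions. The repository's convert_pose_seq_to_dir_vec may differ in its details.

```python
import numpy as np

# Hypothetical (parent, child) joint index pairs for a toy 4-joint skeleton;
# the real pairs depend on your skeleton definition.
BONE_PAIRS = [(0, 1), (1, 2), (1, 3)]


def pose_seq_to_dir_vec(pose_seq, bone_pairs=BONE_PAIRS):
    """Turn joint positions (frames, joints, 3) into unit bone vectors (frames, bones, 3)."""
    parents = [p for p, _ in bone_pairs]
    children = [c for _, c in bone_pairs]
    vec = pose_seq[:, children] - pose_seq[:, parents]
    norm = np.linalg.norm(vec, axis=-1, keepdims=True)
    return vec / np.maximum(norm, 1e-8)  # guard against zero-length bones


def compute_means(pose_seqs, bone_pairs=BONE_PAIRS):
    """Average poses and directional vectors over every frame of every training clip."""
    all_poses = np.concatenate(pose_seqs, axis=0)          # (total_frames, joints, 3)
    all_vecs = pose_seq_to_dir_vec(all_poses, bone_pairs)  # (total_frames, bones, 3)
    mean_pose = all_poses.mean(axis=0).reshape(-1)         # flattened (assumption)
    mean_dir_vec = all_vecs.mean(axis=0).reshape(-1)       # flattened (assumption)
    return mean_pose, mean_dir_vec


# Toy random data standing in for real 3D joint positions.
rng = np.random.default_rng(0)
clips = [rng.normal(size=(30, 4, 3)), rng.normal(size=(45, 4, 3))]
mean_pose, mean_dir_vec = compute_means(clips)
print(mean_pose.shape, mean_dir_vec.shape)  # (12,) (9,)
```

Note that both means are taken over frames of the training set only, so they should be recomputed whenever the training split or the skeleton changes.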
Hello! Could you please send me the code that contains AudioWrapper? I successfully used VideoPose3D to convert 2D poses into 3D poses, but I still need to extract audio features to finish building the training dataset. Alternatively, could you publish the entire 3D preprocessing code? I want to build a Chinese dataset following the same data processing. Thank you very much!
This is the AudioWrapper class that was missing from the code above. It extracts audio features (mel-spectrogram features, specifically), but I never used them in this study. The audio encoder network takes raw audio signals as input.
import math

import librosa
import numpy as np


class AudioWrapper:
    def __init__(self, filepath):
        # Load as mono audio resampled to 16 kHz
        self.y, self.sr = librosa.load(filepath, mono=True, sr=16000, res_type='kaiser_fast')
        self.n = len(self.y)

    def extract_audio_feat(self, video_total_frames, video_start_frame, video_end_frame):
        # roi: map the video frame range to the matching audio sample range
        start_frame = math.floor(video_start_frame / video_total_frames * self.n)
        end_frame = math.ceil(video_end_frame / video_total_frames * self.n)
        y_roi = self.y[start_frame:end_frame]

        # feature extraction
        melspec = librosa.feature.melspectrogram(
            y=y_roi, sr=self.sr, n_fft=1024, hop_length=512, power=2)
        log_melspec = librosa.power_to_db(melspec, ref=np.max)  # mels x time

        # store as float16 to save space
        log_melspec = log_melspec.astype('float16')
        y_roi = y_roi.astype('float16')

        # print('spectrogram shape: ', log_melspec.shape)
        return log_melspec, y_roi
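
The frame-to-sample mapping in extract_audio_feat can be checked in isolation. The sketch below repeats that arithmetic without loading any audio. `video_frames_to_sample_range` and `expected_melspec_frames` are hypothetical helper names, the clip length and frame counts are made-up example values, and the spectrogram width formula assumes librosa's default centered framing (center=True).

```python
import math


def video_frames_to_sample_range(video_total_frames, video_start_frame,
                                 video_end_frame, n_samples):
    """Map a video frame range to the proportional audio sample range."""
    start = math.floor(video_start_frame / video_total_frames * n_samples)
    end = math.ceil(video_end_frame / video_total_frames * n_samples)
    return start, end


def expected_melspec_frames(n_roi_samples, hop_length=512):
    """Number of spectrogram columns, assuming librosa's default center=True."""
    return 1 + n_roi_samples // hop_length


# Example values (assumptions): an 8 s clip at 16 kHz with 200 video frames.
n_samples = 8 * 16000
start, end = video_frames_to_sample_range(200, 50, 100, n_samples)
print(start, end)                            # 32000 64000
print(expected_melspec_frames(end - start))  # 63
```

Because the mapping is proportional, it works for any video frame rate as long as the audio and video cover the same duration.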
Thank you very much!
How do I get the 3D joint positions of each frame ("pose_seq") and the bone vectors ("vec_seq") from a .bvh file? And how do I get mean_dir_vec and mean_pose for the config? The make-LMDB process seems to produce only data_mean and data_std.