About Training Data Problem

amituofo1996 commented 2 years ago

Hi, great work! After preparing the data and starting training, I ran into a problem.

(human_motion_prior) xiaofeng@xiaofeng-Precision-Tower-7910:/media/xiaofeng/E/code/human-motion-prior-main/human_motion_prior/train$ sh run_script.sh 1
tensorboard --logdir=logs/motion_prior/summaries
Torch Version: 1.1.0
1 CUDAs available!
Training with cuda [TITAN Xp]
Base dataset_dir is ../data/preprocess/amass_smpl_30fps_128frame.npz
self.data.shape[0] 35429
self.p3ds.shape[0] 35338
Traceback (most recent call last):
  File "train_motion_prior.py", line 18, in <module>
    run_motion_prior_trainer(ps)
  File "/media/xiaofeng/E/code/human-motion-prior-main/human_motion_prior/train/motion_prior.py", line 651, in run_motion_prior_trainer
    vp_trainer = MotionPriorTrainer(ps.work_dir, ps)
  File "/media/xiaofeng/E/code/human-motion-prior-main/human_motion_prior/train/motion_prior.py", line 307, in __init__
    ds_train = AMASSSeqDataset(data_dir=ps.dataset_dir, is_train=True)
  File "/media/xiaofeng/E/code/human-motion-prior-main/human_motion_prior/data/seq_dataloader.py", line 20, in __init__
    assert self.data.shape[0] == self.p3ds.shape[0], 'training data length should be same to the p3ds'

AssertionError: training data length should be same to the p3ds

Traceback (most recent call last):
  File "/home/xiaofeng/anaconda3/envs/human_motion_prior/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/xiaofeng/anaconda3/envs/human_motion_prior/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/xiaofeng/anaconda3/envs/human_motion_prior/lib/python3.6/site-packages/torch/distributed/launch.py", line 235, in <module>
    main()
  File "/home/xiaofeng/anaconda3/envs/human_motion_prior/lib/python3.6/site-packages/torch/distributed/launch.py", line 231, in main
    cmd=process.args)
subprocess.CalledProcessError: Command '['/home/xiaofeng/anaconda3/envs/human_motion_prior/bin/python', '-u', 'train_motion_prior.py', '--local_rank=0']' returned non-zero exit status 1.

It seems the pose data length differs from the joint (p3ds) data length. How can I fix this problem? Thanks.

amituofo1996 commented 2 years ago

Now self.data.shape[0] is 35362 and self.p3ds.shape[0] is 35338.

JchenXu commented 2 years ago

This is probably because AMASS has updated its data. You can generate the p3ds data on the fly by uncommenting line #445, then save it without shuffling for later use.
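
For anyone hitting the same assertion, here is a minimal sketch of the idea, not the repository's actual code. It checks that the pose array and the regenerated p3ds array have the same number of sequences before saving them together in their existing order. The helper names `length_mismatch` and `save_aligned`, the `.npz` keys `data` and `p3ds`, and the array shapes are all assumptions made for illustration; only the two lengths are taken from this thread.

```python
import numpy as np


def length_mismatch(data, p3ds):
    """Return data.shape[0] - p3ds.shape[0]; 0 means the arrays line up."""
    return int(data.shape[0]) - int(p3ds.shape[0])


def save_aligned(path, data, p3ds):
    """Save both arrays in their current order (no shuffling).

    Keeping the order means index i in `data` still matches index i in
    `p3ds` when the file is loaded again. The key names are hypothetical.
    """
    diff = length_mismatch(data, p3ds)
    if diff != 0:
        raise ValueError(
            "data has %d sequences but p3ds has %d (difference %d); "
            "regenerate p3ds from the same data before saving"
            % (data.shape[0], p3ds.shape[0], diff)
        )
    np.savez(path, data=data, p3ds=p3ds)


# Synthetic stand-ins using the lengths reported in this thread.
# The second dimension is arbitrary and only keeps the arrays small.
data = np.zeros((35362, 1), dtype=np.float32)
p3ds = np.zeros((35338, 1), dtype=np.float32)

diff = length_mismatch(data, p3ds)
print("sequences in data but not in p3ds:", diff)
```

A nonzero difference means the saved p3ds file was produced from a different version of the data, so regenerating it from the current data (and saving it unshuffled) should restore the one-to-one correspondence that the assertion checks.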

JchenXu / human-motion-prior

About Training Data Problem #1
