dluvizon / scene-aware-3d-multi-human

Source code of the paper Scene-Aware 3D Multi-Human Motion Capture, EUROGRAPHICS 2023
https://vcai.mpi-inf.mpg.de/projects/scene-aware-3d-multi-human/

IndexError: tensors used as indices must be long, byte or bool tensors #10

Open Seeseallllll opened 9 months ago

Seeseallllll commented 9 months ago

Hi. When I run

python -m mhmocap.predict_mupots --configs_yml D:\Seesea\human\sa-3d-mh\configs\predict_mupots.yml --ts_id 1 --num_iter 100 --output_path D:\Seesea\human\sa-3d-mh\data\mupots-3d-eval\TS1\output\result_output

I got this error:

Info: writing output to D:\Seesea\human\sa-3d-mh\data\mupots-3d-eval\TS1\output\result_output\TS1
DEBUG:: joint_confidence_thr>> 0.5
DEBUG:: erode_segmentation_iters>> 0
DEBUG:: erode_backmask_iters>> 5
DEBUG:: renormalize_depth>> True
DEBUG:: post_process_depth>> True
DEBUG:: H3DHCustomSequenceData
DEBUG:: erode_segmentation_iters 0
DEBUG:: erode_backmask_iters 5
DEBUG:: use_hrnet_pose False
DEBUG:: joint_coef_thr 0.5
DEBUG:: max_num_people None
Images_path: ./data/mupots-3d-eval/TS1\images
Image data: (201, 256, 256, 3) 0 255
Depth data: (201, 256, 256) 0.0 1.0
Segmentation data: (201, 256, 256) 0 4
Background mask data: (201, 256, 256) 0 1
ROMP predictions: 201 dict_keys(['cam', 'poses', 'betas'])
Found 201 images with predictions from AlphaPose with idx: [1, 2, 3, 4, 5, 6]
AlphaPose:: found max 4 predictions per frame from AlphaPose!
AlphaPose data: (201, 4, 17, 3)
DEBUG:: pvis [1. 1. 1. 0.] threshold is 0.125
ROMP predictions (final): 201 dict_keys(['cam', 'poses', 'betas', 'valid'])
Filtering 2D poses with One-Euro filter.
DEBUG:: H3DHCustomSequenceData: using cam: {'K': array([[1.8784500e+02, 0.0000000e+00, 1.2877887e+02], [0.0000000e+00, 1.8923062e+02, 1.2986162e+02], [0.0000000e+00, 0.0000000e+00, 1.2500000e-01]], dtype=float32), 'fov': 68.54204142245207, 'Kd': None, 'image_size': (256, 256)}
0%| | 0/201 [00:00<?, ?it/s]
0%| | 0/100 [00:09<?, ?it/s]
Traceback (most recent call last):
  File "D:\Anaconda3\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "D:\Anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Seesea\human\sa-3d-mh\mhmocap\predict_mupots.py", line 100, in <module>
    log = predictor.run()
  File "D:\Seesea\human\sa-3d-mh\mhmocap\predict.py", line 343, in run
    log = self.optim_smpl.fit(self.dataloader, num_iter=self.num_iter, verbose=True)
  File "D:\Seesea\human\sa-3d-mh\mhmocap\optimizer.py", line 401, in fit
    idx_var = self.eval_batch_optimized_variables(idx_data['idxs'])
  File "D:\Seesea\human\sa-3d-mh\mhmocap\optimizer.py", line 683, in eval_batch_optimized_variables
    idx_var['min_z'] = softplus(self.zmin_lin[idxs])
IndexError: tensors used as indices must be long, byte or bool tensors

Then I tried changing the code in optimizer.py, like this:

def __eval_batch_optimized_variables(self, idxs):
    idx_var = {} # Indexed variables
    batch_size = idxs.shape[0]
    idx_var['scale_factor'] = torch.pow(1.1, self.xscale_factor)

    # Modified. Original: idx_var['min_z'] = softplus(self.zmin_lin[idxs])
    idx_var['min_z'] = softplus(self.zmin_lin[idxs.long()])

    idx_var['max_z'] = (
        idx_var['min_z'].detach().clone()
        + self.min_delta_z
        # Modified. Original: + softplus(self.zmax_lin[idxs])
        + softplus(self.zmax_lin[idxs.long()])
    )  # (batch, 1, 1)

    # Modified. Original: idx_var['poses_smpl'] = self.poses_smpl[idxs].view(-1, 72)
    idx_var['poses_smpl'] = self.poses_smpl[idxs.long()].view(-1, 72)
    idx_var['betas_smpl'] = self.betas_smpl.tile((batch_size, 1, 1)).view(-1, 10)
    # Modified. Original: idx_var['valid_smpl'] = self.valid_smpl[idxs]
    idx_var['valid_smpl'] = self.valid_smpl[idxs.long()]

    results = self.SMPLPY(betas=idx_var['betas_smpl'], poses=idx_var['poses_smpl'])
    verts = results['verts'].view(batch_size, self.num_people, -1, 3)
    joints_smpl24 = results[self.smpl_sparse_joints_key].view(batch_size, self.num_people, -1, 3)

    idx_var['poses_smpl'] = idx_var['poses_smpl'].view(batch_size, self.num_people, 72)
    idx_var['betas_smpl'] = idx_var['betas_smpl'].view(batch_size, self.num_people, 10)
    # Modified. Original: idx_var['poses_T'] = self.poses_T[idxs]
    idx_var['poses_T'] = self.poses_T[idxs.long()]

    idx_var['verts_smpl_abs'] = idx_var['scale_factor'] * verts + idx_var['poses_T'] # (batch, N, V, 3)
    idx_var['joints_smpl_abs'] = idx_var['scale_factor'] * joints_smpl24 + idx_var['poses_T'] # (batch, N, J, 3)

    # Modified. Original: idx_var['intrinsics'] = torch.tile(self.cam_intrinsics[idxs], (1, self.num_people, 1, 1)).view(-1, 3, 3)
    idx_var['intrinsics'] = torch.tile(self.cam_intrinsics[idxs.long()], (1, self.num_people, 1, 1)).view(-1, 3, 3)
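
The six edits above all apply the same cast. A minimal sketch of doing it once before any indexing, assuming idxs is a tensor of frame indices; `as_index` is a hypothetical helper, not part of the repository, and `frames` is a stand-in for a per-frame parameter such as `self.poses_T`:

```python
import torch


def as_index(idxs: torch.Tensor) -> torch.Tensor:
    """Return idxs in a dtype that every PyTorch version accepts as an index.

    Hypothetical helper: boolean masks pass through unchanged, anything
    else is converted to int64 (torch.long).
    """
    if idxs.dtype == torch.bool:
        return idxs
    return idxs.to(torch.long)


# Stand-in for a per-frame parameter tensor with 6 frames.
frames = torch.arange(12.0).view(6, 2)

# An int32 index, which is what older PyTorch releases reject.
idxs = torch.tensor([0, 2, 5], dtype=torch.int32)

selected = frames[as_index(idxs)]
print(selected)
```

Inside the method this would amount to a single `idxs = idxs.long()` at the top, leaving the remaining lines as they were in the original file.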

After that, I got this:

DEBUG:: H3DHCustomSequenceData
DEBUG:: erode_segmentation_iters 0
DEBUG:: erode_backmask_iters 5
DEBUG:: use_hrnet_pose False
DEBUG:: joint_coef_thr 0.5
DEBUG:: max_num_people None
Images_path: ./data/mupots-3d-eval/TS1\images
Image data: (201, 256, 256, 3) 0 255
Depth data: (201, 256, 256) 0.0 1.0
Segmentation data: (201, 256, 256) 0 4
Background mask data: (201, 256, 256) 0 1
ROMP predictions: 201 dict_keys(['cam', 'poses', 'betas'])
Found 201 images with predictions from AlphaPose with idx: [1, 2, 3, 4, 5, 6]
AlphaPose:: found max 4 predictions per frame from AlphaPose!
AlphaPose data: (201, 4, 17, 3)
DEBUG:: pvis [1. 1. 1. 0.] threshold is 0.125
ROMP predictions (final): 201 dict_keys(['cam', 'poses', 'betas', 'valid'])
Filtering 2D poses with One-Euro filter.
DEBUG:: H3DHCustomSequenceData: using cam: {'K': array([[1.8784500e+02, 0.0000000e+00, 1.2877887e+02], [0.0000000e+00, 1.8923062e+02, 1.2986162e+02], [0.0000000e+00, 0.0000000e+00, 1.2500000e-01]], dtype=float32), 'fov': 68.54204142245207, 'Kd': None, 'image_size': (256, 256)}
WARNING: Variable number of images in the batches. len(dataset)=201, batch_size=10
0%| | 0/201 [00:00<?, ?it/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [43:21<00:00, 26.01s/it]
D:\Seesea\human\sa-3d-mh\mhmocap\predict.py:157: RuntimeWarning: divide by zero encountered in log
  axs.plot(np.log(loss_depth), c='b', label='Depth loss')
D:\Seesea\human\sa-3d-mh\mhmocap\predict.py:159: RuntimeWarning: divide by zero encountered in log
  axs.plot(np.log(reg_vel), c='darkorange', label='Reg. 3D Pose Velocity')
D:\Seesea\human\sa-3d-mh\mhmocap\predict.py:160: RuntimeWarning: divide by zero encountered in log
  axs.plot(np.log(reg_filter_verts), c='darkgreen', label='Reg. 3D Vert. Smooth')
D:\Seesea\human\sa-3d-mh\mhmocap\predict.py:161: RuntimeWarning: divide by zero encountered in log
  axs.plot(np.log(reg_ref_poses), c='m', label='Reg. Ref. Poses')
D:\Seesea\human\sa-3d-mh\mhmocap\predict.py:162: RuntimeWarning: divide by zero encountered in log
  axs.plot(np.log(reg_scale), c='y', label='Reg. Scale')
D:\Seesea\human\sa-3d-mh\mhmocap\predict.py:163: RuntimeWarning: divide by zero encountered in log
  axs.plot(np.log(reg_contact), c='k', label='Reg. Contact')
D:\Seesea\human\sa-3d-mh\mhmocap\predict.py:164: RuntimeWarning: divide by zero encountered in log
  axs.plot(np.log(reg_foot_sliding), c='gold', label='Reg. Food Slid.')
DEBUG:: >> stage1_optvar:
scale_factor: [[[nan]] [[nan]] [[nan]]]
scene_depth_min / scene_depth_max: [[0.6931472]] [[1.3862944]]
100%|████████████████████████████████████████████████████████████████████████████████████████████████| 201/201 [00:40<00:00, 4.94it/s]
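
Two details in this output may be worth checking. First, scale_factor is nan for all three people, and the excerpt above computes verts_smpl_abs as scale_factor * verts + poses_T, so every vertex would become nan as well. Second, 0.6931472 is ln 2, which is softplus(0); this would be consistent with the depth parameters never moving from zero, if that is how they are initialized (an assumption, not confirmed here). A small sketch of a non-finite check on the printed values, where `report_non_finite` is a hypothetical helper and the toy `verts` and `poses_T` arrays are made up for illustration:

```python
import numpy as np


def report_non_finite(variables):
    """Return {name: number of NaN/inf entries} for arrays that contain any."""
    bad = {}
    for name, value in variables.items():
        arr = np.asarray(value, dtype=np.float64)
        count = int((~np.isfinite(arr)).sum())
        if count > 0:
            bad[name] = count
    return bad


# Values copied from the stage1_optvar printout.
stage1_optvar = {
    "scale_factor": [[[np.nan]], [[np.nan]], [[np.nan]]],
    "scene_depth_min": [[0.6931472]],
    "scene_depth_max": [[1.3862944]],
}
bad = report_non_finite(stage1_optvar)
print(bad)

# A NaN scale propagates to every vertex (toy shapes, illustration only).
verts = np.ones((2, 3))
poses_T = np.zeros((1, 3))
verts_abs = np.nan * verts + poses_T
all_nan = bool(np.isnan(verts_abs).all())
print(all_nan)

# softplus(0) = log(1 + exp(0)) = ln 2, matching the printed scene_depth_min.
softplus_zero = float(np.log1p(np.exp(0.0)))
print(softplus_zero)
```

Running the same check after each optimization stage would show at which point the first nan appears.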

Then I tried to visualize the result:

python -m mhmocap.visualization --input_path data\mupots-3d-eval\TS1\output\result_output\TS1 --output_path data\mupots-3d-eval\TS1\output\vis_output

I got this (image 1):

No humans, only the scene.

I tried to reinstall it, but the results didn't change. I don't know what the problem is.

dluvizon commented 9 months ago

Hi @Seeseallllll, have you checked the values in idxs before casting them to long? The first error you mentioned does not happen on my side (and it should not).

It also seems that you are running on Windows. I only tried it on Linux machines. Here are the specs where I tested the code: https://github.com/dluvizon/scene-aware-3d-multi-human#11-hwsw-requirements
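
A minimal sketch of the check suggested above, for printing just before the failing line in optimizer.py. `describe_index` is a hypothetical helper, not part of the repository. One possible explanation for the platform difference, which is a guess and not confirmed in this thread, is that the indices reach PyTorch as int32: NumPy releases before 2.0 default to 32-bit integers on Windows and 64-bit on Linux, and older PyTorch releases reject int32 index tensors with exactly this IndexError.

```python
import torch


def describe_index(idxs: torch.Tensor) -> dict:
    """Summarize a non-empty index tensor before it is cast to long."""
    as_float = idxs.to(torch.float64)
    return {
        "dtype": str(idxs.dtype),
        "min": as_float.min().item(),
        "max": as_float.max().item(),
        # False would mean a cast to long silently truncates values.
        "integral": bool(torch.equal(as_float, as_float.round())),
    }


# Example input standing in for idx_data['idxs'] from the traceback.
idxs = torch.tensor([0, 1, 2], dtype=torch.int32)
info = describe_index(idxs)
print(info)
```

If the dtype turns out to be int32 and the values are integral and within the number of frames, the cast to long should be harmless and the later nan values would need a separate explanation.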