caizhongang / SMPLer-X

Official Code for "SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation"
https://caizhongang.github.io/projects/SMPLer-X/

about training (video/photo) and different size of betas in datasets #24

Open GhostLate opened 1 year ago

GhostLate commented 1 year ago
  1. I saw on the main page that the SMPLer-X inference scripts expect video data as input. Is it possible to modify the model to support single images?
  2. Do you train your model using images as video (i.e. in strict sequence)?
  3. The BEDLAM dataset has 11 betas and only one neutral gender (according to the base model). AGORA has only 10 betas. How do you combine different numbers of shape parameters during training and in the model's head? Does the SMPL-X regression layer have a dynamic size?

Thank you!

Wei-Chen-hub commented 1 year ago

Hi, I can answer the first 2 questions. For Concern 1: yes, a single image is possible, you just need to treat it as a single-frame sequence; a more straightforward and simple-to-use inference script is yet to be merged, please see this issue for a quick solution. For Concern 2: we use SMPL-X instances when training the model (temporal information is not addressed). Hope this helps.
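The single-frame idea can be sketched as below. This is a minimal illustration, not the actual SMPLer-X API: `infer_fn` stands in for whatever per-sequence inference callable the pipeline exposes.

```python
import numpy as np

def run_on_single_image(image, infer_fn):
    """Wrap one image as a length-1 'video' for a sequence-based pipeline.

    `infer_fn` is a hypothetical callable that maps a list of frames to a
    list of per-frame results.
    """
    frames = [image]            # single-frame sequence
    results = infer_fn(frames)  # one result per frame
    return results[0]           # result for the only frame

# toy check with a dummy per-frame "model"
dummy = lambda frames: [f.mean() for f in frames]
img = np.ones((4, 4, 3), dtype=np.float32)
out = run_on_single_image(img, dummy)  # -> 1.0
```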

GhostLate commented 1 year ago

Thank you! After some tests: betas can be converted from 10 to 11 by appending a zero. P.s. it's not possible to convert from 11 to 10 (unless the last one is 0).

GhostLate commented 1 year ago

> Hi, I can answer the first 2 questions. For Concern 1: yes, a single image is possible, you just need to treat it as a single-frame sequence; a more straightforward and simple-to-use inference script is yet to be merged, please see this issue for a quick solution. For Concern 2: we use SMPL-X instances when training the model (temporal information is not addressed). Hope this helps.

Data converting scripts #23

I tried to convert BEDLAM to HumanData, but it was impossible: keypoints_to_scaled_bbox_bfh is used in data/data_converters/bedlam.py and is supposed to come from utils/demo_utils.py (from mmhuman3d.utils.demo_utils import keypoints_to_scaled_bbox_bfh), but it does not exist there.

@Wei-Chen-hub, Could you share, please?

GhostLate commented 8 months ago

@Wei-Chen-hub, Any ideas?

Wei-Chen-hub commented 7 months ago

Hi, I'll just put my function here. Currently we are not planning to release the HumanData files due to a shortage of hands and licensing issues.

    # requires: import numpy as np
    def _keypoints_to_scaled_bbox_bfh(self,
                                      keypoints,
                                      occ=None,
                                      body_scale=1.0,
                                      fh_scale=1.0,
                                      convention='smplx'):
        '''Obtain scaled bboxes in xyxy format for body, head and hands
        given keypoints.
        Args:
            keypoints (np.ndarray): keypoints of shape (1, n, k) or (n, k),
                with k = 2 or 3
            occ (np.ndarray, optional): per-keypoint occlusion flags
            body_scale (float): scale factor for the body bbox
            fh_scale (float): scale factor for the face/hand bboxes
            convention (str): keypoint convention, e.g. 'smplx'
        Returns:
            list of np.ndarray: bboxes in (xmin, ymin, xmax, ymax, conf)
                format, ordered as body, head, left hand, right hand
        '''
        bboxs = []

        # supported keypoints.shape: (1, n, k) or (n, k), k = 2 or 3
        if keypoints.ndim == 3:
            keypoints = keypoints[0]
        if keypoints.shape[-1] != 2:
            keypoints = keypoints[:, :2]

        for body_part in ['body', 'head', 'left_hand', 'right_hand']:
            scale = body_scale if body_part == 'body' else fh_scale
            # self.kps_body_part maps each part to its (start, end)
            # keypoint index range in the chosen convention
            bp = self.kps_body_part[body_part]
            kp_id = list(range(bp[0], bp[1]))
            kps = keypoints[kp_id]

            # mark face/hand bboxes invalid (conf = 0) when >= 10% of
            # their keypoints are occluded; the body bbox is always kept
            if occ is not None and np.sum(occ[kp_id]) / len(kp_id) >= 0.1:
                conf = 0
            else:
                conf = 1
            if body_part == 'body':
                conf = 1

            # tight bbox over the part's keypoints, scaled about its center
            xmin, ymin = np.amin(kps, axis=0)
            xmax, ymax = np.amax(kps, axis=0)

            width = (xmax - xmin) * scale
            height = (ymax - ymin) * scale

            x_center = 0.5 * (xmax + xmin)
            y_center = 0.5 * (ymax + ymin)
            xmin = x_center - 0.5 * width
            xmax = x_center + 0.5 * width
            ymin = y_center - 0.5 * height
            ymax = y_center + 0.5 * height

            bbox = np.stack([xmin, ymin, xmax, ymax, conf],
                            axis=0).astype(np.float32)
            bboxs.append(bbox)

        return bboxs
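The index ranges behind self.kps_body_part were not shared in the thread. A standalone sketch of the same center-scaling geometry, with a purely hypothetical part-to-index mapping, looks like this:

```python
import numpy as np

# hypothetical part -> (start, end) keypoint index ranges; the real
# mapping depends on the keypoint convention and was not shared here
KPS_BODY_PART = {'body': (0, 8), 'head': (8, 12)}

def scaled_bbox(keypoints, part, scale=1.0):
    """Tight bbox over the part's keypoints, scaled about its center."""
    start, end = KPS_BODY_PART[part]
    kps = keypoints[start:end, :2]
    xmin, ymin = kps.min(axis=0)
    xmax, ymax = kps.max(axis=0)
    cx, cy = 0.5 * (xmin + xmax), 0.5 * (ymin + ymax)
    hw = 0.5 * (xmax - xmin) * scale
    hh = 0.5 * (ymax - ymin) * scale
    return np.array([cx - hw, cy - hh, cx + hw, cy + hh], dtype=np.float32)

# body keypoints spanning a unit square, scaled 2x about the center
pts = np.array([[0, 0], [1, 0], [0, 1], [1, 1],
                [0.5, 0.5], [0.2, 0.8], [0.9, 0.1], [0.3, 0.3]], float)
bbox = scaled_bbox(pts, 'body', scale=2.0)  # -> [-0.5, -0.5, 1.5, 1.5]
```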

Wei-Chen-hub commented 7 months ago

I would also like to share some insights on betas. I accidentally visualized some mocap data with only the first beta (the 1st of 10 parameters) correct, and the overlay was already quite good. Perhaps only the first few betas matter. (attached image: emdb_else_240110_42_13207)

GhostLate commented 7 months ago

Thank you for sharing!

GhostLate commented 7 months ago

@Wei-Chen-hub, could you also share the self.kps_body_part mapping used by bp = self.kps_body_part[body_part] in _keypoints_to_scaled_bbox_bfh()?

jameskuma commented 5 months ago

> I would also like to share some insights on betas. I accidentally visualized some mocap data with only the first beta (the 1st of 10 parameters) correct, and the overlay was already quite good. Perhaps only the first few betas matter. (attached image: emdb_else_240110_42_13207)

Hi, could you kindly share the code for aligning the output mesh from SMPL-X to the image? I am confused about how to implement this part. Thanks a lot!