Open LeKiet258 opened 1 year ago
Hi @LeKiet258 ! Thank you for your interest. Ground truth is not needed for inference; however, you may need to process your own data following the instructions here. You can simply leave the gt value as 0 if you don't have it. If you want to run inference with visualization, you can also refer to mmhuman3d, which supports SmoothNet.
Thanks for your fast reply! MMHuman3d doesn't support my model of choice yet, which is HybrIK, so I have to clone HybrIK separately to produce the output and then run SmoothNet on that output, hence my question. As for setting the gt value to 0, did you mean setting the pose and shape parameters? (In my case, I use the SMPL body representation.)
In this repo, we have not provided an inference pipeline, so you can use the evaluation pipeline for testing. Because the evaluation process needs ground truth to calculate metrics, we recommend adding ground truth during data processing. However, since you only need the inference part, you can set the ground-truth value to any placeholder (e.g., 0). You may still need to modify the code to obtain the smoothed output pose from the whole pipeline.
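To make the "placeholder ground truth" idea concrete, here is a minimal sketch of packing detector output with zeroed ground truth. The array shape (frames, joints, 3) and the dict keys `pred`/`gt` are hypothetical placeholders, not SmoothNet's actual data format; follow the repo's data-processing scripts for the real layout.

```python
import numpy as np

# Hypothetical detector (e.g., HybrIK) output: (frames, joints, 3) 3D keypoints.
rng = np.random.default_rng(0)
pred = rng.normal(size=(300, 17, 3)).astype(np.float32)

# Placeholder ground truth: all zeros, since metrics are irrelevant for pure inference.
gt = np.zeros_like(pred)

# The key names below are illustrative only; match them to the repo's data pipeline.
sample = {"pred": pred, "gt": gt}
```

With such a sample, the evaluation pipeline can run end to end; the reported metrics are meaningless, but the smoothed poses it produces along the way are the output you want.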
@LeKiet258 Could you share your repo? I am also trying SmoothNet with HybrIK without success.
@juxuan27 an inference pipeline would be very useful for people trying your work. Please consider adding one.
Hi @nick008a ,
Thanks for the kind suggestion! We will add it in March.
@ailingzengzzz thank you for the reply!! It would be even better if it could be used with something like HybrIK.
@nick008a I haven't successfully combined HybrIK with SmoothNet yet, so I swapped HybrIK for another model in order to get it running. Sorry :(
@LeKiet258 @nick008a @ailingzengzzz I have a similar question. In the 2D code provided, the dataset preparation for evaluation actually uses a middle-position datum as the normalization reference: https://github.com/cure-lab/SmoothNet/blob/be3f9ce43399aa7c6c3c5400f2d5fdcfd958cfc4/lib/dataset/h36m_dataset.py#L164
How can I do that in a real-time situation where I only have, say, 16 frames? I also noticed it normalizes with a magic number, H36M_IMageshape, which is 1000, but when keypoints are detected on arbitrary videos the image shapes can vary. Does that matter or not?
Hi @lucasjinreal ,
I'm sorry for the late reply. I checked this version and found that I did not provide a general normalization. You can use the following functions to normalize keypoints from arbitrary videos, feed the normalized keypoint positions into SmoothNet, and then denormalize the smoothed keypoints back into image coordinates.
import numpy as np
import torch

def normalize_screen_coordinates(X, w, h):
    # Normalize so that [0, w] is mapped to [-1, 1], while preserving the aspect ratio
    if X.shape[-1] == 3:  # input is a 3D pose
        X_norm = X[..., :2] / w * 2 - [1, h / w]
        # scale the depth channel by 1000
        X_out = np.concatenate((X_norm, X[..., 2:3] / 1000), -1)
    else:
        assert X.shape[-1] == 2
        X_out = X / w * 2 - [1, h / w]
    return X_out

def image_coordinates(X, w, h):
    # Reverse camera frame normalization
    if X.shape[-1] == 3:  # input is a 3D pose
        X_norm = X[..., :2].clone()  # clone to avoid modifying X in place (use .copy() for NumPy)
        X_norm[..., :1] = (X_norm[..., :1] + 1) * w / 2
        X_norm[..., 1:2] = (X_norm[..., 1:2] + h / w) * w / 2
        X_out = torch.cat([X_norm, X[..., 2:3] * 1000], -1)
    else:
        assert X.shape[-1] == 2
        X_out = (X + [1, h / w]) * w / 2
    return X_out
May I ask what method you used?
Hi, I want to run SmoothNet on an arbitrary video, but it seems like ground truth for that video is required even for the inference phase, right?