Using model with multiple images or video

akashsengupta1997 / HuManiFlow

[CVPR 2023] Code repository for HuManiFlow: Ancestor-Conditioned Normalising Flows on SO(3) Manifolds for Human Pose and Shape Distribution Estimation

MIT License

What is the best way to adapt the code to run on multiple images or a video? I tried using cv2.VideoCapture on a GIF file to read each frame, but I am having difficulty combining the image and heatmaps from each frame into a single input for the model.

This is in the predict_humaniflow.py script:

for image_fname in tqdm(sorted([f for f in os.listdir(image_dir)])):
        with torch.no_grad():
            # Capture video from file
            cap = cv2.VideoCapture(os.path.join(image_dir, image_fname))
            # Capture frame-by-frame
            ret, frame = cap.read()
            frames = []
            while ret:
                # ------------------------- INPUT LOADING AND PROXY REPRESENTATION GENERATION -------------------------
                image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

                ...............

                frames.append(torch.cat([proxy_rep_img, proxy_rep_heatmaps], dim=1))
                ret, frame = cap.read()
                if not ret:
                    break

            cap.release()
            cv2.destroyAllWindows()
            proxy_rep_input = torch.cat([x.float() for x in frames], dim=1).float()  # (1, 18, img_wh, img_wh)
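
One possible cause of the difficulty: the final `torch.cat(..., dim=1)` joins the frames along the channel axis, so N frames give a tensor of shape (1, 18 * N, img_wh, img_wh) rather than the (1, 18, img_wh, img_wh) noted in the comment. A minimal sketch of the alternative, stacking frames along the batch axis (dim=0), is below. The helper `batch_frames` and the `batch_size` parameter are hypothetical names introduced for illustration, the zero tensors stand in for the real per-frame proxy representations, and the sketch assumes the model accepts inputs with a batch size greater than 1, which has not been checked against the repository.

```python
from itertools import islice

import torch


def batch_frames(per_frame_inputs, batch_size):
    """Yield (B, C, H, W) tensors built from an iterable of (1, C, H, W) tensors.

    Hypothetical helper, not part of the HuManiFlow repository.
    """
    iterator = iter(per_frame_inputs)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        # dim=0 is the batch axis: each frame stays a separate sample
        # and the channel count is unchanged.
        yield torch.cat([x.float() for x in chunk], dim=0)


# Stand-in data: 5 fake frames, each shaped like one 18-channel proxy representation.
fake_frames = [torch.zeros(1, 18, 8, 8) for _ in range(5)]
batch_shapes = [tuple(b.shape) for b in batch_frames(fake_frames, batch_size=2)]
print(batch_shapes)  # [(2, 18, 8, 8), (2, 18, 8, 8), (1, 18, 8, 8)]
```

Under these assumptions, each yielded batch could take the place of `proxy_rep_input` inside the existing `torch.no_grad()` block, with the `frames` list from the snippet passed in as `per_frame_inputs`. Processing in fixed-size chunks also keeps memory bounded for long videos.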
