Issue with process_video.py

Search before asking

[X] I have searched the Pytorch-Wildlife issues and found no similar bug report.

Bug

Hi all,

I am trying to run MegaDetector in Pytorch-Wildlife to detect animals in videos. I have been following the online notebook provided here.

However, following the steps in this notebook results in the following error:

I realize that the classification step is not appropriate for my type of data, since I am not working with possums, but for now I am just testing the functionality.

Environment

No response

Minimal Reproducible Example

My code looks as follows:

```python
from PIL import Image
import numpy as np
import supervision as sv
import torch
from PytorchWildlife.models import detection as pw_detection
from PytorchWildlife.models import classification as pw_classification
from PytorchWildlife.data import transforms as pw_trans
from PytorchWildlife import utils as pw_utils
import os

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
SOURCE_VIDEO_PATH = "/mnt/path_here/Videos/brocket_deer/58386bb1-3b61-4c5c-8d26-a8505e68e827.mp4"
TARGET_VIDEO_PATH = "/mnt/path_here/Videos/brocket_deer/58386bb1-3b61-4c5c-8d26-a8505e68e827_processed.mp4"
detection_model = pw_detection.MegaDetectorV5(device=DEVICE, pretrained=True)
classification_model = pw_classification.AI4GOpossum(device=DEVICE, pretrained=True)

trans_det = pw_trans.MegaDetector_v5_Transform(target_size=detection_model.IMAGE_SIZE,
                                               stride=detection_model.STRIDE)
trans_clf = pw_trans.Classification_Inference_Transform(target_size=224)

box_annotator = sv.BoxAnnotator(thickness=4)

def callback(frame: np.ndarray, index: int) -> np.ndarray:
    results_det = detection_model.single_image_detection(trans_det(frame), frame.shape, index)
    labels = []
    for xyxy in results_det["detections"].xyxy:
        cropped_image = sv.crop_image(image=frame, xyxy=xyxy)
        results_clf = classification_model.single_image_classification(
            trans_clf(Image.fromarray(cropped_image)))
        labels.append("{} {:.2f}".format(results_clf["prediction"], results_clf["confidence"]))
    annotated_frame = box_annotator.annotate(scene=frame,
                                             detections=results_det["detections"],
                                             labels=labels)
    return annotated_frame

pw_utils.process_video(source_path=SOURCE_VIDEO_PATH, target_path=TARGET_VIDEO_PATH,
                       callback=callback, target_fps=5)
```
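Since the classification step is only there to test the functionality, a detection-only callback might help narrow down whether the error comes from the detector, the classifier, or the video processing itself. The following is a minimal sketch, not something taken from the notebook: `make_detection_only_callback` is a hypothetical helper name, and it assumes that the `detections` object returned by `single_image_detection` exposes a `confidence` sequence (as `sv.Detections` does) and that `annotate` accepts `labels` exactly as in the code above.

```python
def make_detection_only_callback(detection_model, transform, annotator):
    """Build a per-frame callback that skips the classification step.

    Hypothetical helper: the three arguments correspond to detection_model,
    trans_det and box_annotator from the code above.
    """
    def callback(frame, index):
        # Same detection call as in the original callback.
        results_det = detection_model.single_image_detection(
            transform(frame), frame.shape, index)
        detections = results_det["detections"]
        # Assumption: one confidence score per detected box.
        labels = ["{:.2f}".format(conf) for conf in detections.confidence]
        # Assumption: annotate() takes labels, as in the original code.
        return annotator.annotate(scene=frame, detections=detections, labels=labels)
    return callback

# Intended usage with the objects defined above (not run here):
# callback = make_detection_only_callback(detection_model, trans_det, box_annotator)
# pw_utils.process_video(source_path=SOURCE_VIDEO_PATH, target_path=TARGET_VIDEO_PATH,
#                        callback=callback, target_fps=5)
```

If this variant runs through, the classification step would be the more likely source of the error; if it fails in the same way, the problem probably lies in detection, annotation, or `process_video`.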

Additional

No response

Are you willing to submit a PR?

[ ] Yes I'd like to help by submitting a PR!
