Video input - Githubissues

rakage commented 4 months ago

Hello, I wonder whether this can be used with video input? Really looking forward to it!

mantasu commented 4 months ago

Hey, video input is not supported - you'd have to manually write a script that processes video frame by frame. But it's a great suggestion! I'm even thinking of adding a GUI in the future.

rakage commented 4 months ago

Hi,

I made this using your models: I load the model and then use OpenCV to run it on the webcam feed frame by frame.

import torch  # required for torch.device, torch.load, torch.no_grad and torch.sigmoid below
from torchvision import models, transforms
from PIL import Image
import numpy as np
import cv2

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = models.segmentation.lraspp_mobilenet_v3_large(pretrained=False, num_classes=1)
model.load_state_dict(torch.load('segmentation_full_lraspp_mobilenet_v3_large.pth', map_location=device))
model.eval()  # Set the model to evaluation mode
model.to(device)  # Move the model to the appropriate device

preprocess = transforms.Compose([
    transforms.Resize((512, 512)),  # Resize the image to the required input size
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    ),
])

cap = cv2.VideoCapture(0)  # Use the default webcam (index 0)

if not cap.isOpened():
    print("Error: Unable to open webcam.")
    exit()

cv2.namedWindow('Segmentation', cv2.WINDOW_NORMAL)

while True:
    ret, frame = cap.read()
    if not ret:
        print("Error: Unable to read frame from webcam.")
        break

    frame_pil = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

    input_tensor = preprocess(frame_pil)
    input_batch = input_tensor.unsqueeze(0).to(device)  # Add a batch dimension and move to device

    with torch.no_grad():
        output = model(input_batch)['out']

    segmentation_mask = torch.sigmoid(output).squeeze().cpu().numpy()
    segmentation_mask = (segmentation_mask > 0.9).astype(int)  # Convert to binary mask
    segmentation_mask = Image.fromarray(segmentation_mask.astype(np.uint8) * 255)
    segmentation_mask = segmentation_mask.resize(frame_pil.size, Image.NEAREST)
    segmentation_mask = np.array(segmentation_mask)

    segmented_frame = np.array(frame_pil)
    segmented_frame[segmentation_mask == 255] = [255, 0, 0]  # Set red color where segmentation mask is positive

    cv2.imshow('Segmentation', cv2.cvtColor(segmented_frame, cv2.COLOR_RGB2BGR))

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
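
The script above reads from a webcam. For a video file, the same loop could be adapted roughly as sketched below. This is only a sketch under assumptions: `paint_mask` and `process_video` are hypothetical helper names, the `mp4v` codec may not be available in every OpenCV build, and `frame_fn` stands for whatever per-frame step you use (for example, the model inference and masking from the script above) and must return an RGB `uint8` frame of the same size as its input.

```python
import numpy as np


def paint_mask(frame_rgb, mask, color=(255, 0, 0)):
    """Return a copy of frame_rgb with pixels where mask is nonzero set to color."""
    out = frame_rgb.copy()
    out[mask.astype(bool)] = color
    return out


def process_video(src_path, dst_path, frame_fn):
    """Read src_path frame by frame, apply frame_fn (RGB in, RGB out), write dst_path.

    Returns the number of frames written.
    """
    import cv2  # imported lazily so paint_mask stays usable without OpenCV installed

    cap = cv2.VideoCapture(src_path)
    if not cap.isOpened():
        raise IOError(f"Unable to open video: {src_path}")

    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if the FPS is not reported
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # assumption: this codec is available
    writer = cv2.VideoWriter(dst_path, fourcc, fps, (width, height))

    count = 0
    try:
        while True:
            ret, frame_bgr = cap.read()
            if not ret:  # end of file or read error
                break
            frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
            result_rgb = frame_fn(frame_rgb)
            writer.write(cv2.cvtColor(result_rgb, cv2.COLOR_RGB2BGR))
            count += 1
    finally:
        # Release both handles even if frame_fn raises
        cap.release()
        writer.release()
    return count
```

Usage would look like `process_video("input.mp4", "output.mp4", my_frame_fn)`, where the file names and `my_frame_fn` are placeholders.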

mantasu commented 4 months ago

Nice, a neat script for live video processing. Thanks for sharing!

YPFHXH commented 3 months ago

There is a problem with the import:

Traceback (most recent call last):
  File "D:\yolo\glasses-detector-1.0.1\scripts\run.py", line 20, in <module>
    from glasses_detector import GlassesClassifier, GlassesDetector, GlassesSegmenter
ImportError: cannot import name 'GlassesClassifier' from 'glasses_detector' (C:\Users\YPF\anaconda3\envs\glass1\Lib\site-packages\glasses_detector\__init__.py). Did you mean: 'AnyglassesClassifier'?

How can I solve this?

mantasu commented 3 months ago

It seems the import is resolving to a previous version of the package. You have to use Python 3.12 and the latest package version, v1.0.1. Also make sure the older version is uninstalled from your environment.
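
One way to check which version is actually installed in the active environment is sketched below, using only the standard library. The helper name `check_environment` is hypothetical, and the distribution name `glasses-detector` is an assumption based on the repository name; the Python 3.12 requirement is taken from the comment above.

```python
import sys
from importlib import metadata


def check_environment(package="glasses-detector", min_python=(3, 12)):
    """Return (python_ok, installed_version); the version is None if not installed."""
    python_ok = sys.version_info[:2] >= min_python
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None
    return python_ok, installed


python_ok, installed = check_environment()
print(f"Python {sys.version_info.major}.{sys.version_info.minor}, meets 3.12 requirement: {python_ok}")
print(f"Installed glasses-detector version: {installed}")
```

If the reported version is not 1.0.1, uninstalling the package and reinstalling it in that same environment should make the import pick up the newer classes.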

mantasu / glasses-detector

Video input #11