@Abhishek-Quidich Here is a reference case, you can make further modifications according to your needs. :blush:
import cv2
from ultralytics import YOLO
import numpy as np
import os
import torch

def show_mask_track(annotation, color_dict):
    # Sort masks by area so colors are assigned by area rank, which keeps
    # the colors roughly stable across frames, then sum the colorized masks
    # into a single overlay image.
    num_masks = len(annotation)
    areas = torch.sum(annotation, dim=(1, 2))
    sorted_indices = torch.argsort(areas, descending=False)
    annotation = annotation[sorted_indices]
    colored_masks = annotation[..., None] * color_dict[:num_masks, None, None, :] * 255.0
    result = np.sum(colored_masks.cpu().numpy(), axis=0)
    return result.astype(np.uint8)

max_det = 300
video_path = 'your_video_path'
cap = cv2.VideoCapture(video_path)
model = YOLO("your_model.pt")
save_path = 'your_save_dir' + os.path.split(video_path)[-1][:-4]
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if not os.path.exists(save_path):
    os.makedirs(save_path)

# Read one frame up front to get the output dimensions.
ret, frame = cap.read()
h, w, _ = frame.shape
video = cv2.VideoWriter(save_path + '/result.mp4', cv2.VideoWriter_fourcc(*'mp4v'), 30, (w, h))
# One fixed random color per possible detection, reused for every frame.
color_dict = torch.rand(max_det, 3, device=device)

while True:
    ret, frame = cap.read()
    if not ret:  # check ret before touching frame, which is None at end of video
        break
    results = model(frame, device=device, retina_masks=True, iou=0.7, conf=0.25, imgsz=1024, max_det=max_det)
    masks = results[0].masks
    if masks is not None:  # skip the overlay when nothing is detected
        overlay = show_mask_track(masks.data, color_dict)
        frame = cv2.addWeighted(frame, 1, overlay, 0.7, 0)
    video.write(frame)

cap.release()
video.release()
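One detail worth flagging in the snippet above: the output is written at a hardcoded 30 FPS, so if your source runs at a different rate the result will play back too fast or too slow. A minimal tweak, as a sketch of my own rather than part of the original answer (note that some OpenCV backends report 0 for the FPS, hence the fallback):

# Sketch: use the source FPS instead of hardcoding 30.
fps = cap.get(cv2.CAP_PROP_FPS)
if not fps:  # some backends return 0 when the rate cannot be determined
    fps = 30
video = cv2.VideoWriter(save_path + '/result.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))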
Hi @Abhishek-Quidich ,
I noticed that the issue you reported seems to be resolved based on my last response.
I would like to close this issue for now to keep the issue tracker organized. However, if the problem persists or if you have any further questions, please feel free to comment here or open a new issue. We value your input and are happy to assist further.
Thank you for your understanding!
Best Regards, Yongqi An
Is it possible to process an RTSP (Real Time Streaming Protocol) stream with the code described above? If you could advise, I would appreciate your help.
Best Regards, mk622
@mk622 Does this work for you? Just replace the cap in the code above; nothing else needs to be modified.
import cv2
stream_url = "rtsp://your_rtsp_stream_url"
cap = cv2.VideoCapture(stream_url)
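One small addition I would suggest (not part of the snippet above): if the URL, credentials, or network are wrong, cv2.VideoCapture fails silently and cap.read() simply returns ret=False, so a quick sanity check fails fast instead:

if not cap.isOpened():  # e.g. bad credentials or the camera is unreachable
    raise RuntimeError("Could not open RTSP stream: " + stream_url)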
Thanks for your advice. I was able to save the RTSP stream to MP4 with the code below:
import cv2
from ultralytics import YOLO
import numpy as np
import os
import torch

user_id = "user_id"
user_pw = "user_pw"
host = "host"
stream_url = f"rtsp://{user_id}:{user_pw}@{host}/MediaInput/h264"
cap = cv2.VideoCapture(stream_url)
save_path = './output'

def show_mask_track(annotation, color_dict):
    # Same helper as above: color the masks by area rank and sum them into one overlay.
    num_masks = len(annotation)
    areas = torch.sum(annotation, dim=(1, 2))
    sorted_indices = torch.argsort(areas, descending=False)
    annotation = annotation[sorted_indices]
    colored_masks = annotation[..., None] * color_dict[:num_masks, None, None, :] * 255.0
    result = np.sum(colored_masks.cpu().numpy(), axis=0)
    return result.astype(np.uint8)

max_det = 300
model = YOLO("./weights/FastSAM-x.pt")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if not os.path.exists(save_path):
    os.makedirs(save_path)

ret, frame = cap.read()
h, w, _ = frame.shape
video = cv2.VideoWriter(save_path + '/result.mp4', cv2.VideoWriter_fourcc(*'mp4v'), 30, (w, h))
color_dict = torch.rand(max_det, 3, device=device)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    results = model(frame, device=device, retina_masks=True, iou=0.43, conf=0.25, imgsz=1024, max_det=max_det)
    masks = results[0].masks.data
    mask = show_mask_track(masks, color_dict)
    frame = cv2.addWeighted(frame, 1, mask, 0.3, 0)
    video.write(frame)

cap.release()
video.release()
But what I really want to achieve is to display the RTSP video in real time. (*True real time is difficult because of the segmentation overhead, but I would like each frame to be displayed continuously as soon as it has been processed.) Sorry for straying from the essence of the original question, but I would be glad if you could help me.
Best Regards, mk622
It would be a pleasure to help you. May I ask whether you simply want to run inference and display the output for each frame, or whether you want to track objects across the video? The former is easy to implement and I can help you now, although the visualization colors change randomly from frame to frame; the latter we will also release soon (FastSAM for Tracking).
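For the per-frame variant, here is a minimal sketch adapting the loop above (my own adaptation, not an official FastSAM example): replace video.write(frame) with cv2.imshow, and poll cv2.waitKey so the window stays responsive. The display will lag behind true real time by however long inference takes on each frame.

# Sketch: show each processed frame as soon as it is ready.
# Reuses cap, model, device, max_det, color_dict, and show_mask_track from above.
while True:
    ret, frame = cap.read()
    if not ret:
        break
    results = model(frame, device=device, retina_masks=True, iou=0.43, conf=0.25, imgsz=1024, max_det=max_det)
    masks = results[0].masks
    if masks is not None:  # skip the overlay when nothing was detected
        overlay = show_mask_track(masks.data, color_dict)
        frame = cv2.addWeighted(frame, 1, overlay, 0.3, 0)
    cv2.imshow("FastSAM RTSP", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to stop
        break
cap.release()
cv2.destroyAllWindows()

Note that cv2.waitKey(1) is what actually pumps the GUI event loop; without it the window never refreshes.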