The GPU is not used when running detection with YOLOv5 #13171

Open Angelinnp opened 3 months ago

Angelinnp commented 3 months ago

[Image: WhatsApp Image 2024-07-04 at 09 58 08]

When I run the YOLOv5 detection code, it still uses the CPU, which makes detection slow: I get fps = 0.4. CUDA has been installed and activated, but the GPU on the Jetson Nano is still not used. Please explain why this happens and what the solution is. I have CUDA 10.2.300 and PyTorch 2.3.1 installed, and I use a Python 3.8.0 virtual environment. Please tell me which versions of PyTorch and CUDA suit my Python virtual environment.

[Image: WhatsApp Image 2024-07-05 at 21 29 04, CUDA version]


glenn-jocher commented 3 months ago

@Angelinnp hello,

Thank you for reaching out and providing detailed information about your issue. It looks like you're experiencing difficulties with GPU utilization on your Jetson Nano while running YOLOv5 detection.

To better assist you, could you please provide a minimal reproducible code example? This will help us understand the context and reproduce the issue on our end. You can find more information on creating a minimal reproducible example here. This step is crucial for us to investigate and provide a solution effectively.

In the meantime, here are a few steps you can take to troubleshoot the issue:

  1. Verify CUDA and PyTorch Compatibility: Ensure that your CUDA and PyTorch versions are compatible. For Jetson Nano, it is recommended to use the versions provided by NVIDIA's JetPack SDK, which ensures compatibility. You can check the compatibility matrix on the NVIDIA website.

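  2. Check GPU Availability in PyTorch: Run the following code to verify that PyTorch detects your GPU:

    import torch
    print(torch.cuda.is_available())  # True if PyTorch can use the GPU

    If torch.cuda.is_available() returns False, there might be an issue with your CUDA installation, or the installed PyTorch build may not include CUDA support.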

  3. Ensure YOLOv5 is Configured to Use GPU: When running detection, make sure to specify the --device argument to use the GPU. For example:

    python detect.py --source your_source --weights yolov5s.pt --device 0

    This command explicitly tells YOLOv5 to use the first GPU.

  4. Update YOLOv5 and Dependencies: Ensure you are using the latest version of YOLOv5 and its dependencies. You can update YOLOv5 by running:

    git pull
    pip install -r requirements.txt
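
To make steps 1 and 2 concrete, here is a minimal sketch that reports which PyTorch build is installed and whether it can see CUDA. The helper name `torch_build_info` is hypothetical (it is not part of YOLOv5 or PyTorch), and the sketch assumes nothing beyond `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()`. If `cuda_build` prints `None`, the installed wheel was compiled without CUDA, and no YOLOv5 flag can enable the GPU.

```python
import importlib


def torch_build_info():
    """Summarise the installed PyTorch build without assuming CUDA works."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        # PyTorch is missing, or its shared libraries failed to load.
        return {"installed": False, "version": None, "cuda_build": None, "cuda_available": False}
    return {
        "installed": True,
        "version": torch.__version__,
        # None means the wheel was compiled without CUDA (CPU-only build).
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in torch_build_info().items():
        print(f"{key}: {value}")
```

Run it inside the same virtual environment that you use for detection, since a different environment may hold a different PyTorch build.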

If the issue persists after trying the above steps, please share the minimal reproducible code example, and we will investigate further.

Thank you for your cooperation, and we look forward to resolving this issue with you.

Angelinnp commented 3 months ago

The following is the detect program code that I run. please help me solve this problem.

import argparse
import os
import platform
import sys
import serial
from pathlib import Path
from turtle import distance
import pynmea2

import torch
import time

FILE = Path(__file__).resolve()
ROOT = FILE.parents[0]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

from models.common import DetectMultiBackend
from utils.dataloaders import IMG_FORMATS, VID_FORMATS, LoadImages, LoadScreenshots, LoadStreams
from utils.general import (LOGGER, Profile, check_file, check_img_size, check_imshow, check_requirements,
                           colorstr, cv2, increment_path, non_max_suppression, print_args, scale_boxes,
                           strip_optimizer, set_logging, xyxy2xywh)
from utils.plots import Annotator, colors, save_one_box
from utils.torch_utils import select_device, smart_inference_mode

# GPS sensor configuration
serialport = serial.Serial(
    port="/dev/ttyTHS1",
    baudrate=9600,
    bytesize=serial.EIGHTBITS,
    parity=serial.PARITY_NONE,
    stopbits=serial.STOPBITS_ONE,
)
time.sleep(1)

# GPS information parsing
def parse_nmea(sentence):
    latitude = longitude = altitude = altitude_units = None

    if sentence.startswith('$GPGGA'):
        msg = pynmea2.parse(sentence)
        latitude = msg.latitude
        longitude = msg.longitude
        altitude = msg.altitude
        altitude_units = msg.altitude_units
    elif sentence.startswith('$GPRMC'):
        msg = pynmea2.parse(sentence)
        latitude = msg.latitude
        longitude = msg.longitude

    return latitude, longitude, altitude, altitude_units

latitude = None
longitude = None

@smart_inference_mode()
def run(
        weights=ROOT / 'yolov5s.pt',  # model path or triton URL
        source=ROOT / 'data/images',  # file/dir/URL/glob/screen/0(webcam)
        data=ROOT / 'data/coco128.yaml',  # dataset.yaml path
        imgsz=(640, 640),  # inference size (height, width)
        conf_thres=0.1,  # confidence threshold
        iou_thres=0.9,  # NMS IOU threshold
        max_det=1000,  # maximum detections per image
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        view_img=False,  # show results
        save_txt=False,  # save results to *.txt
        save_conf=False,  # save confidences in --save-txt labels
        save_crop=False,  # save cropped prediction boxes
        nosave=False,  # do not save images/videos
        classes=None,  # filter by class: --class 0, or --class 0 2 3
        agnostic_nms=False,  # class-agnostic NMS
        augment=False,  # augmented inference
        visualize=False,  # visualize features
        update=False,  # update all models
        project=ROOT / 'runs/detect',  # save results to project/name
        name='exp',  # save results to project/name
        exist_ok=False,  # existing project/name ok, do not increment
        line_thickness=3,  # bounding box thickness (pixels)
        hide_labels=False,  # hide labels
        hide_conf=False,  # hide confidences
        half=False,  # use FP16 half-precision inference
        dnn=False,  # use OpenCV DNN for ONNX inference
        vid_stride=1,  # video frame-rate stride
):
    source = str(source)
    save_img = not nosave and not source.endswith('.txt')  # save inference images
    is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
    is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
    webcam = source.isnumeric() or source.endswith('.txt') or (is_url and not is_file)
    screenshot = source.lower().startswith('screen')
    if is_url and is_file:
        source = check_file(source)  # download

    # Directories
    save_dir = increment_path(Path(project) / name, exist_ok=exist_ok)  # increment run
    (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

    # Initialize
    device = select_device('0' if torch.cuda.is_available() else 'cpu')
    half &= device.type != 'cpu'  # half precision is only supported on CUDA

    # Load model
    model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
    stride, names, pt = model.stride, model.names, model.pt
    imgsz = check_img_size(imgsz, s=stride)  # check image size

    # Dataloader
    bs = 1  # batch_size
    if webcam:
        view_img = check_imshow(warn=True)
        dataset = LoadStreams(source, img_size=imgsz, stride=stride, auto=pt, vid_stride=vid_stride)
        bs = len(dataset)
    elif screenshot:
        dataset = LoadScreenshots(source, img_size=imgsz, stride=stride, auto=pt)
    else:
        dataset = LoadImages(source, img_size=imgsz, stride=stride, auto=pt, vid_stride=vid_stride)
    vid_path, vid_writer = [None] * bs, [None] * bs

    # Run inference
    model.warmup(imgsz=(1 if pt or model.triton else bs, 3, *imgsz))  # warmup
    seen, windows, dt = 0, [], (Profile(), Profile(), Profile())
    for path, im, im0s, vid_cap, s in dataset:
        with dt[0]:
            im = torch.from_numpy(im).to(model.device)
            im = im.half() if model.fp16 else im.float()  # uint8 to fp16/32
            im /= 255  # 0 - 255 to 0.0 - 1.0
            if len(im.shape) == 3:
                im = im[None]  # expand for batch dim

        # Inference
        with dt[1]:
            visualize = increment_path(save_dir / Path(path).stem, mkdir=True) if visualize else False
            pred = model(im, augment=augment, visualize=visualize)

        # NMS
        with dt[2]:
            pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)

        # Second-stage classifier (optional)
        # pred = utils.general.apply_classifier(pred, classifier_model, im, im0s)

        # Process predictions
        for i, det in enumerate(pred):  # per image
            seen += 1
            if webcam:  # batch_size >= 1
                p, im0, frame = path[i], im0s[i].copy(), dataset.count
                s += f'{i}: '
            else:
                p, im0, frame = path, im0s.copy(), getattr(dataset, 'frame', 0)

            p = Path(p)  # to Path
            save_path = str(save_dir / p.name)  # im.jpg
            txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}')  # im.txt
            s += '%gx%g ' % im.shape[2:]  # print string
            gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
            imc = im0.copy() if save_crop else im0  # for save_crop
            annotator = Annotator(im0, line_width=line_thickness, example=str(names))
            if len(det):
                # Rescale boxes from img_size to im0 size
                det[:, :4] = scale_boxes(im.shape[2:], det[:, :4], im0.shape).round()

                # Print results
                for c in det[:, -1].unique():
                    n = (det[:, -1] == c).sum()  # detections per class
                    s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string
                    detnum = det.cpu().numpy()
                    len_det = len(detnum)
                    for count in range(len_det):
                        # Read one NMEA sentence from the GPS if data is waiting
                        if serialport.in_waiting:
                            data = serialport.readline().decode('utf-8').strip()
                            latitude, longitude, altitude, altitude_units = parse_nmea(data)
                            if latitude is not None and longitude is not None:
                                print(f"Latitude: {latitude:.6f}, Longitude: {longitude:.6f}")
                            elif altitude is not None:
                                print(f"Altitude: {altitude:.2f} {altitude_units if altitude_units else ''}")

                        #print("conf=%d ; xmin=%d ; xmax=%d ; ymin=%d ; ymax=%d ; deltax = %d ; deltay = %d ; dist=%d" % (detconf, detxmin, detxmax, detymin, detymax, deltax, deltay, dist2))
                        #print("Distance = %d" % (dist2))
                        #print("conf=%d ; xmin=%d ; xmax=%d ; ymin=%d ; ymax=%d" % (detconf, detxmin, detxmax, detymin, detymax))

                        #detdist = calcDist(detClass, detw)
                        #print("distance : ", detdist)
                        #for z in range(0,len(detymintemp)):
                            #for y in range(z+1,len(detymintemp)):
                                    #temp = detymintemp[z]
                                    #detymintemp[z] = detymintemp[y]
                                    #detymintemp[y] = temp

                        #print latitude

                # Write results
                for *xyxy, conf, cls in reversed(det):
                    if save_txt:  # Write to file
                        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                        line = (cls, *xywh, conf) if save_conf else (cls, *xywh)  # label format
                        with open(f'{txt_path}.txt', 'a') as f:
                            f.write(('%g ' * len(line)).rstrip() % line + '\n')

                    if save_img or save_crop or view_img:  # Add bbox to image
                        c = int(cls)  # integer class
                        label = None if hide_labels else (names[c] if hide_conf else f'{names[c]} {conf:.2f}')
                        annotator.box_label(xyxy, label, color=colors(c, True))
                    if save_crop:
                        save_one_box(xyxy, imc, file=save_dir / 'crops' / names[c] / f'{p.stem}.jpg', BGR=True)

            # Stream results
            im0 = annotator.result()
            # Added to show FPS in the top-right corner
            fps = 1 / dt[1].dt  # compute FPS from the inference time
            cv2.putText(im0, f"FPS: {fps:.1f}", (im0.shape[1] - 150, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
            if view_img:
                if platform.system() == 'Linux' and p not in windows:
                    cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO)  # cv2.WINDOW_NORMAL
                    cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
                cv2.imshow(str(p), im0)
                cv2.waitKey(1)  # 1 millisecond

            # Save results (image with detections)
            if save_img:
                if dataset.mode == 'image':
                    cv2.imwrite(save_path, im0)
                else:  # 'video' or 'stream'
                    if vid_path[i] != save_path:  # new video
                        vid_path[i] = save_path
                        if isinstance(vid_writer[i], cv2.VideoWriter):
                            vid_writer[i].release()  # release previous video writer
                        if vid_cap:  # video
                            fps = vid_cap.get(cv2.CAP_PROP_FPS)
                            w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                            h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
                        else:  # stream
                            fps, w, h = 30, im0.shape[1], im0.shape[0]
                        save_path = str(Path(save_path).with_suffix('.mp4'))  # force *.mp4 suffix on results videos
                        vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))

        # Print time (inference-only)
        LOGGER.info(f"{s}{'' if len(det) else '(no detections), '}{dt[1].dt * 1E3:.1f}ms")

    # Print results
    t = tuple(x.t / seen * 1E3 for x in dt)  # speeds per image
    LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}' % t)
    if save_txt or save_img:
        s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
        LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
    if update:
        strip_optimizer(weights[0])  # update model (to fix SourceChangeWarning)

def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s.pt', help='model path or triton URL')
    parser.add_argument('--source', type=str, default=ROOT / 'data/images', help='file/dir/URL/glob/screen/0(webcam)')
    parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='(optional) dataset.yaml path')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
    parser.add_argument('--conf-thres', type=float, default=0.2, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.1, help='NMS IoU threshold')
    parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--view-img', action='store_true', help='show results')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes')
    parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
    parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
    parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--visualize', action='store_true', help='visualize features')
    parser.add_argument('--update', action='store_true', help='update all models')
    parser.add_argument('--project', default=ROOT / 'runs/detect', help='save results to project/name')
    parser.add_argument('--name', default='exp', help='save results to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--line-thickness', default=3, type=int, help='bounding box thickness (pixels)')
    parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels')
    parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
    parser.add_argument('--vid-stride', type=int, default=1, help='video frame-rate stride')
    opt = parser.parse_args()
    opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1  # expand
    print_args(vars(opt))
    return opt

def calcDist(myClass, myWidth):
    myF = 700
    if(myClass == 0):
        myDist = int(myF * myWidth / 50)
    elif(myClass == 1):
        myDist = int(myF * myWidth / 90)
    else:
        myDist = 0
    return myDist

def sendcalc232(category, confidence, x, y, w, h, dist):
    toWrite = "$" + "," + str(category) + "," + str(confidence) + "," + str(x) + "," + str(y) + "," + str(w) + "," + str(h) + "," + str(dist)
    value = write_read(toWrite)
    print(value)

def write_read(x):
    # Note: send_data is not defined anywhere in the posted script
    send_data.write(bytes(x, 'utf-8'))
    # data = send_data.readline()
    # return data

def main(opt):
    check_requirements(exclude=('tensorboard', 'thop'))
    run(**vars(opt))

if __name__ == "__main__":
    opt = parse_opt()
    main(opt)

glenn-jocher commented 3 months ago

Hello @Angelinnp,

Thank you for sharing your detection code. I see that you've integrated GPS sensor data and are running YOLOv5 on a Jetson Nano. Let's address the issue of the GPU not being utilized.

Steps to Ensure GPU Utilization

  1. Verify CUDA and PyTorch Compatibility: Ensure that your CUDA and PyTorch versions are compatible with each other and with your Jetson Nano. For Jetson Nano, it's recommended to use the versions provided by NVIDIA's JetPack SDK. You can find the compatibility matrix on the NVIDIA website.

  2. Check GPU Availability in PyTorch: Run the following code snippet to verify that PyTorch detects your GPU:

    import torch
    print(torch.cuda.is_available())  # Should return True
    print(torch.cuda.current_device())  # Should return the GPU device index
    print(torch.cuda.get_device_name(0))  # Should return the name of your GPU

    If torch.cuda.is_available() returns False, there might be an issue with your CUDA installation.

  3. Ensure YOLOv5 is Configured to Use GPU: In your run function, you are already using select_device('0' if torch.cuda.is_available() else 'cpu'). Ensure that torch.cuda.is_available() returns True as mentioned above.

  4. Update YOLOv5 and Dependencies: Make sure you are using the latest version of YOLOv5 and its dependencies. You can update YOLOv5 by running:

    git pull
    pip install -r requirements.txt

  5. Specify the CUDA Device Explicitly: When running the detection script, make sure to specify the --device argument to use the GPU. For example:

    python detect.py --source your_source --weights yolov5s.pt --device 0
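
One detail worth noting about steps 3 and 5: the posted run() passes a hard-coded '0' to select_device and never reads its own device parameter, so --device has no effect in that script. Below is a minimal sketch of one way to honor the argument while still falling back to the CPU. The helper name `pick_device` is hypothetical and not part of YOLOv5; its result would be passed to select_device in place of the hard-coded expression.

```python
def pick_device(requested: str, cuda_available: bool) -> str:
    """Choose the device string to hand to select_device().

    requested: the --device value ('' means auto, 'cpu', or a CUDA index such as '0').
    cuda_available: the result of torch.cuda.is_available().
    """
    requested = requested.strip().lower()
    if requested == "cpu":
        return "cpu"
    if not cuda_available:
        # Make the fallback visible instead of silently running on the CPU.
        print("WARNING: CUDA is not available, falling back to CPU")
        return "cpu"
    # An empty request means "use the first GPU".
    return requested or "0"
```

Inside run() this could be used as `device = select_device(pick_device(device, torch.cuda.is_available()))`, assuming the rest of the function stays unchanged. The printed warning makes a CPU fallback obvious in the log, which is the situation described in this issue.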

Example Code Adjustments

Here are some adjustments to ensure GPU utilization:

  1. Ensure select_device is correctly set:

    device = select_device('0' if torch.cuda.is_available() else 'cpu')

  2. Run the script with the --device argument:

    python detect.py --source your_source --weights yolov5s.pt --device 0

Additional Debugging

If the above steps do not resolve the issue, please provide the output of the following commands:

  1. python -c "import torch; print(torch.cuda.is_available())"
  2. python -c "import torch; print(torch.cuda.get_device_name(0))"

This will help us understand if PyTorch is correctly detecting your GPU.
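
Note that the second command typically raises an exception instead of printing a name when no CUDA device is visible. As a rough alternative, the sketch below guards the device query so it always produces output you can paste here. The helper name `gpu_summary` is hypothetical; it only uses `torch.cuda.is_available()`, `torch.cuda.device_count()` and `torch.cuda.get_device_name()`.

```python
def gpu_summary():
    """Return the names of all CUDA devices PyTorch can see (empty list if none)."""
    try:
        import torch
    except (ImportError, OSError):
        # PyTorch is missing or failed to load its libraries.
        return []
    if not torch.cuda.is_available():
        return []
    return [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]


if __name__ == "__main__":
    names = gpu_summary()
    print("CUDA devices:", names if names else "none detected")
```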

Thank you for your cooperation, and we look forward to resolving this issue with you. If you have any further questions or need additional assistance, please feel free to ask.

