magisystem0408 / yolov5-DeepSort-RealSenseD435i

realsense+yolov5+deepsense D435i
GNU General Public License v3.0

RuntimeError: The expanded size of the tensor (1) must match the existing size (80) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 48, 80, 2] #3

Closed ChenJiajun0011 closed 2 years ago

ChenJiajun0011 commented 2 years ago

Hi, I am trying to run realsence_track.py and I get the error below. Would you please take a look and share your solution? Thanks.

    Loading weights from deep_sort_pytorch/deep_sort/deep/checkpoint/ckpt.t7... Done!
    YOLOv5 🚀 v6.0-206-gc43439a torch 1.10.1+cu102 CPU

    Fusing layers...
    Model Summary: 270 layers, 7235389 parameters, 0 gradients
    Traceback (most recent call last):
      File "realsence_track.py", line 207, in <module>
        test = realsence()
      File "realsence_track.py", line 86, in realsence
        pred = model(img, augment=False)[0]
      File "/home/chan/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/chan/catkin_ws/src/personalize_hand/yolov5/yolov5/models/yolo.py", line 126, in forward
        return self._forward_once(x, profile, visualize)  # single-scale inference, train
      File "/home/chan/catkin_ws/src/personalize_hand/yolov5/yolov5/models/yolo.py", line 149, in _forward_once
        x = m(x)  # run
      File "/home/chan/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/chan/catkin_ws/src/personalize_hand/yolov5/yolov5/models/yolo.py", line 58, in forward
        self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
    RuntimeError: The expanded size of the tensor (1) must match the existing size (80) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 48, 80, 2]
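The message itself follows from broadcasting rules: a tensor of shape [3, 48, 80, 2] is being written into a slot of shape [1, 3, 1, 1, 2], and a dimension can only be expanded when it has size 1. The sketch below re-implements that shape check in plain Python to show which dimension fails. The helper name `first_expand_mismatch` is hypothetical and is not a PyTorch function, and the sketch only explains the message, not why the two shapes disagree in this setup.

```python
def first_expand_mismatch(source, target):
    """Return the target dimension index where `source` cannot be expanded
    to `target`, or None if the expansion is valid.

    Shapes are aligned from the trailing dimension; a source dimension is
    compatible when it equals the target dimension or is 1.
    """
    if len(source) > len(target):
        raise ValueError("source has more dimensions than target")
    offset = len(target) - len(source)
    # Walk from the last dimension backwards and stop at the first conflict.
    for dim in range(len(target) - 1, offset - 1, -1):
        src = source[dim - offset]
        if src != 1 and src != target[dim]:
            return dim
    return None


# Shapes copied from the error message in this issue.
print(first_expand_mismatch([3, 48, 80, 2], [1, 3, 1, 1, 2]))  # prints 3
```

The result matches the "non-singleton dimension 3" in the traceback: the source has 80 entries there while the target has 1.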

ChenJiajun0011 commented 2 years ago

Hello, is there a solution?

magisystem0408 commented 2 years ago

This repository contains code that connects yolov5, realsense and deepsort. Is that suitable for your purpose?

I can address this later by writing the installation instructions in the documentation.

ChenJiajun0011 commented 2 years ago

Hi. What are you tracking using realsence_track.py? And what about deepsort_webCam.py? Person or hand? I want to use yolov5, realsense D435i and deepsort to track people in real time. Thanks.

magisystem0408 commented 2 years ago

This project takes the camera data from the RealSense and uses yolov5, deepsort and mediapipe to track and identify people.

It is also possible to rewrite so that people can be tracked in real time using only yolov5, realsense D435i, and deepsort.

Please wait a while; I will create a new Python file and push it to GitHub!

ChenJiajun0011 commented 2 years ago

Will you base the rewrite on deepsort_webCam.py or realsence_track.py?

magisystem0408 commented 2 years ago

I will base the rewrite on realsence.py. Left as it is, it contains code that is not needed for this purpose, so I will delete that.

ChenJiajun0011 commented 2 years ago

Great, would you please message me when you finish this real-time person tracking and identification through the D435i? Thanks. chenjiajun1011@gmail.com

magisystem0408 commented 2 years ago

image

Actually, it can detect objects other than people. Do you want to restrict it to people?

ChenJiajun0011 commented 2 years ago

Yes, I only focus on identifying and tracking pedestrians, and I want to output their distance in each frame. That means there is one target person among other pedestrians. I need to recognize this target person in each frame and output their distance.
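One way to express this requirement in code is to remember the tracker ID of the target person and look that ID up among the tracked detections of every frame. The following is a minimal sketch, assuming each detection is available as a record with a track ID, a class label and a distance in metres. The `Track` type and the `find_target` helper are hypothetical names and not part of this repository.

```python
from typing import NamedTuple, Optional, Sequence


class Track(NamedTuple):
    track_id: int      # ID assigned by the tracker
    label: str         # class name, e.g. "person"
    distance_m: float  # distance from the camera in metres


def find_target(tracks: Sequence[Track], target_id: int) -> Optional[Track]:
    """Return the tracked person with the given ID, or None if not visible."""
    for track in tracks:
        if track.label == "person" and track.track_id == target_id:
            return track
    return None


# Example frame: two pedestrians and one non-person object.
frame_tracks = [Track(1, "person", 2.4), Track(2, "person", 3.1), Track(3, "chair", 1.0)]
target = find_target(frame_tracks, target_id=2)
if target is not None:
    print(f"target {target.track_id} at {target.distance_m:.2f} m")
```

A tracker may assign a new ID when a person is lost and reappears, so a real follower would likely need some rule for re-acquiring the target; that part is outside this sketch.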

magisystem0408 commented 2 years ago

I wrote the environment setup and execution procedure in the documentation. This time, the script to run is realsense_track_person.py.

ChenJiajun0011 commented 2 years ago

Hi, the same problem still occurs. Do you have any idea what causes it? I have already installed the requirements.

    Loading weights from deep_sort_pytorch/deep_sort/deep/checkpoint/ckpt.t7... Done!
    YOLOv5 🚀 v6.0-206-gc43439a torch 1.10.1+cu102 CPU

    Fusing layers...
    Model Summary: 270 layers, 7235389 parameters, 0 gradients
    Traceback (most recent call last):
      File "testscript.py", line 150, in <module>
        test = realsence()
      File "testscript.py", line 81, in realsence
        pred = model(img, augment=False)[0]
      File "/home/chan/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/chan/catkin_ws/src/personalize_hand/yolov5/yolov5/models/yolo.py", line 126, in forward
        return self._forward_once(x, profile, visualize)  # single-scale inference, train
      File "/home/chan/catkin_ws/src/personalize_hand/yolov5/yolov5/models/yolo.py", line 149, in _forward_once
        x = m(x)  # run
      File "/home/chan/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/chan/catkin_ws/src/personalize_hand/yolov5/yolov5/models/yolo.py", line 58, in forward
        self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
    RuntimeError: The expanded size of the tensor (1) must match the existing size (80) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 48, 80, 2]

magisystem0408 commented 2 years ago

Which version of Python are you using? Also, this code only runs on Windows.

ChenJiajun0011 commented 2 years ago

Python 3.6, and I am using Linux... Do you have any idea how to make it work on Linux?

ChenJiajun0011 commented 2 years ago

That is because I am also using ROS. I use the position of the target person in each frame to control the robot so that it moves and follows the target person.

magisystem0408 commented 2 years ago

To begin with, I think yolov5 does not work with Python 3.6, and I suspect the RealSense SDK does not support Linux.

ChenJiajun0011 commented 2 years ago

Would you have time to help me get your project running on Linux? I am quite new to programming...

magisystem0408 commented 2 years ago

I'm happy to help. Which Linux distribution do you use?

ChenJiajun0011 commented 2 years ago

That's great. I am using Ubuntu 18.04.6 LTS.

magisystem0408 commented 2 years ago

Please upgrade your Python version to 3.9, and set up the environment in pyvenv or a conda environment.

ChenJiajun0011 commented 2 years ago

I would like to know why we need this environment. Is it because of yolov5 or deepsort?

ChenJiajun0011 commented 2 years ago

I have upgraded to Python 3.9 and also downloaded Anaconda. Then I ran python3.9 realsence_track_person.py. It doesn't work; the same error occurs.

magisystem0408 commented 2 years ago

Did you create an environment with conda and execute it in it?

magisystem0408 commented 2 years ago

create conda environment

    conda create -n <env-name> python=3.8
    conda activate <env-name>

change directory

    cd personalize_hand

clone yolov5 (at commit aa1859909c96d5e1fc839b2746b45038ee8465c9)

    git clone https://github.com/ultralytics/yolov5
    git -C yolov5 checkout aa1859909c96d5e1fc839b2746b45038ee8465c9

install requirements

    pip install -r requirements.txt

change yolov5 directory

    cd yolov5

install requirements

    pip install -r requirements.txt

magisystem0408 commented 2 years ago

Looking at the error output, isn't the yolov5 directory duplicated? The traceback shows:

yolov5/yolov5/models/yolo.py

It should be:

⭕️ yolov5/models/yolo.py

ChenJiajun0011 commented 2 years ago

Have you already run it successfully on Ubuntu 18.04? I will try it again in an hour.

magisystem0408 commented 2 years ago

I'm running it on Ubuntu. I think the problem is that there is one directory level too many.
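A quick way to check for this kind of layout problem is to look for directories that contain a child with the same name, such as yolov5/yolov5. The sketch below does that with the standard library only. The function name `find_nested_duplicates` is a hypothetical helper, and the temporary directory merely imitates the layout seen in the traceback.

```python
import tempfile
from pathlib import Path


def find_nested_duplicates(root):
    """Return directories under `root` whose name repeats their parent's name.

    Example: `yolov5/yolov5` is reported because the inner directory has the
    same name as the directory that contains it.
    """
    root = Path(root)
    nested = []
    for path in sorted(root.rglob("*")):
        if path.is_dir() and path.name == path.parent.name:
            nested.append(path)
    return nested


# Demo on a throwaway tree that mimics the duplicated checkout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "yolov5" / "yolov5" / "models").mkdir(parents=True)
    for path in find_nested_duplicates(tmp):
        print(path.relative_to(tmp).as_posix())  # prints yolov5/yolov5
```

To inspect a real checkout, call `find_nested_duplicates` with the project directory instead of the temporary one.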

ChenJiajun0011 commented 2 years ago

Honestly, I have no idea. Did you figure out how to run it on Ubuntu?

ChenJiajun0011 commented 2 years ago

You are right, it is working now. I have changed the part that gets the distance; take a look.

    import math
    import sys

    import cv2
    import numpy as np
    import pyrealsense2 as rs
    import torch

    # Make the yolov5 checkout importable (needed for the "utils" import below).
    sys.path.insert(0, './yolov5')

    from yolov5.models.experimental import attempt_load
    from yolov5.utils.downloads import attempt_download
    from yolov5.utils.general import check_img_size, non_max_suppression, scale_coords, xyxy2xywh
    from yolov5.utils.torch_utils import select_device
    from yolov5.utils.plots import Annotator, colors
    from deep_sort_pytorch.utils.parser import get_config
    from deep_sort_pytorch.deep_sort import DeepSort
    from utils.augmentations import letterbox

    WIDTH = 1280
    FPS = 6
    HEIGHT = 720

    # Configure the streams (color/depth)
    config = rs.config()
    config.enable_stream(rs.stream.color, WIDTH, HEIGHT, rs.format.bgr8, FPS)
    config.enable_stream(rs.stream.depth, WIDTH, HEIGHT, rs.format.z16, FPS)

    # Start streaming
    pipeline = rs.pipeline()
    profile = pipeline.start(config)
    align = rs.align(rs.stream.color)

    cfg = get_config()
    cfg.merge_from_file("deep_sort_pytorch/configs/deep_sort.yaml")
    attempt_download("deep_sort_pytorch/deep_sort/deep/checkpoint/ckpt.t7",
                     repo='mikel-brostrom/Yolov5_DeepSort_Pytorch')
    deepsort = DeepSort(cfg.DEEPSORT.REID_CKPT,
                        max_dist=cfg.DEEPSORT.MAX_DIST,
                        min_confidence=cfg.DEEPSORT.MIN_CONFIDENCE,
                        max_iou_distance=cfg.DEEPSORT.MAX_IOU_DISTANCE,
                        max_age=cfg.DEEPSORT.MAX_AGE,
                        n_init=cfg.DEEPSORT.N_INIT,
                        nn_budget=cfg.DEEPSORT.NN_BUDGET,
                        use_cuda=True)

    device = select_device("cpu")
    half = device.type != "cpu"  # half precision only supported on CUDA

    model = attempt_load('yolov5/weights/yolov5s.pt', map_location=device)  # load FP32 model
    stride = int(model.stride.max())  # model stride
    imgsz = check_img_size(640, s=stride)  # check img_size
    names = model.module.names if hasattr(model, 'module') else model.names  # get class names
    if half:
        model.half()


    def realsence():
        frame_idx = 0  # counts processed frames (the original reset it to 0 every frame)
        try:
            # Allow up to three arrays to be registered
            while True:
                # Wait for frames (color & depth)
                frames = pipeline.wait_for_frames()
                aligned_frames = align.process(frames)
                color_frame = aligned_frames.get_color_frame()
                depth_frame = aligned_frames.get_depth_frame()
                if not depth_frame or not color_frame:
                    continue
                color_image = np.asanyarray(color_frame.get_data())
                # Depth image
                depth_color_frame = rs.colorizer().colorize(depth_frame)
                depth_color_image = np.asanyarray(depth_color_frame.get_data())
                img0 = color_image.copy()

                # Keyword arguments: the original positional call passed 32 as the padding color.
                img = letterbox(img0, imgsz, stride=stride, auto=True)[0]
                # Convert
                img = img.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
                img = np.ascontiguousarray(img)
                img = torch.from_numpy(img).to(device)
                img = img.half() if half else img.float()
                img /= 255.0

                if img.ndimension() == 3:
                    img = img.unsqueeze(0)

                pred = model(img, augment=False)[0]
                pred = non_max_suppression(pred, 0.4, 0.5, agnostic=False)

                # Process each detection box
                for i, det in enumerate(pred):  # detections per image
                    im0 = img0
                    # The original "pil=not ascii" always evaluated to False.
                    annotator = Annotator(im0, line_width=2, pil=False)

                    anotationList = []
                    if det is not None and len(det):
                        # Rescale boxes from img_size to im0 size
                        det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()
                        xywhs = xyxy2xywh(det[:, 0:4])
                        confs = det[:, 4]
                        clss = det[:, 5]
                        # pass detections to deepsort
                        outputs = deepsort.update(xywhs.cpu(), confs.cpu(), clss.cpu(), im0)
                        if len(outputs) > 0:
                            for j, (output, conf) in enumerate(zip(outputs, confs)):
                                bboxes = output[0:4]
                                track_id = output[4]
                                cls = output[5]

                                c = int(cls)  # integer class
                                if names[c] == "person":
                                    label = f'{track_id} {names[c]} {conf:.2f}'
                                    annotator.box_label(bboxes, label, color=colors(c, True))

                                    bbox_left = output[0]
                                    bbox_top = output[1]
                                    bbox_w = output[2] - output[0]
                                    bbox_h = output[3] - output[1]

                                    # Distance is read at the center pixel of the box
                                    center_x = math.floor(bbox_left + (bbox_w / 2))
                                    center_y = math.floor(bbox_top + (bbox_h / 2))

                                    depth = depth_frame.get_distance(center_x, center_y)

                                    anotationList.append(
                                        [frame_idx, track_id, c, names[c], bbox_left, bbox_top,
                                         bbox_w, bbox_h, center_x, center_y, depth])

                result_image = annotator.result()

                for anotation in anotationList:
                    print(anotation)
                    # Draw the distance at the box center
                    cv2.putText(result_image, str(anotation[10]),
                                (int(anotation[8]), int(anotation[9])),
                                cv2.FONT_HERSHEY_PLAIN, 5, (0, 0, 255), 3, cv2.LINE_AA)

                images = np.hstack((result_image, depth_color_image))
                cv2.namedWindow('RealSense', cv2.WINDOW_AUTOSIZE)
                cv2.imshow('RealSense', images)
                frame_idx += 1

                if cv2.waitKey(1) & 0xff == 27:  # Esc quits
                    break

        finally:
            # Stop streaming
            pipeline.stop()
            cv2.destroyAllWindows()


    if __name__ == '__main__':
        realsence()
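The distance part of the script reads the depth at the center pixel of each tracked box. The sketch below isolates that step in plain Python: it computes the integer center of an (x1, y1, x2, y2) box and clamps it to the frame. It also treats a depth reading of 0 as "no data", which is an assumption about how missing depth is reported. The helper names `center_pixel` and `valid_depth` are hypothetical and are not part of the script above.

```python
from typing import Optional, Tuple


def center_pixel(box: Tuple[float, float, float, float], width: int, height: int) -> Tuple[int, int]:
    """Return the integer center of an (x1, y1, x2, y2) box, clamped to the frame."""
    x1, y1, x2, y2 = box
    cx = int((x1 + x2) // 2)
    cy = int((y1 + y2) // 2)
    # Keep the pixel inside the image so a depth lookup cannot go out of range.
    cx = min(max(cx, 0), width - 1)
    cy = min(max(cy, 0), height - 1)
    return cx, cy


def valid_depth(depth_m: float) -> Optional[float]:
    """Map a reading of 0 (assumed to mean 'no depth data') to None."""
    return depth_m if depth_m > 0 else None


print(center_pixel((100, 200, 300, 600), 1280, 720))    # (200, 400)
print(center_pixel((1200, 600, 1400, 900), 1280, 720))  # (1279, 719)
print(valid_depth(0.0))                                 # None
```

In the script, the clamped pixel would be passed to `depth_frame.get_distance(...)`, and frames where `valid_depth` returns None could be skipped rather than reported as a distance of zero.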

magisystem0408 commented 2 years ago

Thank you. What has been improved in the distance part? Is it the accuracy? Could you send us a pull request?

ChenJiajun0011 commented 2 years ago

Hi, sorry for the late reply. I took a two-day break. I have created a pull request for you. I hope to hear your feedback.

magisystem0408 commented 2 years ago

Thanks. I will check the pull request.

Looking at your repositories, I see this one. Are you planning to use ROS with C++ instead of Python? https://github.com/ChenJiajun0011/realsense-ros

ChenJiajun0011 commented 2 years ago

Nope, I am still using python. I am not good at c++.

magisystem0408 commented 2 years ago

To begin with, it seems difficult to run ROS with Python: the build, etc.

ChenJiajun0011 commented 2 years ago

For my robot-following project, Python is still OK so far. What do you mean by the difficulty with the build? Could you give an example of what specifically the obstacles are?

magisystem0408 commented 2 years ago

For example, rewriting from C++ to Python. The Python version of ROS is only compatible with 2.7...

ChenJiajun0011 commented 2 years ago

Hi, I have a question for you. Does our yolov5 + deepsort program use any GPU resources? We don't train any model. While producing the object boxes we only use the CPU, right? Here is the situation: when I run this object recognition program on my laptop everything is fine and the image output is smooth (FPS=6). But when I run it on a robot's computer, recognition becomes slow, the FPS drops considerably, and the image output is not smooth. Could it be a CPU or GPU problem? The robot's computer is quite good, much better than my laptop.
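Before changing hardware settings, it can help to measure which stage of the loop takes the time on each machine. The following is a minimal timing sketch using only the standard library. The `StageTimer` class is a hypothetical helper, and the calls to `sum` and `sorted` are stand-ins for the real `model(img)` and `deepsort.update(...)` calls.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulate wall-clock time per named stage of a processing loop."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.totals = defaultdict(float)  # seconds spent per stage
        self.counts = defaultdict(int)    # number of measurements per stage

    def measure(self, stage, func, *args, **kwargs):
        """Run func(*args, **kwargs) and record its duration under `stage`."""
        start = self._clock()
        result = func(*args, **kwargs)
        self.totals[stage] += self._clock() - start
        self.counts[stage] += 1
        return result

    def mean_ms(self, stage):
        """Mean duration of a measured stage in milliseconds."""
        return 1000.0 * self.totals[stage] / self.counts[stage]


timer = StageTimer()
for _ in range(3):
    timer.measure("inference", sum, range(100_000))  # stand-in for model(img)
    timer.measure("tracking", sorted, [3, 1, 2])     # stand-in for deepsort.update(...)
for stage in timer.totals:
    print(f"{stage}: {timer.mean_ms(stage):.3f} ms per frame")
```

Comparing the per-stage numbers from the laptop and the robot's computer would show whether inference, tracking, or something else (for example frame capture or display) is the slow part.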

magisystem0408 commented 2 years ago

It is considerably slower when the calculation runs on the CPU.

ChenJiajun0011 commented 2 years ago

    deepsort = DeepSort(cfg.DEEPSORT.REID_CKPT,
                        max_dist=cfg.DEEPSORT.MAX_DIST,
                        min_confidence=cfg.DEEPSORT.MIN_CONFIDENCE,
                        max_iou_distance=cfg.DEEPSORT.MAX_IOU_DISTANCE,
                        max_age=cfg.DEEPSORT.MAX_AGE,
                        n_init=cfg.DEEPSORT.N_INIT,
                        nn_budget=cfg.DEEPSORT.NN_BUDGET,
                        use_cuda=True)

    device = select_device("cpu")
    half = device.type != "cpu"  # half precision only supported on CUDA

    model = attempt_load('yolov5/weights/yolov5s.pt', map_location=device)  # load FP32 model
    stride = int(model.stride.max())  # model stride
    imgsz = check_img_size(640, s=stride)  # check img_size
    names = model.module.names if hasattr(model, 'module') else model.names  # get class names
    if half:
        model.half()

But this part means that we choose to use the CPU, right? In realsense_track_person.py you did not use the GPU, is that correct?

magisystem0408 commented 2 years ago

I did the calculations using the CPU.

ChenJiajun0011 commented 2 years ago

Yes. If I want to try it with the GPU, do you have any idea what code needs to change and what extra requirements need to be installed?

magisystem0408 commented 2 years ago

I'm going to try the GPU for a bit.

ChenJiajun0011 commented 2 years ago

Great, I am also working on that with windows.

ChenJiajun0011 commented 2 years ago

Let me know when you finish.

ChenJiajun0011 commented 2 years ago

In realsense_track_person.py, is it enough to change the line device = select_device("cpu") into device = select_device("0"), and then install CUDA and the GPU build of torch? The other requirements in requirements.txt don't need to change? I did this and I can run it.

magisystem0408 commented 2 years ago

Please tell me the CUDA version. Would you please send me a pull request for the GPU version of the code?

ChenJiajun0011 commented 2 years ago

Hi, it's better to use CUDA 10.2, because the PyTorch build needs to match the CUDA version; please check here: https://pytorch.org/get-started/previous-versions/. That is the most important thing. You also need to install the newest driver for your NVIDIA card. Nothing else is special. For the code, not much changes, only this line: device = select_device("cpu") becomes device = select_device("0"). This selects the GPU; you can try 0, 1, 2, 3. If any problem happens, let me know.
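Since the only code change is the device string, one option is to choose that string at runtime and fall back to the CPU when CUDA is not usable, so the same script runs on both machines. This is a minimal sketch, assuming PyTorch may or may not be installed where it runs. `pick_device` is a hypothetical helper, while `torch.cuda.is_available()` is the standard PyTorch check.

```python
def pick_device(requested: str = "0") -> str:
    """Return `requested` if CUDA is usable, otherwise fall back to "cpu"."""
    if requested == "cpu":
        return "cpu"
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
        cuda_ok = torch.cuda.is_available()
    except ImportError:
        cuda_ok = False
    return requested if cuda_ok else "cpu"


device_name = pick_device("0")
half = device_name != "cpu"  # same rule as in the script: half precision only on CUDA
print(device_name, half)
```

The resulting string would then be passed on as `select_device(device_name)` in place of the hard-coded `"cpu"` or `"0"`.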

magisystem0408 commented 2 years ago

Sorry, I was so busy that I couldn't look at it.

ChenJiajun0011 commented 2 years ago

Hi, I am wondering if you would like to do a project with the Boston Dynamics Spot. Our team has a Spot and other AGV platforms. If you want to try programming it and doing a project with them, just contact me at chenjiajun1011@gmail.com.

magisystem0408 commented 2 years ago

This one? The image you sent is cool. https://youtu.be/M0fL5Q6rGws

github-actions[bot] commented 2 years ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs. Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!