ultralytics / hub

Ultralytics HUB tutorials and support
https://hub.ultralytics.com
GNU Affero General Public License v3.0

my model is optimizing the weights and giving me the option of preview and deployment #732

Closed PrakharJoshi54321 closed 3 months ago

PrakharJoshi54321 commented 5 months ago

Search before asking

Question

as

Additional

No response

github-actions[bot] commented 5 months ago

πŸ‘‹ Hello @PrakharJoshi54321, thank you for raising an issue about Ultralytics HUB πŸš€! Please visit our HUB Docs to learn more:

If this is a πŸ› Bug Report, please provide screenshots and steps to reproduce your problem to help us get started working on a fix.

If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response.

We try to respond to all issues as promptly as possible. Thank you for your patience!

sergiuwaxmann commented 5 months ago

@PrakharJoshi54321 Hello!

The "Optimizing weights" process can take a while. Let's wait for a bit to see if the process finishes successfully.

If the process fails, could you share your model ID (URL) so I can investigate?

PrakharJoshi54321 commented 5 months ago

df

I am using my local machine and all 100 epochs have been completed. It was showing me "Optimizing weights" and now it is showing me this. Please guide me through the further steps.

PrakharJoshi54321 commented 5 months ago

https://hub.ultralytics.com/models/pXL2wTJQSWfImPyV3QhO

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for providing the details and the screenshot. It looks like your model has completed the training process but encountered an issue during the weight optimization phase. Let's address this step-by-step:

  1. Verify Package Versions: Ensure you are using the latest versions of torch, ultralytics, and hub-sdk. You can update them using the following commands:

    pip install --upgrade torch ultralytics hub-sdk
  2. Check Logs: Please check the logs for any errors or warnings that might have occurred during the optimization phase. This can provide more insight into what went wrong.

  3. Resume Training: If the training process was interrupted, you can resume training from the last checkpoint. Navigate to the Model page on Ultralytics HUB and look for the option to resume training.

  4. Preview and Deployment: Since you mentioned that the model is giving you the option to preview and deploy, you can proceed with these steps:

    • Preview Model: Navigate to the Preview tab on the Model page. You can select a preview image from your dataset or upload a new image to see how your model performs.
    • Deploy Model: Navigate to the Deploy tab. You can export your model to various formats such as ONNX, TensorFlow, etc., or use the Ultralytics Inference API for deployment.

For more detailed guidance, you can refer to the Ultralytics HUB Models Documentation.
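
If you prefer to resume training and export programmatically rather than through the web UI, here is a minimal sketch of the standard HUB Python workflow (MODEL_ID and YOUR_API_KEY are placeholders to replace with your own values):

from ultralytics import YOLO, hub

hub.login("YOUR_API_KEY")  # placeholder: your Ultralytics HUB API key

# Load the HUB model by its URL; training continues from the checkpoint stored in HUB
model = YOLO("https://hub.ultralytics.com/models/MODEL_ID")
results = model.train()

# After training, the model can also be exported locally, e.g. to ONNX
model.export(format="onnx")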

If the issue persists, please provide any error messages or logs you encounter, and we can further investigate the problem.

Thank you for your patience and cooperation. The YOLO community and the Ultralytics team are here to help you!

sergiuwaxmann commented 5 months ago

@PrakharJoshi54321 It looks like your model didn’t successfully upload the weights, which is why Ultralytics HUB is asking you to resume training from the last checkpoint (62). I suggest resuming training as recommended in the UI.

PrakharJoshi54321 commented 5 months ago

import cv2
from ultralytics import YOLO, solutions
import pytesseract
from PIL import Image
import numpy as np

# Path to Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Load the models
speed_model = YOLO("yolov8n.pt")  # Model for speed detection and tracking
plate_model = YOLO('epoch-68.pt')  # Model for number plate detection

# Path to the video file
video_path = 'video.mp4'  # Replace with your video file path

# Initialize video capture
cap = cv2.VideoCapture(video_path)
assert cap.isOpened(), "Error opening video file"

w, h = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("output_video.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

line_pts = [(0, h // 2), (w, h // 2)]  # Update line points based on video resolution

# Init speed-estimation object
speed_obj = solutions.SpeedEstimator(
    reg_pts=line_pts,
    names=speed_model.model.names,
    view_img=True,
)

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Error reading frame from video.")
        break

    # Speed detection and tracking
    results = speed_model(im0)

    if results:
        print(f"Tracks detected: {len(results)}")
    else:
        print("No tracks detected in this frame.")

    # Ensure tracks have valid data
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            print(f"Vehicle detected at: {x1, y1, x2, y2}")
            cropped_image = im0[y1:y2, x1:x2]

            # Perform number plate detection
            plate_results = plate_model(cropped_image)

            for plate_result in plate_results:
                plate_boxes = plate_result.boxes.xyxy.numpy()
                if len(plate_boxes) == 0:
                    print("No number plate detected in this vehicle bounding box.")
                for plate_box in plate_boxes:
                    px1, py1, px2, py2 = map(int, plate_box)
                    plate_cropped_image = cropped_image[py1:py2, px1:px2]

                    # Convert the cropped image to a format suitable for OCR
                    plate_cropped_image_rgb = cv2.cvtColor(plate_cropped_image, cv2.COLOR_BGR2RGB)
                    pil_image = Image.fromarray(plate_cropped_image_rgb)

                    # Use Tesseract to extract text
                    plate_text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
                    print(f'Detected Number Plate: {plate_text}')

                    # Draw the bounding box for the plate and add the text
                    cv2.rectangle(im0, (x1 + px1, y1 + py1), (x1 + px2, y1 + py2), (0, 255, 0), 2)
                    cv2.putText(im0, plate_text, (x1 + px1, y1 + py1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Write the frame with detections and speed estimation
    im0 = speed_obj.estimate_speed(im0, results)
    video_writer.write(im0)

cap.release()
video_writer.release()
cv2.destroyAllWindows()

I have made another model using Ultralytics for number plate detection and I am trying to integrate it. Please help me integrate it.

Comment: Ultralytics is just amazing.

Any help will be appreciated.

PrakharJoshi54321 commented 5 months ago

Check if the speed is greater than 50 km/hr; store the vehicle no., speed and track ID in the Excel sheet.

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for your kind words about Ultralytics! We're thrilled to hear that you're enjoying using our tools. Let's enhance your script to store vehicle information in an Excel sheet when the speed exceeds 50 km/hr.

Here's an updated version of your script that includes this functionality:

import cv2
from ultralytics import YOLO, solutions
import pytesseract
from PIL import Image
import numpy as np
import pandas as pd

# Path to Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

# Load the models
speed_model = YOLO("yolov8n.pt")  # Model for speed detection and tracking
plate_model = YOLO('epoch-68.pt')  # Model for number plate detection

# Path to the video file
video_path = 'video.mp4'  # Replace with your video file path

# Initialize video capture
cap = cv2.VideoCapture(video_path)
assert cap.isOpened(), "Error opening video file"

w, h = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("output_video.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

line_pts = [(0, h // 2), (w, h // 2)]  # Update line points based on video resolution

# Init speed-estimation object
speed_obj = solutions.SpeedEstimator(
    reg_pts=line_pts,
    names=speed_model.model.names,
    view_img=True,
)

# DataFrame to store vehicle information
vehicle_data = pd.DataFrame(columns=["Track ID", "Vehicle No", "Speed (km/hr)"])

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Error reading frame from video.")
        break

    # Speed detection and tracking
    results = speed_model(im0)

    if results:
        print(f"Tracks detected: {len(results)}")
    else:
        print("No tracks detected in this frame.")

    # Ensure tracks have valid data
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            print(f"Vehicle detected at: {x1, y1, x2, y2}")
            cropped_image = im0[y1:y2, x1:x2]

            # Perform number plate detection
            plate_results = plate_model(cropped_image)

            for plate_result in plate_results:
                plate_boxes = plate_result.boxes.xyxy.numpy()
                if len(plate_boxes) == 0:
                    print("No number plate detected in this vehicle bounding box.")
                for plate_box in plate_boxes:
                    px1, py1, px2, py2 = map(int, plate_box)
                    plate_cropped_image = cropped_image[py1:py2, px1:px2]

                    # Convert the cropped image to a format suitable for OCR
                    plate_cropped_image_rgb = cv2.cvtColor(plate_cropped_image, cv2.COLOR_BGR2RGB)
                    pil_image = Image.fromarray(plate_cropped_image_rgb)

                    # Use Tesseract to extract text
                    plate_text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
                    print(f'Detected Number Plate: {plate_text}')

                    # Draw the bounding box for the plate and add the text
                    cv2.rectangle(im0, (x1 + px1, y1 + py1), (x1 + px2, y1 + py2), (0, 255, 0), 2)
                    cv2.putText(im0, plate_text, (x1 + px1, y1 + py1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Write the frame with detections and speed estimation
    im0, speeds = speed_obj.estimate_speed(im0, results)
    video_writer.write(im0)

    # Store vehicle information if speed exceeds 50 km/hr
    for track_id, speed in speeds.items():
        if speed > 50:
            vehicle_data = vehicle_data.append({
                "Track ID": track_id,
                "Vehicle No": plate_text,
                "Speed (km/hr)": speed
            }, ignore_index=True)

cap.release()
video_writer.release()
cv2.destroyAllWindows()

# Save the vehicle data to an Excel file
vehicle_data.to_excel("vehicle_data.xlsx", index=False)

This script will now store the vehicle number, speed, and track ID in an Excel sheet if the speed exceeds 50 km/hr. The pandas library is used to handle the Excel file operations.
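
Note that DataFrame.append was removed in pandas 2.0, so on newer pandas versions the snippet above may raise an AttributeError. A minimal alternative, assuming pandas ≥ 2.0 (with openpyxl installed for to_excel), is to build each row as a one-row DataFrame and concatenate it:

new_row = pd.DataFrame([{
    "Track ID": track_id,
    "Vehicle No": plate_text,
    "Speed (km/hr)": speed
}])
vehicle_data = pd.concat([vehicle_data, new_row], ignore_index=True)  # instead of vehicle_data.append(...)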

If you encounter any issues or have further questions, please let us know. The YOLO community and the Ultralytics team are always here to help!

PrakharJoshi54321 commented 5 months ago

This code is throwing an error because the function here is not returning two values, yet you are telling me to store the result in two variables. How is this possible? "im0, speeds = speed_obj.estimate_speed(im0, results)"

PrakharJoshi54321 commented 5 months ago

pro.zip I have made another model using Ultralytics for number plate detection and am trying to integrate it. Please help me integrate it. I have uploaded my project. Check if the speed is greater than 50 km/hr and store the vehicle no., speed and track ID in the Excel sheet.

Please do this for me; all the efforts will be appreciated.

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for sharing your project files and providing details about your requirements. Let's address the integration of your number plate detection model and the speed tracking functionality, ensuring that vehicle information is stored in an Excel sheet when the speed exceeds 50 km/hr.

First, let's correct the issue with the estimate_speed function. The estimate_speed function should return the modified frame and a dictionary of speeds. Here's the updated version of your script:

import cv2
from ultralytics import YOLO, solutions
import pytesseract
from PIL import Image
import numpy as np
import pandas as pd

# Path to Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

# Load the models
speed_model = YOLO("yolov8n.pt")  # Model for speed detection and tracking
plate_model = YOLO('epoch-68.pt')  # Model for number plate detection

# Path to the video file
video_path = 'video.mp4'  # Replace with your video file path

# Initialize video capture
cap = cv2.VideoCapture(video_path)
assert cap.isOpened(), "Error opening video file"

w, h = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("output_video.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

line_pts = [(0, h // 2), (w, h // 2)]  # Update line points based on video resolution

# Init speed-estimation object
speed_obj = solutions.SpeedEstimator(
    reg_pts=line_pts,
    names=speed_model.model.names,
    view_img=True,
)

# DataFrame to store vehicle information
vehicle_data = pd.DataFrame(columns=["Track ID", "Vehicle No", "Speed (km/hr)"])

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Error reading frame from video.")
        break

    # Speed detection and tracking
    results = speed_model(im0)

    if results:
        print(f"Tracks detected: {len(results)}")
    else:
        print("No tracks detected in this frame.")

    # Ensure tracks have valid data
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            print(f"Vehicle detected at: {x1, y1, x2, y2}")
            cropped_image = im0[y1:y2, x1:x2]

            # Perform number plate detection
            plate_results = plate_model(cropped_image)

            for plate_result in plate_results:
                plate_boxes = plate_result.boxes.xyxy.numpy()
                if len(plate_boxes) == 0:
                    print("No number plate detected in this vehicle bounding box.")
                for plate_box in plate_boxes:
                    px1, py1, px2, py2 = map(int, plate_box)
                    plate_cropped_image = cropped_image[py1:py2, px1:px2]

                    # Convert the cropped image to a format suitable for OCR
                    plate_cropped_image_rgb = cv2.cvtColor(plate_cropped_image, cv2.COLOR_BGR2RGB)
                    pil_image = Image.fromarray(plate_cropped_image_rgb)

                    # Use Tesseract to extract text
                    plate_text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
                    print(f'Detected Number Plate: {plate_text}')

                    # Draw the bounding box for the plate and add the text
                    cv2.rectangle(im0, (x1 + px1, y1 + py1), (x1 + px2, y1 + py2), (0, 255, 0), 2)
                    cv2.putText(im0, plate_text, (x1 + px1, y1 + py1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Write the frame with detections and speed estimation
    im0, speeds = speed_obj.estimate_speed(im0, results)
    video_writer.write(im0)

    # Store vehicle information if speed exceeds 50 km/hr
    for track_id, speed in speeds.items():
        if speed > 50:
            vehicle_data = vehicle_data.append({
                "Track ID": track_id,
                "Vehicle No": plate_text,
                "Speed (km/hr)": speed
            }, ignore_index=True)

cap.release()
video_writer.release()
cv2.destroyAllWindows()

# Save the vehicle data to an Excel file
vehicle_data.to_excel("vehicle_data.xlsx", index=False)

This script now correctly handles the return values from the estimate_speed function and stores the vehicle information in an Excel sheet if the speed exceeds 50 km/hr.

If you encounter any further issues or have additional questions, please let us know. The YOLO community and the Ultralytics team are here to support you!

PrakharJoshi54321 commented 5 months ago

Is it working in your system? Please share a screenshot and the detailed process; it's my college project.

PrakharJoshi54321 commented 5 months ago

Please do the correct OCR.

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for reaching out! To assist you effectively, we need to ensure a few things:

  1. Minimum Reproducible Example: Could you please provide a minimal code snippet that reproduces the issue you're facing with OCR? This will help us understand the problem better and provide a more accurate solution. You can refer to our Minimum Reproducible Example Guide for more details on how to create one.

  2. Package Versions: Ensure you are using the latest versions of torch, ultralytics, and hub-sdk. You can update them using the following commands:

    pip install --upgrade torch ultralytics hub-sdk

Regarding your OCR integration, here’s a refined approach to ensure accurate OCR detection:

  1. Preprocessing the Image: Sometimes, preprocessing the image can significantly improve OCR accuracy. This can include converting the image to grayscale, applying thresholding, or resizing the image.

  2. Tesseract Configuration: Tesseract OCR has various configuration options that can be fine-tuned for better results. For instance, using different Page Segmentation Modes (PSM) can yield better results depending on the structure of the text.

Here’s an example of how you can preprocess the image and configure Tesseract:

import cv2
import pytesseract
from PIL import Image

# Path to Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

def preprocess_image(image):
    # Convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Apply thresholding
    _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)
    return thresh

def extract_text_from_image(image):
    # Preprocess the image
    preprocessed_image = preprocess_image(image)
    # Convert to PIL Image
    pil_image = Image.fromarray(preprocessed_image)
    # Use Tesseract to extract text
    text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
    return text

# Example usage
image = cv2.imread('path_to_image.jpg')
text = extract_text_from_image(image)
print(f'Detected Text: {text}')

This example demonstrates how to preprocess the image before passing it to Tesseract for OCR. You can adjust the preprocessing steps based on your specific requirements.
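
If the plate crop is very small, upscaling it before OCR can also help. Here is a minimal sketch using OpenCV's cv2.resize with cubic interpolation (the image path is a hypothetical example):

def upscale_for_ocr(image, scale=3):
    # Enlarge a small plate crop so Tesseract has more pixels to work with
    return cv2.resize(image, None, fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)

# Example usage together with the preprocessing above
plate_crop = cv2.imread('plate_crop.jpg')  # hypothetical path to a cropped plate image
text = extract_text_from_image(upscale_for_ocr(plate_crop))
print(f'Detected Text: {text}')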

If you continue to face issues, please share the minimal reproducible example, and we’ll be happy to assist you further. The YOLO community and the Ultralytics team are here to help!

PrakharJoshi54321 commented 5 months ago

I am taking 5 km/hr for testing and it is showing me this:

Vehicle detected at: (815, 196, 871, 255)

0: 640x608 1 0, 116.3ms
Speed: 0.8ms preprocess, 116.3ms inference, 0.0ms postprocess per image at shape (1, 3, 640, 608)
Detected Number Plate: eT

Traceback (most recent call last):
  File "C:\Users\cairuser1\Desktop\project\intigrate.py", line 85, in <module>
    im0, speeds = speed_obj.estimate_speed(im0, results)
ValueError: too many values to unpack (expected 2)

PrakharJoshi54321 commented 5 months ago

# Write the frame with detections and speed estimation
im0, speeds = speed_obj.estimate_speed(im0, results)
video_writer.write(im0)

PrakharJoshi54321 commented 5 months ago

# packages in environment at C:\Users\cairuser1\miniconda3\envs\speedss:
#
# Name                    Version          Build       Channel

asttokens 2.4.1 pyhd8ed1ab_0 conda-forge beautifulsoup4 4.12.3 pypi_0 pypi bzip2 1.0.8 h2bbff1b_6 ca-certificates 2024.6.2 h56e8100_0 conda-forge cachetools 5.3.3 pypi_0 pypi certifi 2024.6.2 pypi_0 pypi charset-normalizer 3.3.2 pypi_0 pypi colorama 0.4.6 pyhd8ed1ab_0 conda-forge comm 0.2.2 pyhd8ed1ab_0 conda-forge contourpy 1.2.1 pypi_0 pypi cycler 0.12.1 pypi_0 pypi debugpy 1.6.7 py310hd77b12b_0 decorator 5.1.1 pyhd8ed1ab_0 conda-forge dill 0.3.8 pypi_0 pypi easyocr 1.7.1 pypi_0 pypi et-xmlfile 1.1.0 pypi_0 pypi exceptiongroup 1.2.0 pyhd8ed1ab_2 conda-forge executing 2.0.1 pyhd8ed1ab_0 conda-forge filelock 3.15.1 pypi_0 pypi fonttools 4.53.0 pypi_0 pypi fsspec 2024.6.0 pypi_0 pypi google 3.0.0 pypi_0 pypi google-api-core 2.19.0 pypi_0 pypi google-auth 2.30.0 pypi_0 pypi google-cloud-vision 3.7.2 pypi_0 pypi googleapis-common-protos 1.63.1 pypi_0 pypi grpcio 1.64.1 pypi_0 pypi grpcio-status 1.62.2 pypi_0 pypi hub-sdk 0.0.8 pypi_0 pypi idna 3.7 pypi_0 pypi imageio 2.34.1 pypi_0 pypi importlib-metadata 7.1.0 pyha770c72_0 conda-forge importlib_metadata 7.1.0 hd8ed1ab_0 conda-forge imutils 0.5.4 pypi_0 pypi intel-openmp 2021.4.0 pypi_0 pypi ipykernel 6.29.4 pyh4bbf305_0 conda-forge ipython 8.25.0 pyh7428d3b_0 conda-forge jedi 0.19.1 pyhd8ed1ab_0 conda-forge jinja2 3.1.4 pypi_0 pypi jupyter_client 8.6.2 pyhd8ed1ab_0 conda-forge jupyter_core 5.7.2 py310h5588dad_0 conda-forge kiwisolver 1.4.5 pypi_0 pypi lap 0.4.0 pypi_0 pypi lazy-loader 0.4 pypi_0 pypi libffi 3.4.4 hd77b12b_1 libsodium 1.0.18 h8d14728_1 conda-forge markupsafe 2.1.5 pypi_0 pypi matplotlib 3.9.0 pypi_0 pypi matplotlib-inline 0.1.7 pyhd8ed1ab_0 conda-forge mkl 2021.4.0 pypi_0 pypi mpmath 1.3.0 pypi_0 pypi nest-asyncio 1.6.0 pyhd8ed1ab_0 conda-forge networkx 3.3 pypi_0 pypi ninja 1.11.1.1 pypi_0 pypi numpy 1.26.4 pypi_0 pypi opencv-python 4.10.0.82 pypi_0 pypi opencv-python-headless 4.10.0.82 pypi_0 pypi openpyxl 3.1.4 pypi_0 pypi openssl 1.1.1l h8ffe710_0 conda-forge packaging 24.1 pyhd8ed1ab_0 conda-forge pandas 2.2.2 pypi_0 pypi parso 0.8.4 pyhd8ed1ab_0 conda-forge pickleshare 0.7.5 py_1003 conda-forge pillow 10.3.0 pypi_0 pypi pip 24.0 py310haa95532_0 platformdirs 4.2.2 pyhd8ed1ab_0 conda-forge prompt-toolkit 3.0.47 pyha770c72_0 conda-forge proto-plus 1.24.0 pypi_0 pypi protobuf 4.25.3 pypi_0 pypi psutil 5.9.8 pypi_0 pypi pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge py-cpuinfo 9.0.0 pypi_0 pypi pyasn1 0.6.0 pypi_0 pypi pyasn1-modules 0.4.0 pypi_0 pypi pyclipper 1.3.0.post5 pypi_0 pypi pygments 2.18.0 pyhd8ed1ab_0 conda-forge pyparsing 3.1.2 pypi_0 pypi pytesseract 0.3.10 pypi_0 pypi python 3.10.0 h96c0403_3 python-bidi 0.4.2 pypi_0 pypi python-dateutil 2.9.0.post0 pypi_0 pypi python_abi 3.10 2_cp310 conda-forge pytz 2024.1 pypi_0 pypi pywin32 305 py310h2bbff1b_0 pyyaml 6.0.1 pypi_0 pypi pyzmq 25.1.2 py310hd77b12b_0 requests 2.32.3 pypi_0 pypi rsa 4.9 pypi_0 pypi scikit-image 0.23.2 pypi_0 pypi scipy 1.13.1 pypi_0 pypi seaborn 0.13.2 pypi_0 pypi setuptools 69.5.1 py310haa95532_0 shapely 2.0.4 pypi_0 pypi six 1.16.0 pyh6c4a22f_0 conda-forge soupsieve 2.5 pypi_0 pypi sqlite 3.45.3 h2bbff1b_0 stack_data 0.6.2 pyhd8ed1ab_0 conda-forge sympy 1.12.1 pypi_0 pypi tbb 2021.12.0 pypi_0 pypi tifffile 2024.5.22 pypi_0 pypi tk 8.6.14 h0416ee5_0 torch 2.3.1 pypi_0 pypi torchvision 0.18.1 pypi_0 pypi tornado 6.2 py310he2412df_0 conda-forge tqdm 4.66.4 pypi_0 pypi traitlets 5.14.3 pyhd8ed1ab_0 conda-forge typing_extensions 4.12.2 pyha770c72_0 conda-forge tzdata 2024.1 pypi_0 pypi ultralytics 8.2.38 pypi_0 pypi ultralytics-thop 2.0.0 pypi_0 pypi 
urllib3 2.2.1 pypi_0 pypi vc 14.2 h2eaa2aa_1 vs2015_runtime 14.29.30133 h43f2093_3 wcwidth 0.2.13 pyhd8ed1ab_0 conda-forge wheel 0.43.0 py310haa95532_0 xz 5.4.6 h8cc25b3_1 zeromq 4.3.5 hd77b12b_0 zipp 3.19.2 pyhd8ed1ab_0 conda-forge zlib 1.2.13 h8cc25b3_1

list of packages

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for providing the detailed list of packages in your environment. It looks like you're encountering an issue with the estimate_speed function returning more values than expected. Let's address this step-by-step.

Step 1: Verify Package Versions

First, ensure that you are using the latest versions of torch, ultralytics, and hub-sdk. You can update them using the following commands:

pip install --upgrade torch ultralytics hub-sdk

Step 2: Minimum Reproducible Example

To help us diagnose the issue more effectively, could you please provide a minimum reproducible code example? This will allow us to replicate the problem on our end and provide a more accurate solution. You can refer to our Minimum Reproducible Example Guide for more details.

Step 3: Correcting the estimate_speed Function

It seems like the estimate_speed function is not returning the expected values. Let's correct this by ensuring the function returns the frame and the speeds dictionary correctly. Here’s an updated version of your script:

import cv2
from ultralytics import YOLO, solutions
import pytesseract
from PIL import Image
import numpy as np
import pandas as pd

# Path to Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

# Load the models
speed_model = YOLO("yolov8n.pt")  # Model for speed detection and tracking
plate_model = YOLO('epoch-68.pt')  # Model for number plate detection

# Path to the video file
video_path = 'video.mp4'  # Replace with your video file path

# Initialize video capture
cap = cv2.VideoCapture(video_path)
assert cap.isOpened(), "Error opening video file"

w, h = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("output_video.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

line_pts = [(0, h // 2), (w, h // 2)]  # Update line points based on video resolution

# Init speed-estimation object
speed_obj = solutions.SpeedEstimator(
    reg_pts=line_pts,
    names=speed_model.model.names,
    view_img=True,
)

# DataFrame to store vehicle information
vehicle_data = pd.DataFrame(columns=["Track ID", "Vehicle No", "Speed (km/hr)"])

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Error reading frame from video.")
        break

    # Speed detection and tracking
    results = speed_model(im0)

    if results:
        print(f"Tracks detected: {len(results)}")
    else:
        print("No tracks detected in this frame.")

    # Ensure tracks have valid data
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            print(f"Vehicle detected at: {x1, y1, x2, y2}")
            cropped_image = im0[y1:y2, x1:x2]

            # Perform number plate detection
            plate_results = plate_model(cropped_image)

            for plate_result in plate_results:
                plate_boxes = plate_result.boxes.xyxy.numpy()
                if len(plate_boxes) == 0:
                    print("No number plate detected in this vehicle bounding box.")
                for plate_box in plate_boxes:
                    px1, py1, px2, py2 = map(int, plate_box)
                    plate_cropped_image = cropped_image[py1:py2, px1:px2]

                    # Convert the cropped image to a format suitable for OCR
                    plate_cropped_image_rgb = cv2.cvtColor(plate_cropped_image, cv2.COLOR_BGR2RGB)
                    pil_image = Image.fromarray(plate_cropped_image_rgb)

                    # Use Tesseract to extract text
                    plate_text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
                    print(f'Detected Number Plate: {plate_text}')

                    # Draw the bounding box for the plate and add the text
                    cv2.rectangle(im0, (x1 + px1, y1 + py1), (x1 + px2, y1 + py2), (0, 255, 0), 2)
                    cv2.putText(im0, plate_text, (x1 + px1, y1 + py1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Write the frame with detections and speed estimation
    speeds = speed_obj.estimate_speed(im0, results)
    video_writer.write(im0)

    # Store vehicle information if speed exceeds 50 km/hr
    for track_id, speed in speeds.items():
        if speed > 50:
            vehicle_data = vehicle_data.append({
                "Track ID": track_id,
                "Vehicle No": plate_text,
                "Speed (km/hr)": speed
            }, ignore_index=True)

cap.release()
video_writer.release()
cv2.destroyAllWindows()

# Save the vehicle data to an Excel file
vehicle_data.to_excel("vehicle_data.xlsx", index=False)

Step 4: Improving OCR Accuracy

To improve OCR accuracy, consider preprocessing the image before passing it to Tesseract. Here’s an example:

def preprocess_image(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)
    return thresh

def extract_text_from_image(image):
    preprocessed_image = preprocess_image(image)
    pil_image = Image.fromarray(preprocessed_image)
    text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
    return text

# Example usage
image = cv2.imread('path_to_image.jpg')
text = extract_text_from_image(image)
print(f'Detected Text: {text}')

Conclusion

Please try the updated script and let us know if it resolves the issue. If the problem persists, providing a minimum reproducible example will help us assist you better. The YOLO community and the Ultralytics team are here to support you!

PrakharJoshi54321 commented 5 months ago

Traceback (most recent call last):
  File "C:\Users\cairuser1\Desktop\project\intigrate.py", line 89, in <module>
    for track_id, speed in speeds.items():
AttributeError: 'numpy.ndarray' object has no attribute 'items'. Did you mean: 'item'?

Please provide a fix fast.

PrakharJoshi54321 commented 5 months ago

Please resolve this fast.

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for your patience. Let's address the issue you're facing with the estimate_speed function returning a numpy.ndarray instead of a dictionary.

Step 1: Verify Package Versions

First, ensure you are using the latest versions of torch, ultralytics, and hub-sdk. You can update them using the following commands:

pip install --upgrade torch ultralytics hub-sdk

Step 2: Correcting the estimate_speed Function

It seems like the estimate_speed function might be returning a different structure than expected. Let's adjust the code to handle this correctly. Here’s an updated version of your script:

import cv2
from ultralytics import YOLO, solutions
import pytesseract
from PIL import Image
import numpy as np
import pandas as pd

# Path to Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

# Load the models
speed_model = YOLO("yolov8n.pt")  # Model for speed detection and tracking
plate_model = YOLO('epoch-68.pt')  # Model for number plate detection

# Path to the video file
video_path = 'video.mp4'  # Replace with your video file path

# Initialize video capture
cap = cv2.VideoCapture(video_path)
assert cap.isOpened(), "Error opening video file"

w, h = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("output_video.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

line_pts = [(0, h // 2), (w, h // 2)]  # Update line points based on video resolution

# Init speed-estimation object
speed_obj = solutions.SpeedEstimator(
    reg_pts=line_pts,
    names=speed_model.model.names,
    view_img=True,
)

# DataFrame to store vehicle information
vehicle_data = pd.DataFrame(columns=["Track ID", "Vehicle No", "Speed (km/hr)"])

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Error reading frame from video.")
        break

    # Speed detection and tracking
    results = speed_model(im0)

    if results:
        print(f"Tracks detected: {len(results)}")
    else:
        print("No tracks detected in this frame.")

    # Ensure tracks have valid data
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            print(f"Vehicle detected at: {x1, y1, x2, y2}")
            cropped_image = im0[y1:y2, x1:x2]

            # Perform number plate detection
            plate_results = plate_model(cropped_image)

            for plate_result in plate_results:
                plate_boxes = plate_result.boxes.xyxy.numpy()
                if len(plate_boxes) == 0:
                    print("No number plate detected in this vehicle bounding box.")
                for plate_box in plate_boxes:
                    px1, py1, px2, py2 = map(int, plate_box)
                    plate_cropped_image = cropped_image[py1:py2, px1:px2]

                    # Convert the cropped image to a format suitable for OCR
                    plate_cropped_image_rgb = cv2.cvtColor(plate_cropped_image, cv2.COLOR_BGR2RGB)
                    pil_image = Image.fromarray(plate_cropped_image_rgb)

                    # Use Tesseract to extract text
                    plate_text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
                    print(f'Detected Number Plate: {plate_text}')

                    # Draw the bounding box for the plate and add the text
                    cv2.rectangle(im0, (x1 + px1, y1 + py1), (x1 + px2, y1 + py2), (0, 255, 0), 2)
                    cv2.putText(im0, plate_text, (x1 + px1, y1 + py1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Write the frame with detections and speed estimation
    im0, speeds = speed_obj.estimate_speed(im0, results)
    video_writer.write(im0)

    # Ensure speeds is a dictionary
    if isinstance(speeds, dict):
        # Store vehicle information if speed exceeds 50 km/hr
        for track_id, speed in speeds.items():
            if speed > 50:
                vehicle_data = vehicle_data.append({
                    "Track ID": track_id,
                    "Vehicle No": plate_text,
                    "Speed (km/hr)": speed
                }, ignore_index=True)
    else:
        print("Speeds is not a dictionary. Please check the output of estimate_speed function.")

cap.release()
video_writer.release()
cv2.destroyAllWindows()

# Save the vehicle data to an Excel file
vehicle_data.to_excel("vehicle_data.xlsx", index=False)

Step 3: Minimum Reproducible Example

If the issue persists, please provide a minimum reproducible code example. This will help us understand the problem better and provide a more accurate solution. You can refer to our Minimum Reproducible Example Guide for more details.

We appreciate your patience and understanding. The YOLO community and the Ultralytics team are here to support you! If you have any further questions or need additional assistance, please let us know.

PrakharJoshi54321 commented 5 months ago

Is this correct?

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for reaching out! Let's address your issue step-by-step to ensure we provide the best possible support.

Step 1: Minimum Reproducible Example

To help us diagnose the issue effectively, could you please provide a minimum reproducible code example? This will allow us to replicate the problem on our end and offer a more accurate solution. You can refer to our Minimum Reproducible Example Guide for more details. Having a reproducible example is crucial for us to investigate and resolve the issue efficiently.

Step 2: Verify Package Versions

Please ensure you are using the latest versions of torch, ultralytics, and hub-sdk. You can update them using the following commands:

pip install --upgrade torch ultralytics hub-sdk

Using the most recent versions helps ensure that any known bugs are fixed and you have access to the latest features and improvements.

Step 3: Correcting the estimate_speed Function

It seems like there might be an issue with the estimate_speed function returning a numpy.ndarray instead of a dictionary. Here's an updated version of your script to handle this correctly:

import cv2
from ultralytics import YOLO, solutions
import pytesseract
from PIL import Image
import numpy as np
import pandas as pd

# Path to Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

# Load the models
speed_model = YOLO("yolov8n.pt")  # Model for speed detection and tracking
plate_model = YOLO('epoch-68.pt')  # Model for number plate detection

# Path to the video file
video_path = 'video.mp4'  # Replace with your video file path

# Initialize video capture
cap = cv2.VideoCapture(video_path)
assert cap.isOpened(), "Error opening video file"

w, h = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("output_video.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

line_pts = [(0, h // 2), (w, h // 2)]  # Update line points based on video resolution

# Init speed-estimation object
speed_obj = solutions.SpeedEstimator(
    reg_pts=line_pts,
    names=speed_model.model.names,
    view_img=True,
)

# DataFrame to store vehicle information
vehicle_data = pd.DataFrame(columns=["Track ID", "Vehicle No", "Speed (km/hr)"])

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Error reading frame from video.")
        break

    # Speed detection and tracking
    results = speed_model(im0)

    if results:
        print(f"Tracks detected: {len(results)}")
    else:
        print("No tracks detected in this frame.")

    # Ensure tracks have valid data
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            print(f"Vehicle detected at: {x1, y1, x2, y2}")
            cropped_image = im0[y1:y2, x1:x2]

            # Perform number plate detection
            plate_results = plate_model(cropped_image)

            for plate_result in plate_results:
                plate_boxes = plate_result.boxes.xyxy.numpy()
                if len(plate_boxes) == 0:
                    print("No number plate detected in this vehicle bounding box.")
                for plate_box in plate_boxes:
                    px1, py1, px2, py2 = map(int, plate_box)
                    plate_cropped_image = cropped_image[py1:py2, px1:px2]

                    # Convert the cropped image to a format suitable for OCR
                    plate_cropped_image_rgb = cv2.cvtColor(plate_cropped_image, cv2.COLOR_BGR2RGB)
                    pil_image = Image.fromarray(plate_cropped_image_rgb)

                    # Use Tesseract to extract text
                    plate_text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
                    print(f'Detected Number Plate: {plate_text}')

                    # Draw the bounding box for the plate and add the text
                    cv2.rectangle(im0, (x1 + px1, y1 + py1), (x1 + px2, y1 + py2), (0, 255, 0), 2)
                    cv2.putText(im0, plate_text, (x1 + px1, y1 + py1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Write the frame with detections and speed estimation
    im0, speeds = speed_obj.estimate_speed(im0, results)
    video_writer.write(im0)

    # Ensure speeds is a dictionary
    if isinstance(speeds, dict):
        # Store vehicle information if speed exceeds 50 km/hr
        for track_id, speed in speeds.items():
            if speed > 50:
                vehicle_data = vehicle_data.append({
                    "Track ID": track_id,
                    "Vehicle No": plate_text,
                    "Speed (km/hr)": speed
                }, ignore_index=True)
    else:
        print("Speeds is not a dictionary. Please check the output of estimate_speed function.")

cap.release()
video_writer.release()
cv2.destroyAllWindows()

# Save the vehicle data to an Excel file
vehicle_data.to_excel("vehicle_data.xlsx", index=False)

Step 4: Improving OCR Accuracy

To improve OCR accuracy, consider preprocessing the image before passing it to Tesseract. Here’s an example:

def preprocess_image(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)
    return thresh

def extract_text_from_image(image):
    preprocessed_image = preprocess_image(image)
    pil_image = Image.fromarray(preprocessed_image)
    text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
    return text

# Example usage
image = cv2.imread('path_to_image.jpg')
text = extract_text_from_image(image)
print(f'Detected Text: {text}')

We hope this helps resolve the issue. If you have any further questions or need additional assistance, please let us know. The YOLO community and the Ultralytics team are here to support you! 😊

PrakharJoshi54321 commented 5 months ago

No number plate detected in this vehicle bounding box.
Vehicle detected at: (815, 196, 871, 255)

0: 640x608 1 0, 122.0ms
Speed: 0.0ms preprocess, 122.0ms inference, 0.0ms postprocess per image at shape (1, 3, 640, 608)
Detected Number Plate: eT

Traceback (most recent call last):
  File "C:\Users\cairuser1\Desktop\project\intigrate.py", line 85, in <module>
    im0, speeds = speed_obj.estimate_speed(im0, results)
ValueError: too many values to unpack (expected 2)

PrakharJoshi54321 commented 5 months ago

0: 640x608 1 0, 133.6ms
Speed: 0.0ms preprocess, 133.6ms inference, 0.0ms postprocess per image at shape (1, 3, 640, 608)
Detected Number Plate: OO

Traceback (most recent call last):
  File "C:\Users\cairuser1\Desktop\project\intigrate.py", line 92, in <module>
    im0, speeds = speed_obj.estimate_speed(im0, results)
ValueError: too many values to unpack (expected 2)

Both versions of the code are giving the same result.

PrakharJoshi54321 commented 5 months ago

import cv2
from ultralytics import YOLO, solutions
import pytesseract
from PIL import Image
import numpy as np
import pandas as pd

# Path to Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Load the models
speed_model = YOLO("yolov8n.pt")  # Model for speed detection and tracking
plate_model = YOLO('epoch-68.pt')  # Model for number plate detection

# Path to the video file
video_path = 'video.mp4'  # Replace with your video file path

# Initialize video capture
cap = cv2.VideoCapture(video_path)
assert cap.isOpened(), "Error opening video file"

w, h = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("output_video.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

line_pts = [(0, h // 2), (w, h // 2)]  # Update line points based on video resolution

# Init speed-estimation object
speed_obj = solutions.SpeedEstimator(
    reg_pts=line_pts,
    names=speed_model.model.names,
    view_img=True,
)

# DataFrame to store vehicle information
vehicle_data = pd.DataFrame(columns=["Track ID", "Vehicle No", "Speed (km/hr)"])

def preprocess_image(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)
    return thresh

def extract_text_from_image(image):
    preprocessed_image = preprocess_image(image)
    pil_image = Image.fromarray(preprocessed_image)
    text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
    return text

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Error reading frame from video.")
        break

    # Speed detection and tracking
    results = speed_model(im0)

    if results:
        print(f"Tracks detected: {len(results)}")
    else:
        print("No tracks detected in this frame.")

    # Initialize plate_text to an empty string for each frame
    plate_text = ""

    # Ensure tracks have valid data
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            print(f"Vehicle detected at: {x1, y1, x2, y2}")
            cropped_image = im0[y1:y2, x1:x2]

            # Perform number plate detection
            plate_results = plate_model(cropped_image)

            for plate_result in plate_results:
                plate_boxes = plate_result.boxes.xyxy.numpy()
                if len(plate_boxes) == 0:
                    print("No number plate detected in this vehicle bounding box.")
                for plate_box in plate_boxes:
                    px1, py1, px2, py2 = map(int, plate_box)
                    plate_cropped_image = cropped_image[py1:py2, px1:px2]

                    # Extract text using OCR
                    plate_text = extract_text_from_image(plate_cropped_image)
                    print(f'Detected Number Plate: {plate_text}')

                    # Draw the bounding box for the plate and add the text
                    cv2.rectangle(im0, (x1 + px1, y1 + py1), (x1 + px2, y1 + py2), (0, 255, 0), 2)
                    cv2.putText(im0, plate_text, (x1 + px1, y1 + py1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Write the frame with detections and speed estimation
    result = speed_obj.estimate_speed(im0, results)
    im0 = result[0]  # Frame with detections
    speeds = result[1]  # Speeds dictionary
    video_writer.write(im0)

    # Ensure speeds is a dictionary
    if isinstance(speeds, dict):
        # Store vehicle information if speed exceeds 50 km/hr
        for track_id, speed in speeds.items():
            if speed > 5:
                vehicle_data = vehicle_data.append({
                    "Track ID": track_id,
                    "Vehicle No": plate_text,
                    "Speed (km/hr)": speed
                }, ignore_index=True)
    else:
        print("Speeds is not a dictionary. Please check the output of estimate_speed function.")

cap.release()
video_writer.release()
cv2.destroyAllWindows()

# Save the vehicle data to an Excel file
vehicle_data.to_excel("vehicle_data.xlsx", index=False)

# Example usage of preprocess_image and extract_text_from_image functions
image = cv2.imread('path_to_image.jpg')
text = extract_text_from_image(image)
print(f'Detected Text: {text}')

This one is running, but it is not doing the OCR correctly and it is not showing the speed and the bounding box.

PrakharJoshi54321 commented 5 months ago

Is it necessary to wait for the whole video to complete?

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for reaching out! To effectively address your question, it would be helpful to understand the specific context of your use case. However, I can provide some general guidance on handling video processing with YOLO models.

Real-Time Processing

If your goal is to process video frames in real-time, you do not need to wait for the entire video to complete. You can process each frame as it is read from the video stream. Here's a basic example of how you can achieve this:

import cv2
from ultralytics import YOLO

# Load the model
model = YOLO("yolov8n.pt")

# Path to the video file
video_path = 'video.mp4'  # Replace with your video file path

# Initialize video capture
cap = cv2.VideoCapture(video_path)
assert cap.isOpened(), "Error opening video file"

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break

    # Perform detection on the current frame
    results = model(frame)

    # Process results (e.g., draw bounding boxes)
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)

    # Display the frame with detections
    cv2.imshow('Frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Batch Processing

If you prefer to process the entire video at once, you can read all frames into memory, process them, and then save the results. This approach might be useful for post-processing tasks where real-time performance is not critical.
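
For illustration, here is a minimal sketch of that batch approach (the file name is a placeholder; all frames are held in memory, so this only suits short videos):

import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture("video.mp4")  # placeholder video path

# Read every frame into memory first
frames = []
while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    frames.append(frame)
cap.release()

# Run inference on the whole batch of frames, then post-process afterwards
results = model(frames)
for frame, result in zip(frames, results):
    annotated = result.plot()  # draw detections on a copy of the frame
    # ... save or further analyse the annotated frame here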

Importance of Reproducible Example

To provide more specific assistance, it would be helpful if you could share a minimum reproducible example of your code. This will allow us to better understand your setup and provide a more accurate solution. You can refer to our Minimum Reproducible Example Guide for more details.

Verify Package Versions

Please ensure you are using the latest versions of torch, ultralytics, and hub-sdk. You can update them using the following commands:

pip install --upgrade torch ultralytics hub-sdk

We hope this helps! If you have any further questions or need additional assistance, please let us know. The YOLO community and the Ultralytics team are here to support you! 😊

PrakharJoshi54321 commented 5 months ago

import cv2
from ultralytics import YOLO, solutions
import pytesseract
from PIL import Image
import numpy as np
import pandas as pd

# Path to Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Load the models
speed_model = YOLO("yolov8n.pt")  # Model for speed detection and tracking
plate_model = YOLO('epoch-68.pt')  # Model for number plate detection

# Path to the video file
video_path = 'video.mp4'  # Replace with your video file path

# Initialize video capture
cap = cv2.VideoCapture(video_path)
assert cap.isOpened(), "Error opening video file"

w, h = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("output_video.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

line_pts = [(0, h // 2), (w, h // 2)]  # Update line points based on video resolution

# Init speed-estimation object
speed_obj = solutions.SpeedEstimator(
    reg_pts=line_pts,
    names=speed_model.model.names,
    view_img=True,
)

# DataFrame to store vehicle information
vehicle_data = pd.DataFrame(columns=["Track ID", "Vehicle No", "Speed (km/hr)"])

def preprocess_image(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)
    return thresh

def extract_text_from_image(image):
    preprocessed_image = preprocess_image(image)
    pil_image = Image.fromarray(preprocessed_image)
    text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
    return text

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Error reading frame from video.")
        break

    # Speed detection and tracking
    results = speed_model(im0)

    if results:
        print(f"Tracks detected: {len(results)}")
    else:
        print("No tracks detected in this frame.")

    # Initialize plate_text to an empty string for each frame
    plate_text = ""

    # Ensure tracks have valid data
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            print(f"Vehicle detected at: {x1, y1, x2, y2}")
            cropped_image = im0[y1:y2, x1:x2]

            # Perform number plate detection
            plate_results = plate_model(cropped_image)

            for plate_result in plate_results:
                plate_boxes = plate_result.boxes.xyxy.numpy()
                if len(plate_boxes) == 0:
                    print("No number plate detected in this vehicle bounding box.")
                for plate_box in plate_boxes:
                    px1, py1, px2, py2 = map(int, plate_box)
                    plate_cropped_image = cropped_image[py1:py2, px1:px2]

                    # Extract text using OCR
                    plate_text = extract_text_from_image(plate_cropped_image)
                    print(f'Detected Number Plate: {plate_text}')

                    # Draw the bounding box for the plate and add the text
                    cv2.rectangle(im0, (x1 + px1, y1 + py1), (x1 + px2, y1 + py2), (0, 255, 0), 2)
                    cv2.putText(im0, plate_text, (x1 + px1, y1 + py1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Write the frame with detections and speed estimation
    result = speed_obj.estimate_speed(im0, results)
    im0 = result[0]  # Frame with detections
    speeds = result[1]  # Speeds dictionary
    video_writer.write(im0)

    # Ensure speeds is a dictionary
    if isinstance(speeds, dict):
        # Store vehicle information if speed exceeds 50 km/hr
        for track_id, speed in speeds.items():
            if speed > 5:
                vehicle_data = vehicle_data.append({
                    "Track ID": track_id,
                    "Vehicle No": plate_text,
                    "Speed (km/hr)": speed
                }, ignore_index=True)
    else:
        print("Speeds is not a dictionary. Please check the output of estimate_speed function.")

cap.release()
video_writer.release()
cv2.destroyAllWindows()

# Save the vehicle data to an Excel file
vehicle_data.to_excel("vehicle_data.xlsx", index=False)

# Example usage of preprocess_image and extract_text_from_image functions
image = cv2.imread('path_to_image.jpg')
text = extract_text_from_image(image)
print(f'Detected Text: {text}')

This one is running, but it is not doing the OCR correctly and it is not showing the speed and the bounding box.

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for sharing your code and detailed explanation. Let's address the issues you're facing with OCR accuracy and speed estimation.

1. Importance of a Reproducible Example

To better assist you, it would be helpful to have a minimum reproducible example. This allows us to replicate the issue on our end and provide a more accurate solution. You can refer to our Minimum Reproducible Example Guide for more details.

2. Verify Package Versions

Please ensure you are using the latest versions of torch, ultralytics, and hub-sdk. You can update them using the following commands:

pip install --upgrade torch ultralytics hub-sdk

3. Improving OCR Accuracy

To improve OCR accuracy, consider additional preprocessing steps. Here's an enhanced version of your preprocess_image and extract_text_from_image functions:

def preprocess_image(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return thresh

def extract_text_from_image(image):
    preprocessed_image = preprocess_image(image)
    pil_image = Image.fromarray(preprocessed_image)
    text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
    return text

4. Handling Speed Estimation

The estimate_speed function should return a tuple with the frame and a dictionary of speeds. Ensure you are correctly unpacking the result:

# Write the frame with detections and speed estimation
result = speed_obj.estimate_speed(im0, results)
im0, speeds = result  # Unpack the result
video_writer.write(im0)

# Ensure speeds is a dictionary
if isinstance(speeds, dict):
    # Store vehicle information if speed exceeds 50 km/hr
    for track_id, speed in speeds.items():
        if speed > 5:
            vehicle_data = vehicle_data.append({
                "Track ID": track_id,
                "Vehicle No": plate_text,
                "Speed (km/hr)": speed
            }, ignore_index=True)
else:
    print("Speeds is not a dictionary. Please check the output of estimate_speed function.")

5. Real-Time Processing

If you want to process video frames in real-time, you do not need to wait for the entire video to complete. You can process each frame as it is read from the video stream.

Example Code

Here’s a refined version of your script incorporating the above suggestions:

import cv2
from ultralytics import YOLO, solutions
import pytesseract
from PIL import Image
import numpy as np
import pandas as pd

# Path to Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

# Load the models
speed_model = YOLO("yolov8n.pt")  # Model for speed detection and tracking
plate_model = YOLO('epoch-68.pt')  # Model for number plate detection

# Path to the video file
video_path = 'video.mp4'  # Replace with your video file path

# Initialize video capture
cap = cv2.VideoCapture(video_path)
assert cap.isOpened(), "Error opening video file"

w, h = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Video writer
video_writer = cv2.VideoWriter("output_video.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

line_pts = [(0, h // 2), (w, h // 2)]  # Update line points based on video resolution

# Init speed-estimation object
speed_obj = solutions.SpeedEstimator(
    reg_pts=line_pts,
    names=speed_model.model.names,
    view_img=True,
)

# DataFrame to store vehicle information
vehicle_data = pd.DataFrame(columns=["Track ID", "Vehicle No", "Speed (km/hr)"])

def preprocess_image(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return thresh

def extract_text_from_image(image):
    preprocessed_image = preprocess_image(image)
    pil_image = Image.fromarray(preprocessed_image)
    text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
    return text

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Error reading frame from video.")
        break

    # Speed detection and tracking
    results = speed_model(im0)

    if results:
        print(f"Tracks detected: {len(results)}")
    else:
        print("No tracks detected in this frame.")

    # Initialize plate_text to an empty string for each frame
    plate_text = ""

    # Ensure tracks have valid data
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            print(f"Vehicle detected at: {x1, y1, x2, y2}")
            cropped_image = im0[y1:y2, x1:x2]

            # Perform number plate detection
            plate_results = plate_model(cropped_image)

            for plate_result in plate_results:
                plate_boxes = plate_result.boxes.xyxy.numpy()
                if len(plate_boxes) == 0:
                    print("No number plate detected in this vehicle bounding box.")
                for plate_box in plate_boxes:
                    px1, py1, px2, py2 = map(int, plate_box)
                    plate_cropped_image = cropped_image[py1:py2, px1:px2]

                    # Extract text using OCR
                    plate_text = extract_text_from_image(plate_cropped_image)
                    print(f'Detected Number Plate: {plate_text}')

                    # Draw the bounding box for the plate and add the text
                    cv2.rectangle(im0, (x1 + px1, y1 + py1), (x1 + px2, y1 + py2), (0, 255, 0), 2)
                    cv2.putText(im0, plate_text, (x1 + px1, y1 + py1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Write the frame with detections and speed estimation
    result = speed_obj.estimate_speed(im0, results)
    im0, speeds = result  # Unpack the result
    video_writer.write(im0)

    # Ensure speeds is a dictionary
    if isinstance(speeds, dict):
        # Store vehicle information if the speed exceeds the threshold (5 km/hr here; raise to 50 as needed)
        for track_id, speed in speeds.items():
            if speed > 5:
                # DataFrame.append was removed in pandas 2.0; build a one-row frame and concat instead
                new_row = pd.DataFrame([{
                    "Track ID": track_id,
                    "Vehicle No": plate_text,
                    "Speed (km/hr)": speed
                }])
                vehicle_data = pd.concat([vehicle_data, new_row], ignore_index=True)
    else:
        print("Speeds is not a dictionary. Please check the output of the estimate_speed function.")

cap.release()
video_writer.release()
cv2.destroyAllWindows()

# Save the vehicle data to an Excel file
vehicle_data.to_excel("vehicle_data.xlsx", index=False)

We hope this helps! If you have any further questions or need additional assistance, please let us know. The YOLO community and the Ultralytics team are here to support you! 😊

PrakharJoshi54321 commented 5 months ago

0: 640x608 1 0, 124.8ms
Speed: 0.0ms preprocess, 124.8ms inference, 0.0ms postprocess per image at shape (1, 3, 640, 608)
Detected Number Plate: 5
Traceback (most recent call last):
  File "C:\Users\cairuser1\Desktop\project\intigrate.py", line 93, in <module>
    im0, speeds = result  # Unpack the result
ValueError: too many values to unpack (expected 2)

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for reaching out and providing details about the issue you're encountering. Let's work together to resolve this!

Importance of a Reproducible Example

To better understand and diagnose the problem, it would be extremely helpful if you could provide a minimum reproducible example of your code. This allows us to replicate the issue on our end and offer a more accurate solution. You can refer to our Minimum Reproducible Example Guide for more details on how to create one.

Verify Package Versions

Please ensure you are using the latest versions of torch, ultralytics, and hub-sdk. Sometimes, issues are resolved in newer releases, so updating your packages might help. You can update them using the following commands:

pip install --upgrade torch ultralytics hub-sdk

Handling the ValueError

The error message ValueError: too many values to unpack (expected 2) suggests that the estimate_speed function is returning more values than expected. Let's ensure that the function is correctly unpacking the results. Here’s a snippet to handle this:

# Write the frame with detections and speed estimation
result = speed_obj.estimate_speed(im0, results)
if isinstance(result, tuple) and len(result) == 2:
    im0, speeds = result  # Unpack the result
    video_writer.write(im0)

    # Ensure speeds is a dictionary
    if isinstance(speeds, dict):
        # Store vehicle information if the speed exceeds the threshold (5 km/hr here; raise to 50 as needed)
        for track_id, speed in speeds.items():
            if speed > 5:
                # DataFrame.append was removed in pandas 2.0; build a one-row frame and concat instead
                new_row = pd.DataFrame([{
                    "Track ID": track_id,
                    "Vehicle No": plate_text,
                    "Speed (km/hr)": speed
                }])
                vehicle_data = pd.concat([vehicle_data, new_row], ignore_index=True)
    else:
        print("Speeds is not a dictionary. Please check the output of estimate_speed function.")
else:
    print("Unexpected number of values returned by estimate_speed function.")

Improving OCR Accuracy

To improve OCR accuracy, consider additional preprocessing steps. Here’s an enhanced version of your preprocess_image and extract_text_from_image functions:

def preprocess_image(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return thresh

def extract_text_from_image(image):
    preprocessed_image = preprocess_image(image)
    pil_image = Image.fromarray(preprocessed_image)
    text = pytesseract.image_to_string(pil_image, config='--psm 8').strip()
    return text

We hope this helps resolve the issue. If you have any further questions or need additional assistance, please let us know. The YOLO community and the Ultralytics team are here to support you! 😊

PrakharJoshi54321 commented 5 months ago

I have already provided the .pt file. Please provide me with the full project folder.

pderrenger commented 5 months ago

Hello @PrakharJoshi54321,

Thank you for reaching out! We appreciate your interest in our project. To provide you with the best possible assistance, it would be extremely helpful if you could share a minimum reproducible example of your code. This will allow us to better understand the issue and offer a more accurate solution. You can refer to our Minimum Reproducible Example Guide for more details on how to create one.

Additionally, please ensure you are using the latest versions of torch, ultralytics, and hub-sdk. Sometimes, issues are resolved in newer releases, so updating your packages might help. You can update them using the following commands:

pip install --upgrade torch ultralytics hub-sdk

Regarding your request for the full folder of the project, we encourage users to build and customize their own projects based on the provided models and documentation. This approach allows for greater flexibility and understanding of the underlying processes.

If you have any specific questions or need further assistance with your code, feel free to share more details here. The YOLO community and the Ultralytics team are here to support you! 😊

PrakharJoshi54321 commented 4 months ago

import cv2
import numpy as np
import pandas as pd
import supervision as sv
from tqdm import tqdm
from ultralytics import YOLO
from collections import defaultdict, deque
import easyocr
import logging
import os

# Setup logging

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# Configuration
CONFIDENCE_THRESHOLD = 0.5
IOU_THRESHOLD = 0.5
MODEL_NAME = "yolov8x.pt"
NUMBER_PLATE_MODEL_NAME = "epoch-68.pt"
MODEL_RESOLUTION = 1280
EXCEL_FILE_PATH = "speeding_vehicles.xlsx"
SPEED_THRESHOLD = 50  # Set the speed threshold to 50 km/hr

# Prompt for video file path
SOURCE_VIDEO_PATH = input("Enter the path to the video file (e.g., vehicles.mp4): ")

if not os.path.exists(SOURCE_VIDEO_PATH):
    logging.error("The provided video file path does not exist.")
    exit(1)

TARGET_VIDEO_PATH = f"result_{os.path.basename(SOURCE_VIDEO_PATH)}"

# Source and Target ROIs (you might need to adjust these based on your video)
SOURCE = np.array([
    [1252, 787],
    [2298, 803],
    [5039, 2159],
    [-550, 2159]
])

TARGET_WIDTH = 25
TARGET_HEIGHT = 250

TARGET = np.array([
    [0, 0],
    [TARGET_WIDTH - 1, 0],
    [TARGET_WIDTH - 1, TARGET_HEIGHT - 1],
    [0, TARGET_HEIGHT - 1],
])

# Initialize video processing
try:
    frame_generator = sv.get_video_frames_generator(source_path=SOURCE_VIDEO_PATH)
    frame_iterator = iter(frame_generator)
    frame = next(frame_iterator)

annotated_frame = frame.copy()
annotated_frame = sv.draw_polygon(scene=annotated_frame, polygon=SOURCE, color=sv.Color.RED, thickness=4)

# Transform Perspective
class ViewTransformer:
    def __init__(self, source: np.ndarray, target: np.ndarray) -> None:
        source = source.astype(np.float32)
        target = target.astype(np.float32)
        self.m = cv2.getPerspectiveTransform(source, target)

    def transform_points(self, points: np.ndarray) -> np.ndarray:
        if points.size == 0:
            return points
        reshaped_points = points.reshape(-1, 1, 2).astype(np.float32)
        transformed_points = cv2.perspectiveTransform(reshaped_points, self.m)
        return transformed_points.reshape(-1, 2)

view_transformer = ViewTransformer(source=SOURCE, target=TARGET)

# Initialize Models
speed_model = YOLO(MODEL_NAME)
number_plate_model = YOLO(NUMBER_PLATE_MODEL_NAME)

video_info = sv.VideoInfo.from_video_path(video_path=SOURCE_VIDEO_PATH)
frame_generator = sv.get_video_frames_generator(source_path=SOURCE_VIDEO_PATH)

# Tracer initiation
byte_track = sv.ByteTrack(frame_rate=video_info.fps, track_activation_threshold=CONFIDENCE_THRESHOLD)

# Annotators configuration
thickness = sv.calculate_optimal_line_thickness(resolution_wh=video_info.resolution_wh)
text_scale = sv.calculate_optimal_text_scale(resolution_wh=video_info.resolution_wh)
bounding_box_annotator = sv.BoundingBoxAnnotator(thickness=thickness)
label_annotator = sv.LabelAnnotator(text_scale=text_scale, text_thickness=thickness, text_position=sv.Position.BOTTOM_CENTER)
trace_annotator = sv.TraceAnnotator(thickness=thickness, trace_length=video_info.fps * 2, position=sv.Position.BOTTOM_CENTER)

polygon_zone = sv.PolygonZone(polygon=SOURCE, frame_resolution_wh=video_info.resolution_wh)

coordinates = defaultdict(lambda: deque(maxlen=video_info.fps))

# Initialize EasyOCR reader
reader = easyocr.Reader(['en'])

# Data storage
speeding_vehicles = []

# Open target video
with sv.VideoSink(TARGET_VIDEO_PATH, video_info) as sink:
    # Loop over source video frames
    for frame in tqdm(frame_generator, total=video_info.total_frames):
        original_height, original_width = frame.shape[:2]

        result = speed_model(frame, imgsz=MODEL_RESOLUTION, verbose=False)[0]
        detections = sv.Detections.from_ultralytics(result)

        # Log all detections before any filtering
        logging.info(f"All Detections: {detections}")

        # Filter out detections by class and confidence
        detections = detections[detections.confidence > CONFIDENCE_THRESHOLD]
        detections = detections[detections.class_id != 0]

        # Log detections after filtering by confidence and class
        logging.info(f"Filtered Detections: {detections}")

        # Filter out detections outside the zone
        detections = detections[polygon_zone.trigger(detections)]

        # Refine detections using non-max suppression
        detections = detections.with_nms(IOU_THRESHOLD)

        # Pass detection through the tracker
        detections = byte_track.update_with_detections(detections=detections)

        points = detections.get_anchors_coordinates(anchor=sv.Position.BOTTOM_CENTER)

        # Normalize points to the original resolution
        normalized_points = np.array([[x / MODEL_RESOLUTION * original_width, y / MODEL_RESOLUTION * original_height] for x, y in points])

        # Debug: Print normalized points and transformed points
        logging.info(f"Normalized Points: {normalized_points}")

        # Calculate the detections position inside the target RoI
        transformed_points = view_transformer.transform_points(points=normalized_points).astype(int)

        # Debug: Print transformed points
        logging.info(f"Transformed Points: {transformed_points}")

        # Store detections position
        for tracker_id, [_, y] in zip(detections.tracker_id, transformed_points):
            coordinates[tracker_id].append(y)

        # Format labels
        labels = []
        for tracker_id in detections.tracker_id:
            if len(coordinates[tracker_id]) < video_info.fps / 2:
                labels.append(f"#{tracker_id}")
            else:
                # Calculate speed
                coordinate_start = coordinates[tracker_id][-1]
                coordinate_end = coordinates[tracker_id][0]
                distance = abs(coordinate_start - coordinate_end)
                time = len(coordinates[tracker_id]) / video_info.fps
                speed = distance / time * 3.6
                labels.append(f"#{tracker_id} {int(speed)} km/h")

                if speed > SPEED_THRESHOLD:
                    # Detect number plate
                    number_plate_result = number_plate_model(frame, imgsz=MODEL_RESOLUTION, verbose=False)[0]
                    number_plate_detections = sv.Detections.from_ultralytics(number_plate_result)
                    number_plate_detections = number_plate_detections.with_nms(IOU_THRESHOLD)

                    for np_detection in number_plate_detections.xyxy:
                        x1, y1, x2, y2 = np_detection
                        number_plate_roi = frame[int(y1):int(y2), int(x1):int(x2)]
                        number_plate_text = reader.readtext(number_plate_roi, detail=0)

                        if number_plate_text:
                            speeding_vehicles.append({"tracker_id": tracker_id, "speed": int(speed), "number_plate": number_plate_text[0].strip()})

                            # Draw bounding box for number plate
                            cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
                            cv2.putText(frame, number_plate_text[0].strip(), (int(x1), int(y1) - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
                        else:
                            logging.warning(f"No text found for tracker_id {tracker_id}.")

        # Annotate frame
        annotated_frame = frame.copy()
        annotated_frame = trace_annotator.annotate(scene=annotated_frame, detections=detections)
        annotated_frame = bounding_box_annotator.annotate(scene=annotated_frame, detections=detections)
        annotated_frame = label_annotator.annotate(scene=annotated_frame, detections=detections, labels=labels)

        # Add frame to target video
        sink.write_frame(annotated_frame)

# Save data to Excel
if speeding_vehicles:
    df = pd.DataFrame(speeding_vehicles)
    df.to_excel(EXCEL_FILE_PATH, index=False)
    logging.info(f"Speeding vehicles data saved to {EXCEL_FILE_PATH}")

else:
    logging.info("No speeding vehicles detected.")

except Exception as e:
    logging.error(f"An error occurred: {str(e)}")

finally:
    # Clean up resources
    sv.cleanup()
    logging.shutdown()

PrakharJoshi54321 commented 4 months ago

import cv2
import numpy as np
import pandas as pd
import supervision as sv
from tqdm import tqdm
from ultralytics import YOLO
from collections import defaultdict, deque
import easyocr
import logging
import os

# Setup logging

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# Configuration
CONFIDENCE_THRESHOLD = 0.5
IOU_THRESHOLD = 0.5
MODEL_NAME = "yolov8x.pt"
NUMBER_PLATE_MODEL_NAME = "epoch-68.pt"
MODEL_RESOLUTION = 1280
EXCEL_FILE_PATH = "speeding_vehicles.xlsx"
SPEED_THRESHOLD = 50  # Set the speed threshold to 50 km/hr

# Prompt for video file path
SOURCE_VIDEO_PATH = input("Enter the path to the video file (e.g., vehicles.mp4): ")

if not os.path.exists(SOURCE_VIDEO_PATH):
    logging.error("The provided video file path does not exist.")
    exit(1)

TARGET_VIDEO_PATH = f"result_{os.path.basename(SOURCE_VIDEO_PATH)}"

# Source and Target ROIs (you might need to adjust these based on your video)
SOURCE = np.array([
    [1252, 787],
    [2298, 803],
    [5039, 2159],
    [-550, 2159]
])

TARGET_WIDTH = 25
TARGET_HEIGHT = 250

TARGET = np.array([
    [0, 0],
    [TARGET_WIDTH - 1, 0],
    [TARGET_WIDTH - 1, TARGET_HEIGHT - 1],
    [0, TARGET_HEIGHT - 1],
])

# Initialize video processing
try:
    frame_generator = sv.get_video_frames_generator(source_path=SOURCE_VIDEO_PATH)
    frame_iterator = iter(frame_generator)
    frame = next(frame_iterator)

annotated_frame = frame.copy()
annotated_frame = sv.draw_polygon(scene=annotated_frame, polygon=SOURCE, color=sv.Color.RED, thickness=4)

# Transform Perspective
class ViewTransformer:
    def __init__(self, source: np.ndarray, target: np.ndarray) -> None:
        source = source.astype(np.float32)
        target = target.astype(np.float32)
        self.m = cv2.getPerspectiveTransform(source, target)

    def transform_points(self, points: np.ndarray) -> np.ndarray:
        if points.size == 0:
            return points
        reshaped_points = points.reshape(-1, 1, 2).astype(np.float32)
        transformed_points = cv2.perspectiveTransform(reshaped_points, self.m)
        return transformed_points.reshape(-1, 2)

view_transformer = ViewTransformer(source=SOURCE, target=TARGET)

# Initialize Models
speed_model = YOLO(MODEL_NAME)
number_plate_model = YOLO(NUMBER_PLATE_MODEL_NAME)

video_info = sv.VideoInfo.from_video_path(video_path=SOURCE_VIDEO_PATH)
frame_generator = sv.get_video_frames_generator(source_path=SOURCE_VIDEO_PATH)

# Tracer initiation
byte_track = sv.ByteTrack(frame_rate=video_info.fps, track_activation_threshold=CONFIDENCE_THRESHOLD)

# Annotators configuration
thickness = sv.calculate_optimal_line_thickness(resolution_wh=video_info.resolution_wh)
text_scale = sv.calculate_optimal_text_scale(resolution_wh=video_info.resolution_wh)
bounding_box_annotator = sv.BoundingBoxAnnotator(thickness=thickness)
label_annotator = sv.LabelAnnotator(text_scale=text_scale, text_thickness=thickness, text_position=sv.Position.BOTTOM_CENTER)
trace_annotator = sv.TraceAnnotator(thickness=thickness, trace_length=video_info.fps * 2, position=sv.Position.BOTTOM_CENTER)

polygon_zone = sv.PolygonZone(polygon=SOURCE, frame_resolution_wh=video_info.resolution_wh)

coordinates = defaultdict(lambda: deque(maxlen=video_info.fps))

# Initialize EasyOCR reader
reader = easyocr.Reader(['en'])

# Data storage
speeding_vehicles = []

# Open target video
with sv.VideoSink(TARGET_VIDEO_PATH, video_info) as sink:
    # Loop over source video frames
    for frame in tqdm(frame_generator, total=video_info.total_frames):
        original_height, original_width = frame.shape[:2]

        result = speed_model(frame, imgsz=MODEL_RESOLUTION, verbose=False)[0]
        detections = sv.Detections.from_ultralytics(result)

        # Log all detections before any filtering
        logging.info(f"All Detections: {detections}")

        # Filter out detections by class and confidence
        detections = detections[detections.confidence > CONFIDENCE_THRESHOLD]
        detections = detections[detections.class_id != 0]

        # Log detections after filtering by confidence and class
        logging.info(f"Filtered Detections: {detections}")

        # Filter out detections outside the zone
        detections = detections[polygon_zone.trigger(detections)]

        # Refine detections using non-max suppression
        detections = detections.with_nms(IOU_THRESHOLD)

        # Pass detection through the tracker
        detections = byte_track.update_with_detections(detections=detections)

        points = detections.get_anchors_coordinates(anchor=sv.Position.BOTTOM_CENTER)

        # Normalize points to the original resolution
        normalized_points = np.array([[x / MODEL_RESOLUTION * original_width, y / MODEL_RESOLUTION * original_height] for x, y in points])

        # Debug: Print normalized points and transformed points
        logging.info(f"Normalized Points: {normalized_points}")

        # Calculate the detections position inside the target RoI
        transformed_points = view_transformer.transform_points(points=normalized_points).astype(int)

        # Debug: Print transformed points
        logging.info(f"Transformed Points: {transformed_points}")

        # Store detections position
        for tracker_id, [_, y] in zip(detections.tracker_id, transformed_points):
            coordinates[tracker_id].append(y)

        # Format labels
        labels = []
        for tracker_id in detections.tracker_id:
            if len(coordinates[tracker_id]) < video_info.fps / 2:
                labels.append(f"#{tracker_id}")
            else:
                # Calculate speed
                coordinate_start = coordinates[tracker_id][-1]
                coordinate_end = coordinates[tracker_id][0]
                distance = abs(coordinate_start - coordinate_end)
                time = len(coordinates[tracker_id]) / video_info.fps
                speed = distance / time * 3.6
                labels.append(f"#{tracker_id} {int(speed)} km/h")

                if speed > SPEED_THRESHOLD:
                    # Detect number plate
                    number_plate_result = number_plate_model(frame, imgsz=MODEL_RESOLUTION, verbose=False)[0]
                    number_plate_detections = sv.Detections.from_ultralytics(number_plate_result)
                    number_plate_detections = number_plate_detections.with_nms(IOU_THRESHOLD)

                    for np_detection in number_plate_detections.xyxy:
                        x1, y1, x2, y2 = np_detection
                        number_plate_roi = frame[int(y1):int(y2), int(x1):int(x2)]
                        number_plate_text = reader.readtext(number_plate_roi, detail=0)

                        if number_plate_text:
                            speeding_vehicles.append({"tracker_id": tracker_id, "speed": int(speed), "number_plate": number_plate_text[0].strip()})

                            # Draw bounding box for number plate
                            cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
                            cv2.putText(frame, number_plate_text[0].strip(), (int(x1), int(y1) - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
                        else:
                            logging.warning(f"No text found for tracker_id {tracker_id}.")

        # Annotate frame
        annotated_frame = frame.copy()
        annotated_frame = trace_annotator.annotate(scene=annotated_frame, detections=detections)
        annotated_frame = bounding_box_annotator.annotate(scene=annotated_frame, detections=detections)
        annotated_frame = label_annotator.annotate(scene=annotated_frame, detections=detections, labels=labels)

        # Add frame to target video
        sink.write_frame(annotated_frame)

# Save data to Excel
if speeding_vehicles:
    df = pd.DataFrame(speeding_vehicles)
    df.to_excel(EXCEL_FILE_PATH, index=False)
    logging.info(f"Speeding vehicles data saved to {EXCEL_FILE_PATH}")

else:
    logging.info("No speeding vehicles detected.")

except Exception as e:
    logging.error(f"An error occurred: {str(e)}")

finally:
    # Ensure cleanup of resources
    logging.shutdown()

2.mp4 has resolution of 3840 *2160, frame rate 25 and it is detecting the speed and numberplate properly this one is giving results like this ]), tracker_id=None, data={'class_name': array(['car', 'car', 'car', 'car', 'truck'], dtype='<U5')}) 2024-07-03 22:46:50,771 - INFO - Filtered Detections: Detections(xyxy=array([[ 2422.8, 977.47, 2581, 1101.4], [ 1410.5, 801.15, 1499, 862.52], [ 2137.2, 796.05, 2236.8, 877.79], [ 1585.7, 667.42, 1635.8, 708.55], [ 1466.2, 613.03, 1550.4, 719.94]], dtype=float32), mask=None, confidence=array([ 0.88346, 0.87683, 0.85949, 0.84482, 0.76997], dtype=float32), class_id=array([2, 2, 2, 2, 7]), tracker_id=None, data={'class_name': array(['car', 'car', 'car', 'car', 'truck'], dtype='<U5')}) 2024-07-03 22:46:50,771 - INFO - Normalized Points: [[ 7505.8 1858.6] [ 4364.2 1455.5] [ 6561.1 1481.3]] 2024-07-03 22:46:50,771 - INFO - Transformed Points: [[ 41 235] [ 29 205] [ 47 206]] 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 107/107 [00:29<00:00, 3.66it/s] 2024-07-03 22:46:51,023 - INFO - Speeding vehicles data saved to speeding_vehicles.xlsx

but another video 1.mp4 has resolution of 1280*720, frame rate 30 and it is not detecting the speed and numberplate it is giving results like this - 024-07-03 22:50:11,478 - INFO - All Detections: Detections(xyxy=array([[ 247.43, 570.35, 512.69, 717.66], [ 325.38, 466.82, 563.46, 654.06], [ 427.09, 264.71, 577.92, 399.99], [ 776.44, 444.43, 1010.3, 632.11], [ 718.26, 389.35, 903.3, 532.73], [ 348.89, 378.71, 538.21, 513.57], [ 785.41, 627.14, 1061.5, 718.41], [ 112.22, 206.52, 207.1, 280.28], [ 0.27941, 228.17, 98.349, 346.19], [ 729.38, 227.62, 941.39, 444.42], [ 405.64, 335.83, 541.96, 433.88], [ 870.27, 216.03, 962.63, 297.65], [ 196.72, 195.1, 261.85, 255.37], [ 708.65, 234.33, 764.89, 324.18], [ 143.25, 199, 218.81, 268.68], [ 243.09, 194.55, 305.4, 246.48], [ 508.81, 206.41, 593.96, 285.57], [ 691.46, 210.53, 772.8, 282.73], [ 0.20903, 204.03, 37.905, 231.13], [ 301.99, 199.16, 348.34, 239.21], [ 837.92, 656.13, 904.92, 707.41], [ 536.38, 144.38, 632.02, 266.03], [ 530.02, 199.7, 606.45, 271.52], [ 469.02, 244.58, 576.86, 303.3], [ 460.98, 225.87, 567.75, 269.99], [ 671.09, 172.69, 755.08, 260.34]], dtype=float32), mask=None, confidence=array([ 0.95283, 0.93721, 0.92874, 0.928, 0.9131, 0.90814, 0.90619, 0.9044, 0.89841, 0.89789, 0.8941, 0.86631, 0.83284, 0.80641, 0.78491, 0.73674, 0.73258, 0.63325, 0.56025, 0.54639, 0.49325, 0.48808, 0.43113, 0.36193, 0.35472, 0.26218], dtype=float32), class_id=array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 7, 2, 2, 2, 2]), tracker_id=None, data={'class_name': array(['car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'person', 'truck', 'car', 'car', 'car', 'car'], dtype='<U6')}) 2024-07-03 22:50:11,478 - INFO - Filtered Detections: Detections(xyxy=array([[ 247.43, 570.35, 512.69, 717.66], [ 325.38, 466.82, 563.46, 654.06], [ 427.09, 264.71, 577.92, 399.99], [ 776.44, 444.43, 1010.3, 632.11], [ 718.26, 389.35, 903.3, 532.73], [ 348.89, 378.71, 538.21, 513.57], [ 785.41, 627.14, 1061.5, 718.41], [ 112.22, 206.52, 207.1, 280.28], [ 0.27941, 228.17, 98.349, 346.19], [ 729.38, 227.62, 941.39, 444.42], [ 405.64, 335.83, 541.96, 433.88], [ 870.27, 216.03, 962.63, 297.65], [ 196.72, 195.1, 261.85, 255.37], [ 708.65, 234.33, 764.89, 324.18], [ 143.25, 199, 218.81, 268.68], [ 243.09, 194.55, 305.4, 246.48], [ 508.81, 206.41, 593.96, 285.57], [ 691.46, 210.53, 772.8, 282.73], [ 0.20903, 204.03, 37.905, 231.13], [ 301.99, 199.16, 348.34, 239.21]], dtype=float32), mask=None, confidence=array([ 0.95283, 0.93721, 0.92874, 0.928, 0.9131, 0.90814, 0.90619, 0.9044, 0.89841, 0.89789, 0.8941, 0.86631, 0.83284, 0.80641, 0.78491, 0.73674, 0.73258, 0.63325, 0.56025, 0.54639], dtype=float32), class_id=array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]), tracker_id=None, data={'class_name': array(['car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car'], dtype='<U6')}) 2024-07-03 22:50:11,478 - INFO - Normalized Points: [] 2024-07-03 22:50:11,478 - INFO - Transformed Points: [] 
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 305/306 [00:38<00:00, 9.15it/s]2024-07-03 22:50:11,595 - INFO - All Detections: Detections(xyxy=array([[ 245.77, 573.39, 512.11, 717.74], [ 324.3, 468.73, 563.41, 658.62], [ 777.09, 446.95, 1011.7, 634.13], [ 426.77, 266.77, 578.22, 400.58], [ 718.29, 390.87, 901.12, 533.96], [ 347.49, 380.33, 537.89, 515.3], [ 113.4, 206.93, 208.39, 279.93], [ 405.18, 336.22, 541.66, 435.47], [ 786.09, 631.33, 1063.1, 717.89], [ 729.44, 228.81, 941.94, 446.08], [ 870.9, 212.28, 962.69, 298.04], [ 197.61, 195.13, 262.76, 255.2], [ 707.59, 235.35, 767.69, 324.69], [ 0.27886, 227.35, 99.845, 344.89], [ 243.97, 194.03, 306.83, 246.59], [ 144.97, 199.38, 219.88, 267.91], [ 508.17, 206.84, 592.81, 286.29], [ 691.88, 210.77, 773.45, 282.91], [ 0.20464, 203.87, 39.899, 230.11], [ 533.53, 144.28, 631.6, 266.56], [ 303.5, 200.21, 349.09, 238.58], [ 844.36, 660.23, 909.59, 709.73], [ 468.38, 245.46, 577.14, 306.63], [ 532.1, 144.65, 631.6, 267.35], [ 460.96, 226.94, 567.96, 273.12], [ 530.99, 199.89, 605.41, 271.32], [ 671.87, 176.23, 755.12, 260.54], [ 293.21, 595.13, 365.69, 643.76]], dtype=float32), mask=None, confidence=array([ 0.95099, 0.93088, 0.92644, 0.92561, 0.91557, 0.91154, 0.90844, 0.9024, 0.89078, 0.8701, 0.86753, 0.83458, 0.81764, 0.78634, 0.77181, 0.75982, 0.73374, 0.6944, 0.58112, 0.45234, 0.40249, 0.40003, 0.39438, 0.37744, 0.37395, 0.3629, 0.3297, 0.27324], dtype=float32), class_id=array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 7, 2, 0, 2, 5, 2, 2, 2, 0]), tracker_id=None, data={'class_name': array(['car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'truck', 'car', 'person', 'car', 'bus', 'car', 'car', 'car', 'person'], dtype='<U6')}) 2024-07-03 22:50:11,595 - INFO - Filtered Detections: Detections(xyxy=array([[ 245.77, 573.39, 512.11, 717.74], [ 324.3, 468.73, 563.41, 658.62], [ 777.09, 446.95, 1011.7, 634.13], [ 426.77, 266.77, 578.22, 400.58], [ 718.29, 390.87, 901.12, 533.96], [ 347.49, 380.33, 537.89, 515.3], [ 113.4, 206.93, 208.39, 279.93], [ 405.18, 336.22, 541.66, 435.47], [ 786.09, 631.33, 1063.1, 717.89], [ 729.44, 228.81, 941.94, 446.08], [ 870.9, 212.28, 962.69, 298.04], [ 197.61, 195.13, 262.76, 255.2], [ 707.59, 235.35, 767.69, 324.69], [ 0.27886, 227.35, 99.845, 344.89], [ 243.97, 194.03, 306.83, 246.59], [ 144.97, 199.38, 219.88, 267.91], [ 508.17, 206.84, 592.81, 286.29], [ 691.88, 210.77, 773.45, 282.91], [ 0.20464, 203.87, 39.899, 230.11]], dtype=float32), mask=None, confidence=array([ 0.95099, 0.93088, 0.92644, 0.92561, 0.91557, 0.91154, 0.90844, 0.9024, 0.89078, 0.8701, 0.86753, 0.83458, 0.81764, 0.78634, 0.77181, 0.75982, 0.73374, 0.6944, 0.58112], dtype=float32), class_id=array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]), tracker_id=None, data={'class_name': array(['car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car', 'car'], dtype='<U6')}) 2024-07-03 
22:50:11,595 - INFO - Normalized Points: [] 2024-07-03 22:50:11,595 - INFO - Transformed Points: [] 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 306/306 [00:38<00:00, 7.99it/s] 2024-07-03 22:50:11,609 - INFO - No speeding vehicles detected.

pderrenger commented 4 months ago

Hello @PrakharJoshi54321,

Thank you for sharing your detailed code and observations. It’s great to see the effort you’ve put into your project! Let's address the issues you're encountering with different video resolutions and frame rates.

Importance of a Reproducible Example

To better diagnose and resolve the issue, it would be extremely helpful if you could provide a minimum reproducible example. This allows us to replicate the issue on our end and offer a more accurate solution. You can refer to our Minimum Reproducible Example Guide for more details on how to create one.

Verify Package Versions

Please ensure you are using the latest versions of torch, ultralytics, and hub-sdk. Sometimes, issues are resolved in newer releases, so updating your packages might help. You can update them using the following commands:

pip install --upgrade torch ultralytics hub-sdk

Addressing the Issue

It appears that the difference in video resolution and frame rate might be affecting the detection and tracking performance. Here are a few suggestions to help improve the consistency of your results:

  1. Normalization of Points: Ensure that the points are correctly normalized to the original resolution. This step is crucial for accurate speed estimation and bounding box drawing.

  2. Adjusting Model Resolution: The model resolution (MODEL_RESOLUTION) might need to be adjusted based on the input video resolution. For lower resolution videos, you might want to reduce the model resolution to avoid losing details.

  3. Debugging and Logging: Continue to use logging to debug and understand the behavior of your detections. This will help identify any discrepancies between different video inputs.

Example Code Adjustments

Here’s an example of how you might adjust the normalization and logging to better handle different video resolutions:

# Normalize points to the original resolution
normalized_points = np.array([[x / MODEL_RESOLUTION * original_width, y / MODEL_RESOLUTION * original_height] for x, y in points])

# Debug: Print normalized points and transformed points
logging.info(f"Normalized Points: {normalized_points}")

# Calculate the detections position inside the target RoI
transformed_points = view_transformer.transform_points(points=normalized_points).astype(int)

# Debug: Print transformed points
logging.info(f"Transformed Points: {transformed_points}")

# Store detections position
for tracker_id, [_, y] in zip(detections.tracker_id, transformed_points):
    coordinates[tracker_id].append(y)
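
One thing worth checking alongside the normalization: the hard-coded SOURCE polygon uses coordinates such as (5039, 2159), which lie outside a 1280x720 frame, so the polygon zone can end up filtering out every detection for the lower-resolution video. A minimal sketch of rescaling the region, assuming the points were originally drawn on a 3840x2160 reference frame (that reference size is an assumption to adjust for your footage):

# Hypothetical rescaling of the region of interest to the current video resolution
REFERENCE_WH = (3840, 2160)  # resolution the SOURCE points were drawn on (assumption)
scale_x = video_info.resolution_wh[0] / REFERENCE_WH[0]
scale_y = video_info.resolution_wh[1] / REFERENCE_WH[1]
SOURCE_SCALED = (SOURCE * np.array([scale_x, scale_y])).astype(int)
polygon_zone = sv.PolygonZone(polygon=SOURCE_SCALED, frame_resolution_wh=video_info.resolution_wh)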

Handling Different Video Resolutions

Ensure that your code dynamically adjusts to different video resolutions. Here’s an example of how you might handle this:

# Adjust model resolution based on input video resolution
if original_width <= 1280:
    MODEL_RESOLUTION = 640
elif original_width <= 1920:
    MODEL_RESOLUTION = 960
else:
    MODEL_RESOLUTION = 1280

# Initialize models (MODEL_RESOLUTION is applied at inference time via the imgsz argument, not here)
speed_model = YOLO(MODEL_NAME)
number_plate_model = YOLO(NUMBER_PLATE_MODEL_NAME)
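
The adjusted MODEL_RESOLUTION then takes effect where you already pass imgsz in the inference call, for example:

result = speed_model(frame, imgsz=MODEL_RESOLUTION, verbose=False)[0]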

Conclusion

We hope these suggestions help improve the consistency of your results across different video resolutions and frame rates. If you have any further questions or need additional assistance, please let us know. The YOLO community and the Ultralytics team are here to support you! 😊

github-actions[bot] commented 3 months ago

πŸ‘‹ Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO πŸš€ and Vision AI ⭐