python_image API vs Daknet Detector

JayantGoel001 commented 2 years ago

Hello @AlexeyAB @cenit @elias-work @agjunyent, I was trying to run the yolov4-csp-swish model in Colab using both the Python API and the Darknet detector. I saw that the accuracy of the Python API is not as good as that of the Darknet detector. This is the image I used for inference.

The darknet detector gave me these results:

The command I ran: !./darknet detector test cfg/coco.data cfg/yolov4-csp-swish.cfg yolov4-csp-swish.weights ../1.png --dont_show --thresh 0.35

With the same model and everything else unchanged, the Python API gave me these results:

The command I ran: !python darknet_images.py --input ../1.png --batch_size=12 --weights yolov4-csp-swish.weights --dont_show --config_file cfg/yolov4-csp-swish.cfg --save_labels --ext_output --thresh 0.35

Can you please fix this issue, or is there something I am doing wrong?

haviduck commented 2 years ago

Are CUDA and cuDNN utilized on the Python side?

JayantGoel001 commented 2 years ago

> Are CUDA and cuDNN utilized on the Python side?

Hello @haviduck, I think yes, because we build it using the "make" command, and there is also no drop in speed.

akashAD98 commented 2 years ago

@JayantGoel001 can you please share the source code for inference? I'm using the yolov4-CSP/mish weights. Thanks in advance.

ghm666 commented 2 years ago

In the code, I think the input image of the two methods is different.

JayantGoel001 commented 2 years ago

> @JayantGoel001 can you please share the source code for inference? I'm using the yolov4-CSP/mish weights. Thanks in advance.

Hey @akashAD98, here is the code for inference using cmd, Jupyter, or Colab: https://colab.research.google.com/drive/12QusaaRj_lUwCGDvQNfICpa7kA7_a2dE

And here is the code for the Python API: https://github.com/AlexeyAB/darknet/blob/master/darknet_images.py

JayantGoel001 commented 2 years ago

> In the code, I think the input image of the two methods is different.

Hi @ghm666, actually no, the input image is the same. Try it with the above image and see the difference.

akashAD98 commented 2 years ago

@JayantGoel001 Thanks Jayant! For the Python API I need to install darknet, right (should I do inference using this)? Have you tried OpenCV-Python code for yolov4x-mish/CSP? I want to do inference using the Scaled-YOLOv4 model.

JayantGoel001 commented 2 years ago

> @JayantGoel001 for the Python API I need to install darknet, right? Have you tried OpenCV-Python code for yolov4x-mish/CSP?

Yes, @akashAD98, you would need to install darknet (also set the LIBSO parameter to 1). https://github.com/AlexeyAB/darknet/issues/7854 Well, I tried yolov4-csp with the OpenCV dnn method, but NMS doesn't seem to work with mish. So OpenCV probably won't work with yolov4x-mish, but I am not sure.

akashAD98 commented 2 years ago

@JayantGoel001 is there any other method to do inference using a Python script? Can you share that opencv-yolov4 inference script so I can try it?

JayantGoel001 commented 2 years ago

> @JayantGoel001 Thanks Jayant! For the Python API I need to install darknet, right (should I do inference using this)? Have you tried OpenCV-Python code for yolov4x-mish/CSP? I want to do inference using the Scaled-YOLOv4 model.

Well, it depends on what you want to do. You can use cmd or a Colab notebook for inference, but predicting a single image that way takes about 5 seconds. If you modify some code in the .py file, you can build the network once and use it for multiple images, which gives better speed.
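The "build the network once" idea can be sketched as below. This is only an illustration of the pattern, not code from darknet_images.py: `detect_many`, `load_network` and `detect` are hypothetical names, and the two callables stand in for whatever loader and per-image detection function you actually use.

```python
from typing import Any, Callable, Iterable, List, Tuple


def detect_many(
    image_paths: Iterable[str],
    cfg_path: str,
    weights_path: str,
    load_network: Callable[[str, str], Any],
    detect: Callable[[Any, str], Any],
) -> List[Tuple[str, Any]]:
    """Load the network a single time, then run detection on every image."""
    # The expensive step (parsing the cfg, reading the weights) happens once.
    network = load_network(cfg_path, weights_path)
    # Each image reuses the already-built network.
    return [(path, detect(network, path)) for path in image_paths]
```

The cost of loading is then paid once per run instead of once per image, which is where the speed difference described above comes from.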

JayantGoel001 commented 2 years ago

> @JayantGoel001 is there any other method to do inference using a Python script?

Another method is to use this, but its accuracy is somewhat lower than the original's. https://github.com/hhk7734/tensorflow-yolov4

JayantGoel001 commented 2 years ago

> @JayantGoel001 is there any other method to do inference using a Python script? Can you share that opencv-yolov4 inference script so I can try it?

Sorry, but I don't have one right now. You can try this one: https://gist.github.com/YashasSamaga/e2b19a6807a13046e399f4bc3cca3a49

akashAD98 commented 2 years ago

> Another method is to use this, but its accuracy is somewhat lower than the original's. https://github.com/hhk7734/tensorflow-yolov4

Scaled-YOLOv4 is not supported, right? I mean at the time of converting the YOLO weights to TensorFlow.

JayantGoel001 commented 2 years ago

> Scaled-YOLOv4 is not supported, right? I mean at the time of converting the YOLO weights to TensorFlow.

You don't need to convert the weights to TF. This Python package just takes the yolov4-csp weights.

akashAD98 commented 2 years ago

@JayantGoel001 Oh, that's great! Thanks a lot for your kind help. I will try it and update you.

akashAD98 commented 2 years ago

> Sorry, but I don't have one right now. You can try this one: https://gist.github.com/YashasSamaga/e2b19a6807a13046e399f4bc3cca3a49

@JayantGoel001

error: OpenCV(4.1.2) /io/opencv/modules/dnn/src/darknet/darknet_io.cpp:554: error: (-212:Parsing error) Unsupported activation: mish in function 'ReadDarknetFromCfgStream'

I'm using the pretrained yolov4.cfg and weights and still getting this issue (yolov4.cfg doesn't have the mish activation function). Which OpenCV version are you using?

JayantGoel001 commented 2 years ago

> I'm using the pretrained yolov4.cfg and weights and still getting this issue (yolov4.cfg doesn't have the mish activation function). Which OpenCV version are you using?

Can you please share the code? I might be able to resolve the issue.

akashAD98 commented 2 years ago

It's the same code that you shared with me; I just changed the cfg and weights (I'm running this on Google Colab). yolov4_opencv.txt

JayantGoel001 commented 2 years ago

> It's the same code that you shared with me; I just changed the cfg and weights (I'm running this on Google Colab). yolov4_opencv.txt

Check the cfg and weights files once more, because I think the issue is the Mish activation function, which is used by Scaled-YOLOv4. Or try this: https://github.com/chineseocr/opencv-for-darknet It might be due to OpenCV. My recommendation is to run it on Colab or Linux instead of Windows, for ease.

akashAD98 commented 2 years ago

I tried this https://github.com/hhk7734/tensorflow-yolov4 but don't know how to run inference.

import cv2

from yolov4.tf import YOLOv4

yolo = YOLOv4()

yolo.config.parse_names("/content/drive/MyDrive/yolov4/coco.names")
yolo.config.parse_cfg("/content/drive/MyDrive/yolov4/yolov4x-mish.cfg")

yolo.make_model()
yolo.load_weights("/content/drive/MyDrive/yolov4/yolov4x-mish.weights", weights_type="yolo")
yolo.summary(summary_type="yolo")
yolo.summary()

yolo.inference(media_path="/content/tensorflow-yolov4/test/kite.jpg")

yolo.inference(media_path="/content/tensorflow-yolov4/test/road.mp4", is_image=False)

yolo.inference(
    "/content/drive/MyDrive/yolov4",
    is_image=False,
    cv_apiPreference=cv2.CAP_V4L2,
    cv_frame_size=(640, 480),
    cv_fourcc="YUYV",
)

I'm getting the YOLO layers summary, but how do I run inference? Can you please help me here? My Google Colab notebook: https://colab.research.google.com/drive/1YzivoL_JV-xlWTid6j9Zbsw67xtr7Lqk?usp=sharing

@JayantGoel001

JayantGoel001 commented 2 years ago

Use this for inference:

import json
import os
import time

import cv2

# media_path  ---> type = string
#                  Full path to the input image.
# prob_thresh ---> type = float
#                  Minimum probability for a detection to be kept.
# Note: `yolo` (the loaded model) and `names` (the list of class names)
# must already be defined at module level.
def inference(media_path, is_image: bool = True, cv_apiPreference=None, cv_frame_size: tuple = None, cv_fourcc: str = None, cv_waitKey_delay: int = 1, prob_thresh=0.5):
    if is_image:
        # Reading and decoding the image
        frame = cv2.imread(media_path)
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        # Predicting the output of the yolov4 model.
        start_time = time.time()
        bboxes = yolo.predict(frame, prob_thresh=prob_thresh)
        json_format = {}

        # Creating the JSON output, grouped by class name.
        for box in bboxes:
            name = names[int(box[4])]
            json_format.setdefault(name, []).append({
                "Coordinates": [float(v) for v in box[:4]],
                "Probability": float(box[5])
            })

        # Saving the JSON output next to the input image.
        base, _ = os.path.splitext(media_path)
        with open(base + ".json", "w") as f:
            json.dump(json_format, f)
        print(json_format)
        exec_time = time.time() - start_time
        print("time: {:.2f} ms".format(exec_time * 1000))

        # Converting back to BGR before drawing and saving.
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        # Drawing the boxes over the image.
        image = yolo.draw_bboxes(frame, bboxes)
        # Prefix the file name, not the whole path, so the output path stays valid.
        out_path = os.path.join(os.path.dirname(media_path), "result_" + os.path.basename(media_path))
        cv2.imwrite(out_path, image)

akashAD98 commented 2 years ago

@JayantGoel001
I'm getting this error. Should I pass coco.names, or a list with all the COCO names, like names=[person, cat, dog, ...]?

JayantGoel001 commented 2 years ago

> @JayantGoel001 I'm getting this error.

Use this code to get the names of all the objects:

# Loading the names of the classes into a list.
# Stripping each line also handles a last line without a trailing newline.
with open("coco.names", "r") as f:
    names = [line.strip() for line in f if line.strip()]

print(names)

akashAD98 commented 2 years ago

@JayantGoel001 Thanks, it works perfectly fine for a single image. But when I try this for video, it gives me an error: error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

haviduck commented 2 years ago

> @JayantGoel001 Thanks, it works perfectly fine for a single image. But when I try this for video, it gives me an error: error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

It'll throw that on corrupt or blank frames. Usually the very first frame does this. You can do something like
if not ret or frame is None: continue

akashAD98 commented 2 years ago

def inference_video(media_path, is_image: bool = True, cv_apiPreference=None, cv_frame_size: tuple = None, cv_fourcc: str = None, cv_waitKey_delay: int = 1, prob_thresh=0.5):
    if is_image:
        video = cv2.VideoCapture(media_path)

        # We need to check if camera
        # is opened previously or not
        if (video.isOpened() == False):
            print("Error reading video file")

        # We need to set resolutions.
        # so, convert them from float to integer.
        frame_width = int(video.get(3))
        frame_height = int(video.get(4))

        size = (frame_width, frame_height)
        result = cv2.VideoWriter('filename.avi',
                                 cv2.VideoWriter_fourcc(*'MJPG'),
                                 10, size)
        while(True):
            ret, frame = video.read()
            if ret == True:

                start_time = time.time()
                bboxes = yolo.predict(video, prob_thresh=prob_thresh)
                json_format = {}

                # Creating The Json Format Output.
                for box in bboxes:
                    if names[int(box[4])] not in json_format:
                        json_format[names[int(box[4])]] = []

                    json_format[names[int(box[4])]].append({
                        "Coordinates": list(box[:4]),
                        "Probability": box[5]
                    })

                # Saving The JSON format
                with open(media_path.split(".")[0] + ".json", 'w+') as f:
                    f.write(str(json_format))
                print(json_format)
                exec_time = time.time() - start_time
                print("time: {:.2f} ms".format(exec_time * 1000))

                image = yolo.draw_bboxes(video, bboxes)
                result.write(image)
                cv2_imshow(image)
                if cv2.waitKey(1) & 0xFF == ord('s'):
                    break

        video.release()
        result.release()

        cv2.destroyAllWindows()

        print("The video was successfully saved")

This is how I was trying to do it, but I'm getting this error @JayantGoel001 @haviduck. My Google Colab notebook: https://colab.research.google.com/drive/1pvOTA847PdmWVzNWQ9FGgWFQE39zNZ3T?usp=sharing

AttributeError                            Traceback (most recent call last)
in ()
----> 1 inference_video("/content/tensorflow-yolov4/test/road.mp4",prob_thresh=0.50)

1 frames
/usr/local/lib/python3.7/dist-packages/yolov4/tf/__init__.py in predict(self, frame, prob_thresh)
    114         """
    115         # image_data == Dim(1, input_size[1], input_size[0], channels)
--> 116         height, width, _ = frame.shape
    117
    118         image_data = self.resize_image(frame)

AttributeError: 'cv2.VideoCapture' object has no attribute 'shape'

This is the last query I have; I hope you can help me. Thanks again for your help.

haviduck commented 2 years ago

Yeah, you are using the VideoCapture object and not each frame. Put your frame check inside your while loop, change the calls to use frame, and while you're at it, skip the if and do while video.isOpened() :)

akashAD98 commented 2 years ago

@haviduck ok got it. Thanks

soltkreig commented 2 years ago

Did you find the reason why the output is different? I have the same trouble.

JayantGoel001 commented 2 years ago

> Did you find the reason why the output is different? I have the same trouble.

Hey @soltkreig, no, I haven't been able to figure it out.

haviduck commented 2 years ago

Remove all the clutter and get the basics working before you add recording etc. This is more of an OpenCV situation than a darknet one. Something like this:

def inference_video(media_path, prob_thresh=0.5):
    video = cv2.VideoCapture(media_path)
    # VideoWriter needs the frame size as integers.
    frame_width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
    size = (frame_width, frame_height)
    result = cv2.VideoWriter("filename.avi", cv2.VideoWriter_fourcc(*'MJPG'), 10, size)
    while video.isOpened():
        ret, frame = video.read()
        # ret is False once no more frames can be read.
        if not ret:
            break
        # start_time = time.time()
        bboxes = yolo.predict(video, prob_thresh=prob_thresh)
        image = yolo.draw_bboxes(video, bboxes)
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
        result.write(image)
        cv2.imshow("result", image)

        if cv2.waitKey(1) & 0xFF == ord('s'):
            break

    video.release()
    result.release()
    cv2.destroyAllWindows()

    print("The video was successfully saved")

akashAD98 commented 2 years ago

I tried this and I'm still getting the same error: AttributeError: 'cv2.VideoCapture' object has no attribute 'shape'

Please check my Google Colab: https://colab.research.google.com/drive/1pvOTA847PdmWVzNWQ9FGgWFQE39zNZ3T?usp=sharing

JayantGoel001 commented 2 years ago

Lines 32 and 33 should look like this:
bboxes = yolo.predict(frame, prob_thresh=prob_thresh)
image = yolo.draw_bboxes(frame, bboxes)

A video consists of many frames; these frames are the images that you need to pass to the predict function.
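The point about passing each frame rather than the capture object can be sketched as a small loop. This is an illustration, not the notebook's code: `run_on_frames` is a hypothetical helper, `capture` is assumed to be any object whose `read()` returns `(ok, frame)` the way `cv2.VideoCapture.read()` does, and `predict` is assumed to accept `(frame, prob_thresh=...)` like the `yolo.predict` calls in this thread.

```python
def run_on_frames(capture, predict, prob_thresh=0.5):
    """Run `predict` on every frame the capture yields and collect the results."""
    results = []
    while True:
        ok, frame = capture.read()
        # Stop at the end of the stream or on an unreadable frame.
        if not ok or frame is None:
            break
        # The frame (an image) goes to the detector, never the capture object.
        results.append(predict(frame, prob_thresh=prob_thresh))
    return results
```

With OpenCV this would be called along the lines of `run_on_frames(cv2.VideoCapture(path), yolo.predict)`.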

akashAD98 commented 2 years ago

Here is the final script for doing inference on images and video (Scaled-YOLOv4: yolov4x-mish/csp): https://colab.research.google.com/drive/1pvOTA847PdmWVzNWQ9FGgWFQE39zNZ3T?usp=sharing

haviduck commented 2 years ago

By the looks of it, you should comment out the RGB2BGR conversion to keep the colors correct.

akashAD98 commented 2 years ago

@haviduck @JayantGoel001 can we use the same script for object tracking, e.g. Deep SORT? Has anybody tried this?

JayantGoel001 commented 2 years ago

> @haviduck @JayantGoel001 can we use the same script for object tracking, e.g. Deep SORT? Has anybody tried this?

Hello @akashAD98, I think you can use it for object tracking. If I am not wrong, object tracking means checking whether an object is present in a frame or not.

If so, then I guess yes: you just need to convert the result, which you get as a JSON file, into whatever format your use case requires.

haviduck commented 2 years ago

You can implement object tracking for sure. Deep SORT requires you to run it through TensorFlow (or PyTorch), while SORT, the original SORT repo, is pretty much plug and play. You send it a list of detections, and it returns the tracks with IDs appended. I use it all the time, and it's fine unless you need heavy re-identification and the like.

akashAD98 commented 2 years ago

@haviduck is it possible for you to share the object tracking script using these mish weights? Thanks for your help.

haviduck commented 2 years ago

> @haviduck is it possible for you to share the object tracking script using these mish weights? Thanks for your help.

If you go for SORT, it's really quite straightforward: https://github.com/abewley/sort

from sort import *

#create instance of SORT
mot_tracker = Sort() 

# get detections
...

# update SORT
track_bbs_ids = mot_tracker.update(detections)

# track_bbs_ids is a np array where each row contains a valid bounding box and track_id (last column)

The only change I've made is to rewrite it to return the class ID, plus some typecasting. Other than that, this is pretty much what most trackers look like.

Norfair has a v4 demo. That one is also pretty easy to implement. If you try it yourself and reply when you get stuck, maybe someone will help out, but my implementation code won't fit you, and I'm not going to write it for you, sorry. Hope this helps some :)
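For the SORT route, the detections have to be reshaped before they go into `mot_tracker.update`. The sketch below is hedged on two points: it assumes each row from `yolo.predict` is `(x_center, y_center, width, height, class_id, probability)` with coordinates normalised to 0..1 (check this against your version of the package), and it targets the `[x1, y1, x2, y2, score]` row layout described in the SORT repository. `to_sort_detections` is a hypothetical helper name.

```python
def to_sort_detections(bboxes, frame_width, frame_height):
    """Convert normalised centre-format boxes into [x1, y1, x2, y2, score] pixel rows."""
    rows = []
    for x, y, w, h, _class_id, prob in bboxes:
        # Centre/size (assumed normalised) to corner coordinates in pixels.
        x1 = (x - w / 2) * frame_width
        y1 = (y - h / 2) * frame_height
        x2 = (x + w / 2) * frame_width
        y2 = (y + h / 2) * frame_height
        rows.append([float(x1), float(y1), float(x2), float(y2), float(prob)])
    return rows
```

The rows would then be wrapped in a NumPy array, e.g. `mot_tracker.update(np.array(rows))`; for a frame without detections the SORT docstring suggests passing `np.empty((0, 5))`. The class ID is dropped here, which matches the remark above that returning it needs a change on the tracker side.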

akashAD98 commented 2 years ago

@haviduck Actually, I was planning to plug this script https://colab.research.google.com/drive/1pvOTA847PdmWVzNWQ9FGgWFQE39zNZ3T?usp=sharing

(but I'm not able to extract the four values (boxes, scores, classes, nums) from yolo.predict)

into this Deep SORT tracker code https://github.com/emasterclassacademy/Single-Multiple-Custom-Object-Detection-and-Tracking.git

Can someone help me extract these values and pass them to the tracker?
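One possible way to pull those four values apart is sketched below. It assumes, as the inference code earlier in this thread does, that each row returned by `yolo.predict` holds four box coordinates, then the class ID at index 4 and the probability at index 5. `split_predictions` is a hypothetical helper, and the box convention the linked tracker expects has not been checked here, so the coordinates may still need converting.

```python
def split_predictions(bboxes):
    """Split prediction rows into (boxes, scores, classes, nums)."""
    boxes = [[float(v) for v in row[:4]] for row in bboxes]   # first four values per row
    scores = [float(row[5]) for row in bboxes]                # probability
    classes = [int(row[4]) for row in bboxes]                 # class ID
    nums = len(boxes)                                         # number of detections
    return boxes, scores, classes, nums
```

Called as `boxes, scores, classes, nums = split_predictions(yolo.predict(frame, prob_thresh=0.5))`, this returns plain Python lists that can be converted to arrays if the tracker needs them.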


AlexeyAB / darknet

python_image API vs Daknet Detector #7938