openvinotoolkit / openvino_notebooks

📚 Jupyter notebook tutorials for OpenVINO™

Object removed does not work in automated checkout #1754

Closed: Abhijeet241093 closed this issue 1 week ago

Abhijeet241093 commented 6 months ago

Describe the bug

If we define the zone from the capture frame of the vending machine's top camera, the object can end up placed outside the zone boundary. The scenario looks like this: Screenshot from 2024-02-26 14-36-43

In this case, the object sits on the other side of the zone boundary.

Expected behavior: it should detect that the bottle was removed.

Screenshots / Error: it does not detect the bottle removal at all. How can this be solved?

Screenshot from 2024-02-26 14-38-47

Installation instructions (Please mark the checkbox) [ ] I followed the installation guide at https://github.com/openvinotoolkit/openvino_notebooks#-installation-guide to install the notebooks.

Environment information Please run python check_install.py in the openvino_notebooks directory. If the output is NOT OK for any of the checks, please follow the instructions to fix that. If that does not work, or if you still encounter the issue, please paste the output of check_install.py here.


Abhijeet241093 commented 6 months ago

@l-bat

Abhijeet241093 commented 6 months ago

@catchygit

Abhijeet241093 commented 6 months ago

Any solution ?

adrianboguszewski commented 6 months ago

@riacheruvu please look at it

riacheruvu commented 6 months ago

Hi @Abhijeet241093, thanks for raising these issues! I'm looking into reproducing them on my end and will get back to you within the next week with a solution, which will most likely be a quick patch to the code snippet.

Please note that object addition/removal detection can be faulty if the YOLOv8 model is unable to clearly see and detect the object itself; occlusions can cause issues. Could you confirm that when removing the bottle, the YOLOv8 model is able to detect and overlay the bounding box on the output?
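
For example, here is a minimal sketch (my assumptions: the ultralytics package and the COCO-pretrained yolov8n.pt checkpoint the kit uses; the frame path is hypothetical) to verify that the model detects the bottle in a captured frame:

```python
# Minimal sketch (assumption: ultralytics package, COCO-pretrained yolov8n.pt)
# to confirm the model sees the bottle in a given frame.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model("frame.jpg")  # hypothetical path to a frame grabbed from the camera

for result in results:
    for box in result.boxes:
        name = result.names[int(box.cls)]
        print(f"{name}: confidence {float(box.conf):.2f}")
    # Write the frame with bounding boxes overlaid for visual inspection
    cv2.imwrite("frame_annotated.jpg", result.plot())
```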

I understand issues #1765 and #1766 are grouped under the same theme - would it be ok if we looped these issues under this issue #1754? The patch I'll provide should resolve all three issues. Thank you for your patience and for the detailed error reports.

Abhijeet241093 commented 6 months ago

Hello @riacheruvu,

Greetings :)

Thank you for your kind reply.

Q. Could you confirm that when removing the bottle, the YOLOv8 model is able to detect and overlay the bounding box on the output?

Answer: Yes, the YOLOv8 model is able to detect and overlay the bounding box on the output.

Q. I understand issues https://github.com/openvinotoolkit/openvino_notebooks/issues/1765 and https://github.com/openvinotoolkit/openvino_notebooks/issues/1766 are grouped under the same theme - would it be ok if we looped these issues under this issue https://github.com/openvinotoolkit/openvino_notebooks/issues/1754?

Answer: Yes please, thank you.

Abhijeet241093 commented 6 months ago

Hello @riacheruvu,

Suppose GPU and storage are constrained on the server side. If we downgrade from the YOLOv8 model to the YOLOv5 model, will performance improve? Why do we choose YOLOv8 over YOLOv5? Is it for additional flexibility or features?

riacheruvu commented 6 months ago

Thank you, @Abhijeet241093, for the additional details - I'm working on it!

To your question: in this context, we chose YOLOv8 over YOLOv5 for its performance improvements and to leverage the latest APIs. You could switch to YOLOv5 to meet GPU/storage requirements; I will say I haven't validated this particular kit with YOLOv5 yet, so you may see different results if you try this.
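
If you want to try the swap, here is a minimal sketch (my assumption: recent ultralytics releases can load YOLOv5u checkpoints such as yolov5nu.pt through the same YOLO class; the video path is hypothetical):

```python
# Minimal sketch of swapping YOLOv8 for YOLOv5 (assumption: ultralytics
# supports YOLOv5u checkpoints, e.g. yolov5nu.pt; not validated with this kit).
from ultralytics import YOLO

model_v8 = YOLO("yolov8n.pt")   # the checkpoint the kit uses
model_v5 = YOLO("yolov5nu.pt")  # smaller YOLOv5 variant

# The tracking call is identical regardless of the checkpoint,
# so the rest of the pipeline should not need changes:
for result in model_v5.track(source="video.mp4", stream=True, persist=True):
    print(len(result.boxes), "detections in this frame")
```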

Abhijeet241093 commented 5 months ago

Hello @riacheruvu,

Greetings :)~

Thank you, the logic is improved and it works fine now. Please find the code below. At this point, I need answers to the following questions.

1. If we apply two separate YOLO models, one for person detection and one for object detection, should we:

A. Apply the zone to each model's detections separately?

B. Or combine the results of both models and then apply the zone?

```python
# Assumed context: model, model1, label_map, label_map1, zone, box_annotator,
# zone_annotator, draw_text, intersecting_bboxes, VID_PATH, video_info and log
# are defined earlier in the notebook.
from collections import Counter

import supervision as sv

# Define empty lists to keep track of labels
original_labels = []
final_labels = []
person_bbox = []
p_items = []
purchased_items = set(p_items)
a_items = []
added_items = set(a_items)
hand_bbox = []
combined_detections = []

# Save result as new_det_tracking_result.mp4
with sv.VideoSink("new_det_tracking_result.mp4", video_info) as sink:
    # Iterate through model predictions and tracking results
    for index, (result, result1) in enumerate(zip(model.track(source=VID_PATH, show=False, stream=True, verbose=True, persist=True),
                                                  model1.track(source=VID_PATH, show=False, stream=True, verbose=True, persist=True))):
        # Define variables to store interactions that are refreshed per frame
        interactions = []
        person_intersection_str = ""

        # Obtain predictions from model1
        frame1 = result1.orig_img
        detections_objects1 = sv.Detections.from_ultralytics(result1)
        detections_objects1 = detections_objects1[detections_objects1.class_id == 0]
        bboxes1 = result1.boxes

        # Obtain predictions from the YOLOv8 model
        frame = result.orig_img
        detections = sv.Detections.from_ultralytics(result)
        detections = detections[detections.class_id < 10]
        bboxes = result.boxes

        # Apply the mask over the single zone
        mask1, mask2 = zone.trigger(detections=detections_objects1), zone.trigger(detections=detections)
        detections_filtered1, detections_filtered2 = detections_objects1[mask1], detections[mask2]

        if detections_objects1 and len(detections_objects1) > 0:
            label1 = label_map1[detections_objects1.class_id[0]]  # Get the label for the class_id
            combined_detections.append((detections_objects1, label1))
            for detection, label in combined_detections:
                print("Detections:", detection)
                print("Label:", label)

        if bboxes1.id is not None:
            detections_objects1.tracker_id = bboxes1.id.cpu().numpy().astype(int)

        labels = [
            f'#{tracker_id} {label_map1[class_id]} {confidence:0.2f}'
            for _, _, confidence, class_id, tracker_id
            in detections_objects1
        ]

        # Print labels for detections from model1
        for _, _, confidence, class_id, _ in detections_objects1:
            print(f"Label: {label_map1[class_id]} with confidence: {confidence:.2f}")

        print(detections)
        # Apply the mask over the single zone
        mask = zone.trigger(detections=detections)
        detections_filtered = detections[mask]

        print("mask", mask)
        print("Detection", detections_filtered)

        if detections and len(detections) > 0:
            label = label_map[detections.class_id[0]]  # Get the label for the class_id
            combined_detections.append((detections, label))

        if bboxes.id is not None:
            detections.tracker_id = bboxes.id.cpu().numpy().astype(int)

        # Note: these labels are built from all detections; if the zone filter
        # drops boxes, they can get out of sync with detections_filtered below
        labels = [
            f'#{tracker_id} {label_map[class_id]} {confidence:0.2f}'
            for _, _, confidence, class_id, tracker_id
            in detections
        ]

        frame = box_annotator.annotate(scene=frame, detections=detections_filtered, labels=labels)
        frame = zone_annotator.annotate(scene=frame)

        objects = [f'#{tracker_id} {label_map[class_id]}' for _, _, confidence, class_id, tracker_id in detections]

        # (commented-out experiment: combining detections from both models)
        # for _, _, confidence, class_id, _ in detections:
        #     print(f"Label: {label_map[class_id]} with confidence: {confidence:.2f}")
        #
        # # Combine detections from both models
        # # combined_detections = np.concatenate((detections_objects1, detections))
        # print(combined_detections)
        #
        # # Extract xyxy attributes from combined detections
        # combined_detections_xyxy = [detection[0].xyxy for detection in combined_detections]
        # print(combined_detections_xyxy)
        #
        # # Check if combined_detections_xyxy is not empty and contains non-empty arrays
        # if combined_detections_xyxy and all(arr.size > 0 for arr in combined_detections_xyxy):
        #     # Concatenate xyxy arrays into a single array
        #     combined_xyxy_array = np.concatenate(combined_detections_xyxy, axis=0)
        # else:
        #     combined_xyxy_array = np.empty((0, 4))  # Create an empty array
        #
        # # Create a Detections object with the concatenated xyxy array
        # combined_detections_detections = sv.Detections(xyxy=combined_xyxy_array)
        #
        # # Apply mask over the combined detections
        # mask = zone.trigger(detections=combined_detections_detections)
        #
        # # Filter combined detections based on the mask
        # combined_detections_filtered = [combined_detections[i] for i in range(len(combined_detections)) if mask[i]]
        #
        # # Iterate through combined detections to create labels
        # combined_labels = []
        # for detection in combined_detections_filtered:
        #     detections, label = detection
        #     for _, _, confidence, class_id, tracker_id in detections:
        #         combined_labels.append(f'#{tracker_id} {label_map1[class_id]} {confidence:.2f}')
        #
        # # Print labels for combined detections
        # for label in combined_labels:
        #     print("combined_labels", label)
        #
        # frame = box_annotator.annotate(scene=frame, detections=combined_detections_filtered, labels=combined_labels)
        # frame = zone_annotator.annotate(scene=frame)
        #
        # objects = [f'#{tracker_id} {label_map[class_id]}' for _, _, confidence, class_id, tracker_id in combined_detections_filtered]
        # print("Combined Objects:", objects)

        # If this is the first time we run the application,
        # store the objects' labels as they are at the beginning
        if index == 0:
            original_labels = objects
            original_dets = len(detections_filtered)
        else:
            # To identify if an object has been added or removed,
            # use the original labels and identify any changes
            final_labels = objects
            new_dets = len(detections_filtered)
            # Identify if an object has been added or removed using Counters
            # (fixed: subtract the counters; the original added them, so
            # removed_objects was never empty)
            removed_objects = Counter(original_labels) - Counter(final_labels)
            added_objects = Counter(final_labels) - Counter(original_labels)

            # Create two variables we can increment for drawing text
            draw_txt_ir = 1
            draw_txt_ia = 1

            # Check for objects being removed
            # if new_dets - original_dets != 0 and len(removed_objects) >= 1:
            if new_dets != original_dets or removed_objects:
                # An object has been removed
                for k, v in removed_objects.items():
                    # For each of the objects, check the IOU between a
                    # designated object and a person.
                    if 'person' not in k:
                        removed_object_str = f"{v} {k} purchased"
                        removed_action_str = intersecting_bboxes(bboxes, bboxes1, person_bbox, removed_object_str)
                        print("Removed Action String:", removed_action_str)
                        if removed_action_str is not None:
                            log.info(removed_action_str)
                            # Add the purchased items to a "receipt" of sorts
                            item = removed_action_str.split()
                            if len(item) >= 3:
                                item = f"{item[0]} {item[1]} {item[2]}"
                            removed_label = item.split(' ')[-1]
                            if any(removed_label in entry for entry in purchased_items):
                                purchased_items = {f"{int(entry.split()[0]) + 1} {' '.join(entry.split()[1:])}" if removed_label in entry else entry for entry in purchased_items}
                            else:
                                purchased_items.add(f"{v} {k}")
                                p_items.append(f" - {v} {k}")
                            # Draw the result on the screen (moved inside the
                            # None check so draw_text never receives None)
                            draw_text(frame, text=removed_action_str, point=(50, 50 + draw_txt_ir), color=(0, 0, 255))
                            draw_text(frame, "Receipt: " + str(purchased_items), point=(50, 800), color=(30, 144, 255))
                            draw_txt_ir += 80
                        print("New_Purchased_Items:", purchased_items)
                        print("Removed_Objects:")

            if len(added_objects) >= 1:
                # An object has been added
                for k, v in added_objects.items():
                    # For each of the objects, check the IOU between a
                    # designated object and a person.
                    if 'person' not in k:
                        added_object_str = f"{v} {k} returned"
                        added_action_str = intersecting_bboxes(bboxes, bboxes1, person_bbox, added_object_str)
                        print("Added Action String:", added_action_str)
                        if added_action_str is not None:
                            # If we have determined an interaction with a
                            # person, log the interaction.
                            log.info(added_action_str)
                            item = added_object_str.split()
                            if len(item) >= 3:
                                item = f"{item[0]} {item[1]} {item[2]}"
                            item = item.split(' ')[-1]
                            # Fixed: the original tested `item in item`, which is
                            # always true because the loop variable shadowed `item`
                            if any(item in entry for entry in purchased_items):
                                purchased_items = {f"{int(entry.split()[0]) - 1} {' '.join(entry.split()[1:])}" if item in entry else entry for entry in purchased_items}
                                if any(entry.startswith('0 ') for entry in purchased_items):
                                    purchased_items = {entry for entry in purchased_items if not entry.startswith('0 ')}
                            print("Updated_Purchased_Items:", purchased_items)
                            # p_items.remove(item)
                            added_items.add(added_object_str)
                            a_items.append(added_object_str)
                            print("Added_Objects:")
                            # Draw the result on the screen (moved inside the
                            # None check so draw_text never receives None)
                            draw_text(frame, text=added_action_str, point=(50, 300 + draw_txt_ia), color=(0, 128, 0))
                            draw_text(frame, "Receipt: " + str(purchased_items), point=(50, 800), color=(30, 144, 255))
                            draw_txt_ia += 80

        # Clear the combined_detections list
        combined_detections.clear()
        draw_text(frame, "Receipt: " + str(purchased_items), point=(50, 800), color=(30, 144, 255))
        sink.write_frame(frame)
```
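
For reference, the Counter-based diff in the snippet above can be checked in isolation; a minimal standalone sketch (it relies on subtracting the counters, which is the fix applied above):

```python
# Standalone sketch of the Counter-based add/remove check.
from collections import Counter

original_labels = ['#1 bottle', '#2 bottle', '#3 cup']  # first frame
final_labels = ['#2 bottle', '#3 cup', '#4 banana']     # later frame

removed_objects = Counter(original_labels) - Counter(final_labels)
added_objects = Counter(final_labels) - Counter(original_labels)

print(dict(removed_objects))  # {'#1 bottle': 1} -> purchased
print(dict(added_objects))    # {'#4 banana': 1} -> returned
```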
Abhijeet241093 commented 5 months ago

> Hi @Abhijeet241093, thanks for raising these issues! I'm looking into reproducing them on my end and will get back to you within the next week with a solution, which will most likely be a quick patch to the code snippet.
>
> Please note that object addition/removal detection can be faulty if the YOLOv8 model is unable to clearly see and detect the object itself; occlusions can cause issues. Could you confirm that when removing the bottle, the YOLOv8 model is able to detect and overlay the bounding box on the output?
>
> I understand issues #1765 and #1766 are grouped under the same theme - would it be ok if we looped these issues under this issue #1754? The patch I'll provide should resolve all three issues. Thank you for your patience and for the detailed error reports.

Have you completed this, @riacheruvu?

riacheruvu commented 5 months ago

Hello @Abhijeet241093, I sincerely apologize for my delayed response; I needed additional time to validate my patch, and it'll be merged into the repository in a day or two. There were a few new edge cases the patch also needed to address, which took time to incorporate.

To your question:

"If we apply two separately yolovs model for person detection, and object detection, in that case, should we need to A. Separately apply zone over it? B. Or Should we combine results of both, then apply zone over it?"

It depends on the use case you are trying to achieve. I would highly recommend applying the zone only for the object detection model. If you are looking to use the intersection of zones, rather than the intersection of bounding boxes, for detecting changes in objects, then you could, for example, apply zone detection for your two individual YOLO models for person and object detection, and then consider the intersection/combination of the results. To briefly summarize, I would recommend option A. I hope my explanation makes sense - happy to clarify further if not!
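
To illustrate option A, here is a minimal sketch (my assumptions: the supervision package's PolygonZone API, which in recent releases takes only the polygon; older releases also require frame_resolution_wh; the polygon coordinates are examples):

```python
# Minimal sketch of option A: one zone per model, triggered separately.
import numpy as np
import supervision as sv

polygon = np.array([[100, 100], [500, 100], [500, 400], [100, 400]])  # example zone
person_zone = sv.PolygonZone(polygon=polygon)
object_zone = sv.PolygonZone(polygon=polygon)

def detections_in_zones(person_result, object_result):
    persons = sv.Detections.from_ultralytics(person_result)
    objects = sv.Detections.from_ultralytics(object_result)
    # Trigger each zone on its own model's detections, then filter
    persons_in = persons[person_zone.trigger(detections=persons)]
    objects_in = objects[object_zone.trigger(detections=objects)]
    return persons_in, objects_in
```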

Thank you for your patience. Once the patch is merged, I will close this issue and convert it to a discussion.

raymondlo84 commented 1 week ago

I'm closing this and reopening it here: https://github.com/openvinotoolkit/openvino_build_deploy/issues