google-coral / tutorials

Colab/Jupyter tutorials about training TensorFlow models for Edge TPU, and other tutorials
Apache License 2.0

model pipelining error for object detection #17

Closed vincent-jyq closed 2 years ago

vincent-jyq commented 2 years ago

Description

I'm trying to use model pipelining to speed up object detection with EfficientDet-Lite3. I was able to compile the segmented tflite models for TPU.

I modified my code based on the following examples (a rough sketch of how I combined them follows the list):

  1. The basic detection example, detect_image.py: https://github.com/google-coral/pycoral/blob/master/examples/detect_image.py
  2. The model-pipelining classification example from pycoral: https://github.com/google-coral/pycoral/blob/master/examples/model_pipelining_classify_image.py
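
For context, the combined producer/consumer structure looks roughly like the sketch below. This is a simplified version, not the attached script: the segment filenames, the single test image, and the bare print in the consumer are placeholders, and the per-segment device assignment follows the pycoral classification example.

```python
import threading

import numpy as np
from PIL import Image
from pycoral.adapters import common
from pycoral.pipeline.pipelined_model_runner import PipelinedModelRunner
from pycoral.utils.edgetpu import make_interpreter

# One interpreter per model segment; each segment is mapped to its own Edge TPU,
# as in the pycoral model_pipelining_classify_image.py example.
model_pattern = 'test_data/efficientdet_lite2_448_ptq_segment_%d_of_2_edgetpu.tflite'
interpreters = [
    make_interpreter(model_pattern % (i + 1), device=':%d' % i) for i in range(2)
]
for interpreter in interpreters:
  interpreter.allocate_tensors()
runner = PipelinedModelRunner(interpreters)

input_interpreter = runner.interpreters()[0]
name = input_interpreter.get_input_details()[0]['name']
size = common.input_size(input_interpreter)


def producer():
  # Push resized frames into the pipeline; an empty push shuts it down.
  image = Image.open('test_data/grace_hopper.bmp').convert('RGB').resize(size, Image.ANTIALIAS)
  runner.push({name: np.array(image)})
  runner.push({})


def consumer():
  # Pop the raw output tensors of the last segment until the pipeline drains.
  while True:
    result = runner.pop()
    if not result:
      break
    print({k: v.shape for k, v in result.items()})


threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
  t.start()
for t in threads:
  t.join()
```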

While running the code, I'm getting the error below:

    'test_data/efficientdet_lite2_448_ptq_segment_1_of_2_edgetpu.tflite']
    WARNING: Logging before InitGoogleLogging() is written to STDERR
    I20211115 15:15:52.759135 10693 pipelined_model_runner.cc:172] Thread: 140160201991936 receives empty request
    I20211115 15:15:52.759192 10693 pipelined_model_runner.cc:245] Thread: 140160201991936 is shutting down the pipeline...
    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "/home/yjiang/miniconda3/envs/TPU6/lib/python3.6/threading.py", line 916, in _bootstrap_inner
        self.run()
      File "/home/yjiang/miniconda3/envs/TPU6/lib/python3.6/threading.py", line 864, in run
        self._target(*self._args, **self._kwargs)
      File "examples/detect_image_pipeline.py", line 183, in consumer
        result = runner.pop()
      File "/home/yjiang/pycoral/pyco/pipeline/pipelined_model_runner.py", line 170, in pop
        result = {k: v.reshape(self._output_shapes[k]) for k, v in result.items()}
      File "/home/yjiang/pycoral/pyco/pipeline/pipelined_model_runner.py", line 170, in <dictcomp>
        result = {k: v.reshape(self._output_shapes[k]) for k, v in result.items()}
    ValueError: cannot reshape array of size 400 into shape (1,25,4)

Compared to the original model's output tensors, the segmented model has exactly the same shapes. This is what I see with Netron: [screenshot of the output tensors in Netron]

However, the actual output size seems inflated when using runner.pop(): instead of the 100 float32 values a (1, 25, 4) tensor should hold, I'm getting 400.
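
Just to spell out the arithmetic behind the reshape error (my own quick check, not part of the attached code):

```python
import numpy as np

expected = int(np.prod((1, 25, 4)))  # 100 elements, the shape reported by Netron
actual = 400                         # size of the flat array coming out of runner.pop()
print(actual / expected)             # 4.0 -- four times more data than the shape allows
```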

I attached my code and tflite models; it should be fairly easy to reproduce this issue. You should be able to use the standard pycoral setup: https://github.com/google-coral/pycoral

Then copy detect_image_pipeline.py to the examples folder and the segmented models to the test_data folder, and run:

    python3 examples/detect_image_pipeline.py --models test_data/efficientdet_lite2_448_ptq_segment_%d_of_2_edgetpu.tflite --labels test_data/coco_labels.txt --input test_data/grace_hopper.bmp --output ${HOME}/grace_hopper_processed.bmp

efficientdet_lite2_448_ptq_segmented.zip
detect_image_pipeline.zip

### Issue Type

Bug

### Operating System

Linux, Ubuntu

### Coral Device

M.2 Accelerator B+M

### Other Devices

_No response_

### Programming Language

Python 3.6

### Relevant Log Output

```shell
'test_data/efficientdet_lite2_448_ptq_segment_1_of_2_edgetpu.tflite']
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20211115 15:15:52.759135 10693 pipelined_model_runner.cc:172] Thread: 140160201991936 receives empty request
I20211115 15:15:52.759192 10693 pipelined_model_runner.cc:245] Thread: 140160201991936 is shutting down the pipeline...
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/yjiang/miniconda3/envs/TPU6/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/yjiang/miniconda3/envs/TPU6/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "examples/detect_image_pipeline.py", line 183, in consumer
    result = runner.pop()
  File "/home/yjiang/pycoral/pyco/pipeline/pipelined_model_runner.py", line 170, in pop
    result = {k: v.reshape(self._output_shapes[k]) for k, v in result.items()}
  File "/home/yjiang/pycoral/pyco/pipeline/pipelined_model_runner.py", line 170, in <dictcomp>
    result = {k: v.reshape(self._output_shapes[k]) for k, v in result.items()}
ValueError: cannot reshape array of size 400 into shape (1,25,4)
```
hjonnala commented 2 years ago

@vincent-jyq please try model_pipelining_detect_image.py by placing pipeline.py and detect.py in the same folder.

vincent-jyq commented 2 years ago

@hjonnala, thanks for looking into this. Yes, your new code works like a charm! I think I now have a better understanding of how the runner can work in parallel with the interpreter's output tensors. One minor thing about your https://github.com/hjonnala/pycoral/blob/master/examples/model_pipelining_detect_image.py code:

        try:
          objs = detect.get_objects(output_interpreter, input_interpreter=input_interpreter, score_threshold=args.threshold,
                                    image_scale=scale)
        except Exception as Error:
          print(f"object detection error:{str(Error)}")

        result = runner.pop()
        if not result:
            break

Should "runner.pop()" in front of "detect.get_objects"?

vincent-jyq commented 2 years ago

@hjonnala, one thing I noticed after implementing this change in my real project: it seems like I sporadically miss detections if I leave the consumer and producer threads free-running. Below is a short description with a code snippet for the producer thread:

  def producer():
    j = 0
    for i in val_imgs:
      j += 1
      image_file = test_image_dir + i
      image = Image.open(image_file)
      # Resize the frame to the first segment's input size.
      new_image, _ = resize_image(
          input_interpreter, image.size, lambda size: image.resize(size, Image.ANTIALIAS))
      runner.push({name: new_image})
      # Without this delay the pipeline sporadically misses detections.
      time.sleep(0.06)
    print('Number of images pushed from producer: ', j)
    # An empty push tells the runner to shut down the pipeline.
    runner.push({})

My project uses two segments based on the pre-trained efficientdet_lite3_512 model. I used profiling-based partitioning, so the two segments should have similar latency. However, if I let the consumer and producer threads free-run, I can reach ~18 FPS (impressive result, BTW), but I keep missing detections. By missing detections I mean nothing gets detected on a given frame that would otherwise be easily detected with the additional delay (the sleep in my code above). This happens sporadically (not just at the beginning or the end) throughout my dataset of ~300 frames.

I also tried to configure the input and output queue sizes of the runner to see if queue management was causing this. But even after setting:

  runner.set_input_queue_size(1)
  runner.set_output_queue_size(1)

I'm not seeing the FPS slow down (still around 18 FPS). Let me know if I've described the issue clearly; I hope you can reproduce it on your end.
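
If it helps with reproduction, this is roughly how I check for dropped frames on the consumer side. It's a hypothetical sketch rather than my attached code, and parse_result stands in for whatever turns the raw tensors into detections:

```python
def consumer():
  popped = 0
  frames_with_detections = 0
  while True:
    result = runner.pop()  # raw output tensors from the last segment
    if not result:
      break  # the producer pushed {} and the pipeline is draining
    popped += 1
    objs = parse_result(result)  # hypothetical parsing helper
    if objs:
      frames_with_detections += 1
  # If popped matches the producer's count, no frames were dropped by the queues
  # and the misses come from the inference results themselves.
  print('Frames popped:', popped, 'with detections:', frames_with_detections)
```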

hjonnala commented 2 years ago

Should "runner.pop()" in front of "detect.get_objects"?

Yes, runner.pop() should come before "detect.get_objects". Actually, I am trying to get the objs from result = runner.pop() instead of detect.get_objects.
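
So the consumer loop ends up looking roughly like this (just a sketch; get_objects_from_result stands in for the parsing step discussed below):

```python
while True:
  result = runner.pop()  # output tensors for the oldest pushed frame
  if not result:
    break  # empty result: the pipeline is shutting down
  objs = get_objects_from_result(result)  # placeholder for the tensor parsing step
```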

vincent-jyq commented 2 years ago

Should "runner.pop()" in front of "detect.get_objects"?

yes, runner.pop() should be in front of "detect.get_objects". Actually, I am trying to get the objs fromresult = runner.pop() instead of detect.get_objects.

That would be great; I thought it had to be this way as a temporary workaround. Can you take a look at the missing-detection issue as well?

hjonnala commented 2 years ago

@vincent-jyq please try model_pipelining_detect_image.py by placing pipeline.py and detect.py in the same folder.

I have modified the code to get the objs from the results. Please try the latest code; you might have to modify this line as per the model. These changes might resolve the missing-detection issue as well. Please try it, and if the issue is not resolved, please share the image dataset.
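
To see which output index holds which tensor for a particular model, you can print the output details of the last interpreter with the standard TensorFlow Lite API (the order differs between models):

```python
# Inspect the last segment's outputs; the order of boxes/class_ids/scores/count
# depends on the model, so check it before mapping the popped result.
for i, detail in enumerate(runner.interpreters()[-1].get_output_details()):
  print(i, detail['name'], detail['shape'])
```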

Thanks

vincent-jyq commented 2 years ago

@hjonnala, I tried your new detection code and it works like a charm. The missing-detection issue also goes away. Can you elaborate on what the problem was? BTW, your code helps me better understand the data structure of the .pop() result. Thanks a lot!

google-coral-bot[bot] commented 2 years ago

Are you satisfied with the resolution of your issue?

hjonnala commented 2 years ago

@vincent-jyq Thanks for confirming.

result = runner.pop() gives us the output tensors from the last interpreter, i.e. the final inference results. For a detection model there should be 4 tensors: scores, boxes, count, class_ids. So we need to get them (in the right order) from the result variable.

The issue **ValueError: cannot reshape array of size 400 into shape (1,25,4)** is with parsing out the results. To figure out how to parse them, I compared result = runner.pop() with the boxes, scores, class_ids and count from detect.get_objects and came up with this function.
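
Roughly, the idea is the sketch below. It is a simplified version rather than the exact function in the repo; the objects_from_result name and the *_key arguments are placeholders, and the tensor names and order have to be checked against the model (for example with get_output_details() as mentioned above):

```python
import collections

# Hypothetical detection record; pycoral's detect.Object carries the same fields.
Object = collections.namedtuple('Object', ['id', 'score', 'bbox'])


def objects_from_result(result, score_threshold, boxes_key, class_ids_key,
                        scores_key, count_key):
  """Builds detections from the raw tensors popped off the pipeline.

  The *_key arguments name entries of the result dict; their names and order
  depend on the model, so check get_output_details() first.
  """
  count = int(result[count_key].flatten()[0])
  boxes = result[boxes_key].reshape(-1, 4)
  class_ids = result[class_ids_key].flatten()
  scores = result[scores_key].flatten()
  return [
      Object(id=int(class_ids[i]), score=float(scores[i]), bbox=boxes[i])
      for i in range(count)
      if scores[i] >= score_threshold
  ]
```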