google-coral / pycoral

Python API for ML inferencing and transfer-learning on Coral devices
https://coral.ai
Apache License 2.0

Runtime error at pipeline runner #43

Closed: ykhorzon closed this issue 6 months ago

ykhorzon commented 3 years ago

I followed the pipeline example model_pipelining_classify_image.py, but it raises a runtime error.

My environment is pycoral-2.0.0-cp38-cp38-linux_x86_64 and tflite_runtime-2.5.0.post1-cp38-cp38-linux_x86_64

python3 examples/model_pipelining_classify_image.py --models test_data/pipeline/inception_v3_299_quant_segment_%d_of_2_edgetpu.tflite --labels test_data/imagenet_labels.txt --input test_data/parrot.jpg

The runtime error is as follows:

root@5c8d01ebd0fd:~# python3 examples/model_pipelining_classify_image.py   --models     test_data/pipeline/inception_v3_299_quant_segment_%d_of_2_edgetpu.tflite   --labels test_data/imagenet_labels.txt   --input test_data/parrot.jpg
Using devices:  ['pci:0', 'pci:1']
Using models:  ['test_data/pipeline/inception_v3_299_quant_segment_0_of_2_edgetpu.tflite', 'test_data/pipeline/inception_v3_299_quant_segment_1_of_2_edgetpu.tflite']
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20210812 02:35:21.009770   102 pipelined_model_runner.cc:172] Thread: 139954348168960 receives empty request
I20210812 02:35:21.009820   102 pipelined_model_runner.cc:245] Thread: 139954348168960 is shutting down the pipeline...
I20210812 02:35:21.158869   102 pipelined_model_runner.cc:255] Thread: 139954348168960 Pipeline is off.
I20210812 02:35:21.159277   103 pipelined_model_runner.cc:207] Queue is empty and `StopWaiters()` is called.
-------RESULTS--------
macaw: 0.99609
Average inference time (over 5 iterations): 30.4ms
I20210812 02:35:21.160143    63 pipelined_model_runner.cc:172] Thread: 139954824456000 receives empty request
E20210812 02:35:21.160192    63 pipelined_model_runner.cc:240] Thread: 139954824456000 Pipeline was turned off before.
Exception ignored in: <function PipelinedModelRunner.__del__ at 0x7f49b8a0eee0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/pycoral/pipeline/pipelined_model_runner.py", line 83, in __del__
    self.push({})
  File "/usr/local/lib/python3.8/dist-packages/pycoral/pipeline/pipelined_model_runner.py", line 152, in push
    self._runner.Push(input_tensors)
RuntimeError: Pipeline was turned off before.
E20210812 02:35:21.161602    63 pipelined_model_runner.cc:240] Thread: 139954824456000 Pipeline was turned off before.
E20210812 02:35:21.161650    63 pipelined_model_runner.cc:147] Failed to shutdown status: INTERNAL: Pipeline was turned off before.

I guessed that the producer pushes an empty dict {} to make the runner shut down, but that the consumer thread terminating at join() also triggers the runner's destruction a second time.

# model_pipelining_classify_image.py (excerpt from main())
  def producer():
    for _ in range(args.count):
      runner.push({name: image})
    runner.push({})

  def consumer():
    output_details = runner.interpreters()[-1].get_output_details()[0]
    scale, zero_point = output_details['quantization']
    while True:
      result = runner.pop()
      if not result:
        break
      values, = result.values()
      scores = scale * (values[0].astype(np.int64) - zero_point)
      classes = classify.get_classes_from_scores(scores, args.top_k,
                                                 args.threshold)
    print('-------RESULTS--------')
    for klass in classes:
      print('%s: %.5f' % (labels.get(klass.id, klass.id), klass.score))

  start = time.perf_counter()
  producer_thread = threading.Thread(target=producer)
  consumer_thread = threading.Thread(target=consumer)
  producer_thread.start()
  consumer_thread.start()
  producer_thread.join()
  consumer_thread.join()
  ...

How can I avoid this behavior? Any hint?

ykhorzon commented 3 years ago

I found that the error is not triggered by consumer_thread.join(). It comes from main() returning: runner is a local variable, so it gets destroyed when main() exits, and the destructor shuts the pipeline down again.
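
One way to avoid the destructor's second empty push without editing the installed package might be to wrap the runner in a small subclass that remembers whether the shutdown sentinel ({}) has already been sent. This is only a sketch of the idea (SafePipelinedModelRunner and _stopped are made-up names here, not pycoral API), not a tested fix:

# Sketch only: skip the extra empty push that PipelinedModelRunner.__del__
# issues after the producer's sentinel has already shut the pipeline down.
from pycoral.pipeline.pipelined_model_runner import PipelinedModelRunner

class SafePipelinedModelRunner(PipelinedModelRunner):
  def __init__(self, interpreters):
    super().__init__(interpreters)
    self._stopped = False

  def push(self, input_tensors):
    if not input_tensors:
      if getattr(self, '_stopped', False):
        return  # pipeline already shut down; ignore the second sentinel
      self._stopped = True
    super().push(input_tensors)

If main() builds the runner with this class instead of PipelinedModelRunner, the push({}) issued when the local runner variable is destroyed is silently ignored.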

manoj7410 commented 3 years ago

@ykhorzon This looks like a bug on our side. Thank you for reporting. Will try to get this fixed soon.

artifact-evaluation commented 2 years ago

any updates on this?

hjonnala commented 2 years ago

@Bincci As a workaround, please try the steps below and then run the demo:

  1. Replace model_pipelining_classify_image.py with https://github.com/hjonnala/snippets/blob/main/pycoral_43/model_pipelining_classify_image.py
  2. Add pipelined_model_runner.py (https://github.com/hjonnala/snippets/blob/main/pycoral_43/pipelined_model_runner.py) to the pycoral/examples folder.

Cen-Lu commented 2 years ago

@hjonnala

Hi, thanks a lot for the update! I tried the new workaround, but it still doesn't work. Here's the error that I got:

python3 examples/model_pipelining_classify_image.py --models test_data/pipeline/inception_v3_299_quant_segment_%d_of_2_edgetpu.tflite --labels test_data/imagenet_labels.txt --input test_data/parrot.jpg --count 10
Using devices:  ['pci:0', 'pci:1']
Using models:  ['test_data/pipeline/inception_v3_299_quant_segment_0_of_2_edgetpu.tflite', 'test_data/pipeline/inception_v3_299_quant_segment_1_of_2_edgetpu.tflite']
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20220311 00:18:35.438598 72676 pipelined_model_runner.cc:172] Thread: 140038698432256 receives empty request
I20220311 00:18:35.438637 72676 pipelined_model_runner.cc:245] Thread: 140038698432256 is shutting down the pipeline...
I20220311 00:18:35.677592 72676 pipelined_model_runner.cc:255] Thread: 140038698432256 Pipeline is off.
I20220311 00:18:35.677745 72677 pipelined_model_runner.cc:207] Queue is empty and `StopWaiters()` is called.
-------RESULTS--------
macaw: 0.99609
Average inference time (over 10 iterations): 24.3ms
I20220311 00:18:35.678028 72639 pipelined_model_runner.cc:172] Thread: 140039029487424 receives empty request
E20220311 00:18:35.678045 72639 pipelined_model_runner.cc:240] Thread: 140039029487424 Pipeline was turned off before.
I20220311 00:18:35.678164 72639 pipelined_model_runner.cc:207] Queue is empty and `StopWaiters()` is called.
E20220311 00:18:35.678177 72639 pipelined_model_runner.cc:240] Thread: 140039029487424 Pipeline was turned off before.
E20220311 00:18:35.678184 72639 pipelined_model_runner.cc:147] Failed to shutdown status: INTERNAL: Pipeline was turned off before.

Somehow it still raises the "Pipeline was turned off before" error.

artifact-evaluation commented 2 years ago

@hjonnala

Thanks for your reply. I tried the workaround on my desktop, but it still produces the same error as before. From my understanding, the problem is that the order in which the input images are processed is not consistent with the order in which they are pushed into the runner. In that case, the empty input may be processed earlier than expected, so the pipeline is closed before all of the inputs have been processed.

This can be confirmed with a quick experiment: remove the last empty push from the producer. Then no error message appears, but the consumer thread never stops, since it cannot get out of its infinite loop.
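
As a concrete illustration (a sketch only, reusing runner, args, name, image, labels, classify and np from the example; I have not tested this), the consumer could pop exactly args.count results so that it terminates on its own, without relying on the empty request at all:

def producer():
  for _ in range(args.count):
    runner.push({name: image})
  # Note: no runner.push({}) sentinel here.

def consumer():
  output_details = runner.interpreters()[-1].get_output_details()[0]
  scale, zero_point = output_details['quantization']
  classes = []
  for _ in range(args.count):
    result = runner.pop()
    if not result:
      break  # pipeline was shut down before all results arrived
    values, = result.values()
    scores = scale * (values[0].astype(np.int64) - zero_point)
    classes = classify.get_classes_from_scores(scores, args.top_k,
                                               args.threshold)
  print('-------RESULTS--------')
  for klass in classes:
    print('%s: %.5f' % (labels.get(klass.id, klass.id), klass.score))

With no sentinel pushed by the producer, the only empty push would presumably be the one from the runner's __del__, so the pipeline gets shut down exactly once.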

Please let me know your thoughts on this. Thanks a lot!

hjonnala commented 2 years ago

@Bincci @Cen-Lu Thanks for trying the code. I thought this issue was about the "RuntimeError: Pipeline was turned off before" coming from the Python scripts.

The other logs ("Pipeline was turned off before") come from pipelined_model_runner.cc because both threads try to close the pipeline. I think it's a trivial issue to work on.

artifact-evaluation commented 2 years ago

@hjonnala Thanks! Now I understand the issue. Yes, I agree the remaining issue is trivial, but it would still be very helpful if you could share some ideas on how to solve it in the C++ code. If so, I can try modifying the code and compiling it natively.

hjonnala commented 2 years ago

@Bincci I think you just have to add logic here so that the function returns without invoking ShutdownPipeline() when a thread receives an empty request and the pipeline has already been turned off.

https://github.com/google-coral/libcoral/blob/master/coral/pipeline/pipelined_model_runner.cc#L171

  if (input_tensors.empty()) {
    // Also check whether pipeline_on_ is true before invoking ShutdownPipeline().
    LOG(INFO) << "Thread: " << std::this_thread::get_id()
              << " receives empty request";
    return ShutdownPipeline();
  }
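
For example, something along these lines might work. This is just a sketch, not a tested patch: pipeline_on_ is the flag mentioned above, the check would likely need the same lock that guards that flag, and the absl::Status / OkStatus() return is an assumption about the current Push() signature.

  if (input_tensors.empty()) {
    LOG(INFO) << "Thread: " << std::this_thread::get_id()
              << " receives empty request";
    if (!pipeline_on_) {
      // Another thread has already shut the pipeline down; treat this empty
      // request as a no-op instead of calling ShutdownPipeline() again.
      return absl::OkStatus();
    }
    return ShutdownPipeline();
  }
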
binqi-sun commented 2 years ago

@hjonnala thanks a lot!

google-coral-bot[bot] commented 6 months ago

Are you satisfied with the resolution of your issue?