mercurytoxic closed this issue 4 years ago
Please post full debug logs. There are TPU locks that should prevent this. Make sure your TPU lock max is 1.
Also, you are using old versions. If you are using the Edge TPU, you need to be on 6.0.
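For anyone wondering what a "lock max of 1" amounts to, here is a minimal sketch of serializing TPU access across processes with an advisory file lock. It is not the hooks' own implementation; `LOCK_PATH`, `tpu_lock`, `run_detection` and `locked_detection` are hypothetical names, and it assumes a POSIX host where every process (or container, via a shared volume) can reach the same lock file.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock path; every process that touches the TPU must use the same file.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "edgetpu_demo.lock")


@contextmanager
def tpu_lock(path=LOCK_PATH, blocking=True):
    """Hold an exclusive advisory lock on `path` while the with-block runs."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
        fcntl.flock(fd, flags)  # raises BlockingIOError if non-blocking and busy
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def run_detection(image):
    # Placeholder for the real interpreter call (assumption, not a real API).
    return {"image": image, "labels": []}


def locked_detection(image):
    # Only one holder at a time gets past this point.
    with tpu_lock():
        return run_detection(image)
```

The lock only helps if each process opens the device and runs inference while holding it; a process that keeps the TPU bound between requests would still collide with a second one.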
Thanks for the quick reply, updating fixed the issue.
I have much the same issue. Container A is connected to the TPU and everything works. Once container B connects to the TPU (loading the same object detection model onto the TPU), container A fails. I need to know where I can catch the error so that it does not take down my container.
Are container A and container B both active and requesting access to the TPU at the same time?
Also, is this mlapi or zmes? mlapi keeps the TPU bound and a model loaded in memory.
Yes, both containers request access to the TPU once they are up. Container A already has access and has set its interpreter variable, but once container B connects to set its own interpreter variable, B gets access and container A fails with this error:
2022/08/17 23:49:51 stderr: F driver/usb/usb_driver.cc:1148] HandleQueuedBulkIn transfer in failed. Not found: USB transfer error 5 [LibUsbDataInCallback]
2022/08/17 23:49:51 Forked function has terminated: signal: aborted (core dumped)
Neither mlapi nor zmes. I am just running my application as Python code in a Flask app.
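On the question of where to catch the error: the log shows the process being killed by an abort signal from native code, which a Python `try`/`except` cannot intercept. One way to keep the parent alive is to run the call in a child process and inspect its exit code. The sketch below assumes a POSIX host with the `fork` start method; `run_isolated` and `_worker` are hypothetical helpers, and it only contains the crash, it does not fix the contention.

```python
import multiprocessing as mp
import signal


def _worker(conn, fn, args):
    # Runs in the child: send the result back, then exit normally.
    conn.send(fn(*args))
    conn.close()


def run_isolated(fn, *args, timeout=30):
    """Run fn(*args) in a child process; return (ok, result_or_reason)."""
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_worker, args=(child_conn, fn, args))
    proc.start()
    child_conn.close()  # the parent keeps only the read end

    result, got_result = None, False
    if parent_conn.poll(timeout):  # also returns True on EOF if the child died
        try:
            result = parent_conn.recv()
            got_result = True
        except EOFError:
            pass
    if not got_result and proc.is_alive():
        proc.kill()  # timed out without an answer
    proc.join()
    parent_conn.close()

    if got_result and proc.exitcode == 0:
        return True, result
    if proc.exitcode is not None and proc.exitcode < 0:
        # A negative exit code means the child was terminated by that signal.
        return False, "killed by " + signal.Signals(-proc.exitcode).name
    return False, "exit code %s" % proc.exitcode
```

With this, the Flask handler can return an error response when `ok` is false instead of crashing. The cost is that the child has to load the model on every call unless it is turned into a long-lived worker.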
That is expected behavior, I would think. My advice would be to make one container that is dedicated to the TPU and have the other containers make HTTP requests to it for detections.
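A rough sketch of that layout, using only the Python standard library so it runs anywhere (the same idea carries over to Flask). Everything here is an assumption for illustration: the `/detect` path, `run_detection`, `serve` and `request_detection` are hypothetical names, and the real interpreter call would replace the placeholder.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import Request, urlopen

# Only this process owns the TPU; the lock keeps inferences one at a time.
_tpu_lock = threading.Lock()


def run_detection(payload):
    # Placeholder for the real Edge TPU interpreter call (assumption).
    return {"received_bytes": len(payload), "detections": []}


class DetectHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/detect":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length)
        with _tpu_lock:
            result = run_detection(payload)
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def serve(host="127.0.0.1", port=0):
    """Start the detection service in a background thread and return the server."""
    server = ThreadingHTTPServer((host, port), DetectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_detection(base_url, image_bytes):
    """Client side: what containers A and B would call instead of opening the TPU."""
    req = Request(base_url + "/detect", data=image_bytes, method="POST")
    with urlopen(req, timeout=10) as resp:
        return json.loads(resp.read())
```

Containers A and B then never load the model themselves; they post image bytes to the dedicated container and wait for the JSON reply, so simultaneous requests queue on the lock rather than fighting over the USB device.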
Event Server version
v5.15.6.r57.gfc6d2b9
Hooks version
The version of ZoneMinder you are using:
v1.35.5
What is the nature of your issue
Bug
Details
When I try to send two object detections at the same time while using the Coral Edge TPU, I get a correct response from one and a
F :1147] HandleQueuedBulkIn transfer in failed. Not found: USB transfer error 5 [LibUsbDataInCallback]
from the other. This doesn't happen when I use YoloV4 with a GPU.
Here they mention that this is due to multiple processes attempting to access the TPU at the same time.
The problem this causes is that when several cameras are triggered at the same time, not all of them are processed by the detector.
To reproduce this I just run
Result:
Thanks!