luxonis / depthai-python

DepthAI Python Library
MIT License
360 stars 193 forks

X_LINK_ERROR #408

Closed Ulfzerk closed 2 years ago

Ulfzerk commented 3 years ago

Hello, I have an issue running tiny-yolo-v4 with SpatialDetection. I'm using the demo copied verbatim from: https://docs.luxonis.com/projects/api/en/latest/samples/SpatialDetection/spatial_tiny_yolo/ with the tiny YOLO blob: https://artifacts.luxonis.com/artifactory/luxonis-depthai-data-local/network/tiny-yolo-v4_openvino_2021.2_6shave.blob My only change was adding a print(...) with an iteration counter, FPS, and the average chip temperature.

My device: OAK-D-CM4, device URL: https://shop.luxonis.com/products/depthai-rpi-compute-module-4-edition

DepthAI version: 2.11.1, installed with python3 -m pip install git+https://github.com/luxonis/depthai-python.git@caf537b (no venv).

Python: python3 --version reports Python 3.7.3. Raspberry Pi system information:

$ cat /etc/os-release
PRETTY_NAME="Raspbian GNU/Linux 10 (buster)"
NAME="Raspbian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
$ uname -a

Linux oakdev 5.10.17-v7l+ #1421 SMP Thu May 27 14:00:13 BST 2021 armv7l GNU/Linux

Error messages:

  File "yolo_detection_test_sp.py", line 151, in <module>
    boundingBoxMapping = xoutBoundingBoxDepthMappingQueue.get()
RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'boundingBoxDepthMapping' (X_LINK_ERROR)'

or, in custom code:

RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'RGB' (X_LINK_ERROR)'
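(Editor's note: while the device-side bug was being chased in this thread, a host-side restart wrapper can keep a long-running deployment alive. This is a sketch, not an official depthai API; `run_with_restart` is a hypothetical helper that simply re-runs the capture loop when the X_LINK_ERROR RuntimeError shown above surfaces.)

```python
import time

def run_with_restart(run_once, max_restarts=5, backoff_s=2.0):
    """Re-run `run_once` when it dies with an XLink communication error.

    `run_once` is expected to open the device and run the capture loop;
    on an X_LINK_ERROR, depthai raises RuntimeError with that text in the
    message. Any other exception is re-raised immediately.
    """
    for attempt in range(max_restarts + 1):
        try:
            return run_once()
        except RuntimeError as e:
            if "X_LINK_ERROR" not in str(e) or attempt == max_restarts:
                raise
            time.sleep(backoff_s)  # give the device time to reset before reconnecting
```

This papers over the crash rather than fixing it, but it keeps an unattended deployment running until a firmware fix lands.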

Temperatures: average chip temperature 85 °C, Raspberry Pi temperature 77 °C.

Code

#!/usr/bin/env python3

from pathlib import Path
import sys
import cv2
import depthai as dai
import numpy as np
import time

'''
Spatial Tiny-yolo example
  Performs inference on RGB camera and retrieves spatial location coordinates: x,y,z relative to the center of depth map.
  Can be used for tiny-yolo-v3 or tiny-yolo-v4 networks
'''

# Get argument first
nnBlobPath = str((Path(__file__).parent / Path('tiny-yolo-v4_openvino_2021.2_6shave.blob')).resolve().absolute())
if 1 < len(sys.argv):
    arg = sys.argv[1]
    if arg == "yolo3":
        nnBlobPath = str((Path(__file__).parent / Path('../models/yolo-v3-tiny-tf_openvino_2021.4_6shave.blob')).resolve().absolute())
    elif arg == "yolo4":
        nnBlobPath = str((Path(__file__).parent / Path('../models/yolo-v4-tiny-tf_openvino_2021.4_6shave.blob')).resolve().absolute())
    else:
        nnBlobPath = arg
else:
    print("Using Tiny YoloV4 model. If you wish to use Tiny YOLOv3, call 'tiny_yolo.py yolo3'")

if not Path(nnBlobPath).exists():
    # sys is already imported at the top of the file
    raise FileNotFoundError(f'Required file/s not found, please run "{sys.executable} install_requirements.py"')

# Tiny yolo v3/4 label texts
labelMap = [
    "person",         "bicycle",    "car",           "motorbike",     "aeroplane",   "bus",           "train",
    "truck",          "boat",       "traffic light", "fire hydrant",  "stop sign",   "parking meter", "bench",
    "bird",           "cat",        "dog",           "horse",         "sheep",       "cow",           "elephant",
    "bear",           "zebra",      "giraffe",       "backpack",      "umbrella",    "handbag",       "tie",
    "suitcase",       "frisbee",    "skis",          "snowboard",     "sports ball", "kite",          "baseball bat",
    "baseball glove", "skateboard", "surfboard",     "tennis racket", "bottle",      "wine glass",    "cup",
    "fork",           "knife",      "spoon",         "bowl",          "banana",      "apple",         "sandwich",
    "orange",         "broccoli",   "carrot",        "hot dog",       "pizza",       "donut",         "cake",
    "chair",          "sofa",       "pottedplant",   "bed",           "diningtable", "toilet",        "tvmonitor",
    "laptop",         "mouse",      "remote",        "keyboard",      "cell phone",  "microwave",     "oven",
    "toaster",        "sink",       "refrigerator",  "book",          "clock",       "vase",          "scissors",
    "teddy bear",     "hair drier", "toothbrush"
]

syncNN = True

# Create pipeline
pipeline = dai.Pipeline()

# Define sources and outputs
camRgb = pipeline.create(dai.node.ColorCamera)
spatialDetectionNetwork = pipeline.create(dai.node.YoloSpatialDetectionNetwork)
monoLeft = pipeline.create(dai.node.MonoCamera)
monoRight = pipeline.create(dai.node.MonoCamera)
stereo = pipeline.create(dai.node.StereoDepth)

xoutRgb = pipeline.create(dai.node.XLinkOut)
xoutNN = pipeline.create(dai.node.XLinkOut)
xoutBoundingBoxDepthMapping = pipeline.create(dai.node.XLinkOut)
xoutDepth = pipeline.create(dai.node.XLinkOut)

xoutRgb.setStreamName("rgb")
xoutNN.setStreamName("detections")
xoutBoundingBoxDepthMapping.setStreamName("boundingBoxDepthMapping")
xoutDepth.setStreamName("depth")

# Properties
camRgb.setPreviewSize(416, 416)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
camRgb.setInterleaved(False)
camRgb.setColorOrder(dai.ColorCameraProperties.ColorOrder.BGR)

monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)

# setting node configs
stereo.initialConfig.setConfidenceThreshold(255)

spatialDetectionNetwork.setBlobPath(nnBlobPath)
spatialDetectionNetwork.setConfidenceThreshold(0.5)
spatialDetectionNetwork.input.setBlocking(False)
spatialDetectionNetwork.setBoundingBoxScaleFactor(0.5)
spatialDetectionNetwork.setDepthLowerThreshold(100)
spatialDetectionNetwork.setDepthUpperThreshold(5000)

# Yolo specific parameters
spatialDetectionNetwork.setNumClasses(80)
spatialDetectionNetwork.setCoordinateSize(4)
spatialDetectionNetwork.setAnchors(np.array([10,14, 23,27, 37,58, 81,82, 135,169, 344,319]))
spatialDetectionNetwork.setAnchorMasks({ "side26": np.array([1,2,3]), "side13": np.array([3,4,5]) })
spatialDetectionNetwork.setIouThreshold(0.5)

# Linking
monoLeft.out.link(stereo.left)
monoRight.out.link(stereo.right)

camRgb.preview.link(spatialDetectionNetwork.input)
if syncNN:
    spatialDetectionNetwork.passthrough.link(xoutRgb.input)
else:
    camRgb.preview.link(xoutRgb.input)

spatialDetectionNetwork.out.link(xoutNN.input)
spatialDetectionNetwork.boundingBoxMapping.link(xoutBoundingBoxDepthMapping.input)

stereo.depth.link(spatialDetectionNetwork.inputDepth)
spatialDetectionNetwork.passthroughDepth.link(xoutDepth.input)
counter_all = 0

# Connect to device and start pipeline
with dai.Device(pipeline) as device:

    # Output queues will be used to get the rgb frames and nn data from the outputs defined above
    previewQueue = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)
    detectionNNQueue = device.getOutputQueue(name="detections", maxSize=4, blocking=False)
    xoutBoundingBoxDepthMappingQueue = device.getOutputQueue(name="boundingBoxDepthMapping", maxSize=4, blocking=False)
    depthQueue = device.getOutputQueue(name="depth", maxSize=4, blocking=False)

    startTime = time.monotonic()
    counter = 0
    fps = 0
    color = (255, 255, 255)

    while True:
        counter_all += 1
        print(counter_all, fps, device.getChipTemperature().average)
        inPreview = previewQueue.get()
        inDet = detectionNNQueue.get()
        depth = depthQueue.get()

        frame = inPreview.getCvFrame()
        depthFrame = depth.getFrame()
        depthFrameColor = cv2.normalize(depthFrame, None, 255, 0, cv2.NORM_INF, cv2.CV_8UC1)
        depthFrameColor = cv2.equalizeHist(depthFrameColor)
        depthFrameColor = cv2.applyColorMap(depthFrameColor, cv2.COLORMAP_HOT)

        counter += 1
        current_time = time.monotonic()
        if (current_time - startTime) > 1 :
            fps = counter / (current_time - startTime)
            counter = 0
            startTime = current_time

        detections = inDet.detections
        if len(detections) != 0:
            boundingBoxMapping = xoutBoundingBoxDepthMappingQueue.get()
            roiDatas = boundingBoxMapping.getConfigData()

            for roiData in roiDatas:
                roi = roiData.roi
                roi = roi.denormalize(depthFrameColor.shape[1], depthFrameColor.shape[0])
                topLeft = roi.topLeft()
                bottomRight = roi.bottomRight()
                xmin = int(topLeft.x)
                ymin = int(topLeft.y)
                xmax = int(bottomRight.x)
                ymax = int(bottomRight.y)

                # the original sample passes a font constant here by mistake; the argument is an int line thickness
                cv2.rectangle(depthFrameColor, (xmin, ymin), (xmax, ymax), color, 1)

        # If the frame is available, draw bounding boxes on it and show the frame
        height = frame.shape[0]
        width  = frame.shape[1]
        for detection in detections:
            # Denormalize bounding box
            x1 = int(detection.xmin * width)
            x2 = int(detection.xmax * width)
            y1 = int(detection.ymin * height)
            y2 = int(detection.ymax * height)
            try:
                label = labelMap[detection.label]
            except IndexError:
                label = detection.label
            cv2.putText(frame, str(label), (x1 + 10, y1 + 20), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.putText(frame, "{:.2f}".format(detection.confidence*100), (x1 + 10, y1 + 35), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.putText(frame, f"X: {int(detection.spatialCoordinates.x)} mm", (x1 + 10, y1 + 50), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.putText(frame, f"Y: {int(detection.spatialCoordinates.y)} mm", (x1 + 10, y1 + 65), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
            cv2.putText(frame, f"Z: {int(detection.spatialCoordinates.z)} mm", (x1 + 10, y1 + 80), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)

            cv2.rectangle(frame, (x1, y1), (x2, y2), color, 1)

        cv2.putText(frame, "NN fps: {:.2f}".format(fps), (2, frame.shape[0] - 4), cv2.FONT_HERSHEY_TRIPLEX, 0.4, color)
        cv2.imshow("depth", depthFrameColor)
        cv2.imshow("rgb", frame)

        if cv2.waitKey(1) == ord('q'):
            break

madgrizzle commented 3 years ago

Does this happen right away, or after a period of time while it's running?

Ulfzerk commented 3 years ago

Does this happen right away, or after a period of time while it's running?

After some time, around 8k iterations with detections, as far as I remember.

madgrizzle commented 3 years ago

I have posted a similar issue: https://github.com/luxonis/depthai-experiments/issues/210. It runs for a while and then dies with a similar error. I was thinking it was ImageManip-related, but your code doesn't use it, so I'm not sure anymore.

Ulfzerk commented 3 years ago

While running this example demo with DEPTHAI_LEVEL=DEBUG:

[14442C1091D82CD700] [471.530] [system] [info] Memory Usage - DDR: 74.00 / 359.07 MiB, CMX: 2.34 / 2.50 MiB, LeonOS Heap: 46.90 / 78.63 MiB, LeonRT Heap: 5.28 / 23.84 MiB 

[14442C1091D82CD700] [471.530] [system] [info] Temperatures - Average: 88.75 °C, CSS: 89.16 °C, MSS 88.44 °C, UPA: 90.24 °C, DSS: 87.17 °C 

[14442C1091D82CD700] [471.530] [system] [info] Cpu Usage - LeonOS 52.70%, LeonRT: 31.19% 

[14442C1091D82CD700] [472.532] [system] [info] Memory Usage - DDR: 74.00 / 359.07 MiB, CMX: 2.34 / 2.50 MiB, LeonOS Heap: 46.90 / 78.63 MiB, LeonRT Heap: 5.28 / 23.84 MiB 

[14442C1091D82CD700] [472.532] [system] [info] Temperatures - Average: 88.66 °C, CSS: 88.80 °C, MSS 88.80 °C, UPA: 90.60 °C, DSS: 86.43 °C 

[14442C1091D82CD700] [472.532] [system] [info] Cpu Usage - LeonOS 52.81%, LeonRT: 31.61% 

[2021-11-03 12:32:56.426] [debug] Log thread exception caught: Couldn't read data from stream: '__log' (X_LINK_ERROR) 

[2021-11-03 12:32:56.429] [debug] Timesync thread exception caught: Couldn't read data from stream: '__timesync' (X_LINK_ERROR) 

[2021-11-03 12:32:56.447] [debug] Device about to be closed... 

[2021-11-03 12:32:56.675] [debug] Watchdog thread exception caught: Couldn't write data to stream: '__watchdog' (X_LINK_ERROR) 

[2021-11-03 12:32:58.352] [debug] XLinkResetRemote of linkId: (0) 

[2021-11-03 12:32:58.356] [debug] Device closed, 1908 

Traceback (most recent call last): 

  File "yolo_detection_test_sp.py", line 130, in <module> 

    inPreview = previewQueue.get() 

RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'rgb' (X_LINK_ERROR)' 

I will try usb2Mode=True and a higher delay with cv2.waitKey(...). Update: it didn't help.
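(For reference, the usb2Mode flag mentioned above is passed when constructing the device. A minimal sketch, assuming the pipeline built in the script earlier in the thread; this requires a connected OAK device to actually run:)

```python
import depthai as dai

pipeline = dai.Pipeline()
# ... build the camera/NN/XLinkOut nodes as in the script above ...

# Force the XLink connection to USB2 speed. In depthai 2.x this is the
# usb2Mode constructor flag; newer releases also accept
# maxUsbSpeed=dai.UsbSpeed.HIGH for the same effect.
with dai.Device(pipeline, usb2Mode=True) as device:
    rgbQueue = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)
    # ... run the capture loop here ...
```

USB2 mode trades bandwidth for link robustness, which is why it is a common first test when XLink streams drop.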

Luxonis-Brandon commented 3 years ago

We are working on what could be the same underlying issue. Not 100% sure though. @themarpe is on it.

Ulfzerk commented 3 years ago

We are working on what could be the same underlying issue. Not 100% sure though. @themarpe is on it.

I appreciate it very much. If I may ask, is it a hardware or a software problem? Is it very complicated? What is the estimated timeline for this bugfix? Is there anything I can do to help?

madgrizzle commented 3 years ago

@BlonskiP I ran it on mine (a straight copy from the repo, because I didn't have much time to try your version) and after an hour it was still running. Temps got up to ~58 °C. I don't think it's the ImageManip issue; maybe it is heat-related if you get into the 80 °C range.

Ulfzerk commented 3 years ago

@BlonskiP I ran it on mine (a straight copy from the repo, because I didn't have much time to try your version) and after an hour it was still running. Temps got up to ~58 °C. I don't think it's the ImageManip issue; maybe it is heat-related if you get into the 80 °C range.

@madgrizzle Thank you very much. I will try to get my hands on a Raspberry Pi fan as soon as I can anyway ;)

madgrizzle commented 3 years ago

Maybe a low-profile fan on the heatsink on the other side, where the cameras are, since that part seems to be getting really hot.

Luxonis-Brandon commented 3 years ago

So in terms of the heat of the DepthAI module: 85 °C is not a problem. The DepthAI SoM can run indefinitely at a 105 °C die temperature. That said, I'm not sure whether the Pi temperature could be an issue.

Thoughts on this one @themarpe ?

themarpe commented 3 years ago

Hi @BlonskiP and @madgrizzle, I tried reproducing the issue yesterday on an x86-64 host but didn't succeed.

I am a bit behind on this issue (and the one @madgrizzle brought up), but I think the same issue happens on the device side: a memory corruption, which causes the device to crash. The problem is that it's non-deterministic and rarely occurs at the same place, let alone at the same time, so it's really challenging to catch the source of the issue.

The initial guess was ImageManip, as there is a lot of complexity there which could cause such a bug, but it's not common between these two issues.

I'm still wrapping up ImageManip improvements, and I'll attack these instability issues next week.

Regarding HW vs. SW: the issue I observed testing @madgrizzle's case was SW, but in this case I'm not sure whether the host has any influence (can you reproduce on x86-64, @BlonskiP?). That said, I still lean toward this being the same SW bug.

madgrizzle commented 3 years ago

@themarpe I ran the yolo script as well on an x86 host (the same one I use for gen2-triangulation) and had no problems: it ran for four hours without a glitch. I had spun up an RPi4 when testing the gen2-triangulation issues, so I started the yolo script on it during my lunch break and will check it tonight.

madgrizzle commented 3 years ago

It ran for ~6 hours with no problems on the RPi4.

madgrizzle commented 3 years ago

@themarpe I added some additional code (from gen2-triangulation) for face detection and let it run overnight. When I got up, it had crashed with an rgb stream error. There were no face detections going on (no one in the room... except maybe a ghost?), so the ImageManip crop shouldn't have been called. I'll try running the original script for a long time as well and see if it crashes. I do think there's a memory corruption problem going on, as you suspect, which is unfortunate, as that's one of the hardest types of bugs to find (and I'm assuming it's on the closed-source side of things as well).

Finally, when I took this script and added cropping, face rotation, and face re-identification to the pipeline as well, it crashed within about 20 seconds. It seems that when the pipeline gets busy, the crash happens sooner.

madgrizzle commented 3 years ago

After 12 hours of running the original script on an RPI, I got this:

(screenshot of the crash attached)

SzabolcsGergely commented 3 years ago

Cross posting: https://github.com/luxonis/depthai-experiments/issues/210#issuecomment-962650050

Ulfzerk commented 2 years ago

It looks like this fix has increased stability, but this error still occurs :(

themarpe commented 2 years ago

Hi @BlonskiP, the latest develop includes some additional stability improvements; feel free to test them out.

hipitihop commented 2 years ago

I can confirm I get this error on the following environment:

Host: Ubuntu 20.04.3 LTS with an AMD CPU. Camera 1: OAK-1. Camera 2: OAK-D Lite. Connection: USB-C cable (the one provided with the OAK-D Lite).

Testing with gen2-face-recognition, master branch as of today: python3 main.py --name someone

Camera 1 fails after a few seconds of recognition; it sometimes shows Saving... but not always. Camera 2 fails almost immediately, but with the same error.

python3 main.py --name frog
Creating pipeline...
Creating Color Camera...
Creating Face Detection Neural Network...
Creating Head pose estimation NN
Creating face recognition ImageManip/NN
[14442C10D12853D000] [8.516] [NeuralNetwork(10)] [warning] Network compiled for 4 shaves, maximum available 13, compiling for 6 shaves likely will yield in better performance
[14442C10D12853D000] [8.763] [NeuralNetwork(10)] [warning] The issued warnings are orientative, based on optimal settings for a single network, if multiple networks are running in parallel the optimal settings may vary
Saving face...
Saving face...
Saving face...
Saving face...
[14442C10D12853D000] [17.464] [system] [critical] Fatal error. Please report to developers. Log: 'class' '374'
Traceback (most recent call last):
  File "main.py", line 254, in <module>
    frameIn = frameQ.tryGet()
RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'frame' (X_LINK_ERROR)'

A similar error occurs when just running main.py, as opposed to training with --name.

themarpe commented 2 years ago

@hipitihop can you try using the latest develop library? The posted issue looks similar to the one we've recently made some fixes for. Which library version are you currently using? (Run with DEPTHAI_LEVEL=debug python3 main.py --name frog.)

hipitihop commented 2 years ago

@themarpe Library: Depthai version installed: 2.14.1.0.dev+27fa4519f289498e84768ab5229a1a45efb7e4df

My current setup is as follows: ~/development/depthai/ - master git rev-parse HEAD: 08756f77e885b58421d6d4678782720d3b9f638d ~/development/depthai/depthai-experiments/ - master git rev-parse HEAD: 7382e8be7308e3aab537842dfa17a49f532d03b5

Debug logs attached:

debug-log-oak-d-lite.txt debug-log-oak1.txt

Updated: let me know if this is an incorrect folder structure. I run the install requirements from the top level but run the experiment from within depthai-experiments/gen2-facial-recognition. Also, given this setup, tell me which branch you want me to test with.

Updated: I now see that I'm using the main demo repo (depthai) as opposed to this repo (depthai-python). My bad; I'm not sure what the difference is. Apologies for my muddling.

themarpe commented 2 years ago

@hipitihop The face recognition experiment still has some issues on the latest depthai library. Can you install the one specified alongside it in gen2-face-recognition/requirements.txt (version 2.10)? I think that should work better.

We're looking into this bug in the meantime.

hipitihop commented 2 years ago

@themarpe

Indeed, with the OAK-1 this does not crash. It does not seem to do any saving, but that might just require me to clear the previous data to start fresh for a given --name frog.

As for the OAK-D Lite: with DEPTHAI_LEVEL=debug python3 main.py --name frog it complains that the color camera is not detected but continues; it never displays the window, but is happy to keep reporting temperature, CPU, and memory each second.

[2022-01-19 09:39:32.620] [debug] Python bindings - version: 2.10.0.0 from 2021-08-24 18:49:37 +0300 build: 2021-08-24 17:52:17 +0000
[2022-01-19 09:39:32.620] [debug] Library information - version: 2.10.0, commit: 57bb84ad209825f181744f2308b8ac6f52a37604 from 2021-08-24 18:49:14 +0300, build: 2021-08-24 17:43:07 +0000
[2022-01-19 09:39:32.623] [debug] Initialize - finished
Creating pipeline...
Creating Color Camera...
Creating Face Detection Neural Network...
Creating Head pose estimation NN
Creating face recognition ImageManip/NN
[2022-01-19 09:39:32.687] [debug] Resources - Archive 'depthai-bootloader-fwp-0.0.12.tar.xz' open: 1ms, archive read: 62ms
[2022-01-19 09:39:33.056] [debug] Resources - Archive 'depthai-device-fwp-7131affa2c01ecd34506e9c3dd8ea9198ed874f1.tar.xz' open: 1ms, archive read: 431ms
[2022-01-19 09:39:33.074] [debug] Device - OpenVINO version: 2021.2
[2022-01-19 09:39:33.080] [debug] Patching OpenVINO FW version from 2021.4 to 2021.2
[18443010A1D10A1300] [11.280] [system] [info] Memory Usage - DDR: 0.12 / 358.55 MiB, CMX: 2.09 / 2.50 MiB, LeonOS Heap: 6.26 / 77.56 MiB, LeonRT Heap: 2.83 / 23.94 MiB
[18443010A1D10A1300] [11.280] [system] [info] Temperatures - Average: 37.71 °C, CSS: 39.35 °C, MSS 36.77 °C, UPA: 37.94 °C, DSS: 36.77 °C
[18443010A1D10A1300] [11.280] [system] [info] Cpu Usage - LeonOS 7.40%, LeonRT: 2.06%
....
[18443010A1D10A1300] [11.722] [system] [error] Attempted to start Color camera - NOT detected!
[18443010A1D10A1300] [11.418] [system] [info] ImageManip internal buffer size '203904'B, shave buffer size '20480'B
[18443010A1D10A1300] [11.418] [system] [info] SIPP (Signal Image Processing Pipeline) internal buffer size '156672'B
[18443010A1D10A1300] [11.418] [system] [info] NeuralNetwork allocated resources: shaves: [0-12] cmx slices: [0-12] 
[18443010A1D10A1300] [11.418] [system] [info] ColorCamera allocated resources: no shaves; cmx slices: [13-15] 
[18443010A1D10A1300] [11.418] [system] [info] ImageManip allocated resources: shaves: [15-15] no cmx slices. 
[18443010A1D10A1300] [11.432] [NeuralNetwork(10)] [info] Needed resources: shaves: 4, ddr: 1605632

[18443010A1D10A1300] [11.432] [NeuralNetwork(10)] [warning] Network compiled for 4 shaves, maximum available 13, compiling for 6 shaves likely will yield in better performance
[18443010A1D10A1300] [11.722] [system] [error] Attempted to start Color camera - NOT detected!
[18443010A1D10A1300] [11.475] [DetectionNetwork(3)] [info] Needed resources: shaves: 6, ddr: 2728832

[18443010A1D10A1300] [11.707] [NeuralNetwork(7)] [info] Needed resources: shaves: 6, ddr: 21632

[18443010A1D10A1300] [11.721] [NeuralNetwork(10)] [warning] The issued warnings are orientative, based on optimal settings for a single network, if multiple networks are running in parallel the optimal settings may vary
[18443010A1D10A1300] [11.721] [NeuralNetwork(10)] [info] Inference thread count: 2, number of shaves allocated per thread: 4, number of Neural Compute Engines (NCE) allocated per thread: 1
[18443010A1D10A1300] [11.722] [DetectionNetwork(3)] [info] Inference thread count: 2, number of shaves allocated per thread: 6, number of Neural Compute Engines (NCE) allocated per thread: 1
[18443010A1D10A1300] [11.723] [NeuralNetwork(7)] [info] Inference thread count: 2, number of shaves allocated per thread: 6, number of Neural Compute Engines (NCE) allocated per thread: 1
[18443010A1D10A1300] [12.281] [system] [info] Memory Usage - DDR: 143.71 / 358.55 MiB, CMX: 2.47 / 2.50 MiB, LeonOS Heap: 16.87 / 77.56 MiB, LeonRT Heap: 7.29 / 23.94 MiB
[18443010A1D10A1300] [12.281] [system] [info] Temperatures - Average: 38.94 °C, CSS: 40.28 °C, MSS 38.65 °C, UPA: 38.65 °C, DSS: 38.18 °C
[18443010A1D10A1300] [12.281] [system] [info] Cpu Usage - LeonOS 13.06%, LeonRT: 59.15%
[18443010A1D10A1300] [13.282] [system] [info] Memory Usage - DDR: 143.71 / 358.55 MiB, CMX: 2.47 / 2.50 MiB, LeonOS Heap: 16.87 / 77.56 MiB, LeonRT Heap: 7.29 / 23.94 MiB
Erol444 commented 2 years ago

Hello @hipitihop, I believe that is a different issue: the OAK-D Lite uses camera sensors that weren't supported by the firmware before ~2.11, so an OAK-D Lite running depthai 2.10 will error out the same way on any pipeline - camera not found.

jasonm189 commented 2 years ago

Hello. I tried the latest release, 2.15.1.0, but the crash still happens. Is there an upcoming fix for this issue?

themarpe commented 2 years ago

@BlonskiP we've observed that the CM4 suffers from a thermal issue on the USB hub chip. Can you share more details of your unit? CC: @Luxonis-David

themarpe commented 2 years ago

@jasonm189

Can you share more details: a minimum reproducible example script and the log of a run with DEPTHAI_LEVEL=debug enabled?

jasonm189 commented 2 years ago

@themarpe I used this example as-is: https://github.com/luxonis/depthai-experiments/tree/master/gen2-face-recognition

Luxonis-David commented 2 years ago

@jasonm189 which Luxonis camera/product are you running the examples on? Is it the OAK-D-CM4-PoE or some other camera, e.g. the OAK-D Lite?

jasonm189 commented 2 years ago

@jasonm189 which Luxonis camera/product are you running the examples on? Is it the OAK-D-CM4-PoE or some other camera, e.g. the OAK-D Lite?

OAK-D. The issue happens only with that example; from what I've read, it's a known issue with the Script node. Do you have a list where known issues can be tracked?

Luxonis-David commented 2 years ago

@Erol444 on the above if you have anything like tracking list of issues or you can help with the example.

Erol444 commented 2 years ago

@jasonm189 which Luxonis camera/product are you running the examples on? Is it the OAK-D-CM4-PoE or some other camera, e.g. the OAK-D Lite?

OAK-D. The issue happens only with that example; from what I've read, it's a known issue with the Script node. Do you have a list where known issues can be tracked?

@jasonm189 there was a sporadic error before we changed the Script node's CPU: script.setProcessor(dai.ProcessorType.LEON_CSS). After that change, it hasn't crashed anymore. Are you using the latest version of depthai-experiments?
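(The change mentioned above can be sketched as follows when building a pipeline with a Script node. This is an illustrative fragment, not code from the experiment; the on-device script body is a placeholder, and it requires a connected device to actually run:)

```python
import depthai as dai

pipeline = dai.Pipeline()

script = pipeline.create(dai.node.Script)
# Run the on-device script on the LEON_CSS core instead of the default;
# this was the workaround for the sporadic Script-node crash mentioned above.
script.setProcessor(dai.ProcessorType.LEON_CSS)
script.setScript("""
# placeholder on-device script body
node.warn('script node started')
""")
```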

jasonm189 commented 2 years ago

@jasonm189 which Luxonis camera/product are you running the examples on? Is it the OAK-D-CM4-PoE or some other camera, e.g. the OAK-D Lite?

OAK-D. The issue happens only with that example; from what I've read, it's a known issue with the Script node. Do you have a list where known issues can be tracked?

@jasonm189 there was a sporadic error before we changed the Script node's CPU: script.setProcessor(dai.ProcessorType.LEON_CSS). After that change, it hasn't crashed anymore. Are you using the latest version of depthai-experiments?

Yes, it still crashes after 30+ minutes.

Luxonis-Brandon commented 2 years ago

@jasonm189 - Sorry about the trouble. Given that your setup seems to be the only remaining crashing case here, could you make a new issue so we can have all the details of the setup in one place? And then tag @Erol444 and me in it (and this issue)?

kazyam53 commented 2 years ago

In my environment, the tiny_yolo v4 sample works with depthai library version 2.14, but from 2.15 it doesn't. On 2.18 it raised the error below:

[184430102152AC1200] [1.7] [281.387] [system] [warning] ColorCamera IMX214: capping FPS for selected resolution to 35
Traceback (most recent call last):
  File "tiny_yolo.py", line 151, in <module>
    inRgb = qRgb.get()
RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'rgb' (X_LINK_ERROR)'

themarpe commented 2 years ago

@kazyam53 Do you mind opening a separate issue and also describing which device (I assume an OAK-D Lite) and which host you are using? Thanks!

kazyam53 commented 2 years ago

@themarpe OK, I opened a new issue: https://github.com/luxonis/depthai-python/issues/691

themarpe commented 2 years ago

Addressed by https://github.com/luxonis/depthai-core/pull/616

Reran gen2-face-detection from depthai-experiments overnight; it ran for 7 hours without issues.