Open diablodale opened 2 years ago
Odd. Thanks for writing up. Asking.
I've been thinking...in firmware and hw...what is the generator/incrementor of the sequence numbers? If these numbers are somehow generated "virtually" by the sensors themselves, then if the output rates of multiple sensors are not exactly the same, there will eventually be drift.
StereoDepth node...does it always wait for matching seq num frames? Or what happens if one mono camera is 29.8 fps and the second mono is 30.0fps? I programmatically set everything to 30fps. But....the 3A firmware can override this and change fps in certain situations.
In my repro above, there are three sensors...each perhaps slightly at a different rate and therefore will drift over time. The specific method by which sequence numbers are generated and assigned to the output of these three sensors may lead to OP behavior. Perhaps one (or two) of the mono sensors is not as light sensitive as the other, so 3A adjusts the exposure time which leads to slightly different fps.
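To put a rough number on the drift idea above, here is a minimal sketch. It assumes each sensor's counter simply increments once per produced frame and that the sensors free-run with no resynchronization (both are assumptions, not confirmed firmware behavior). The 29.8 and 30.0 fps figures are the hypothetical rates from my question, not measurements, and the function names are made up for illustration.

```python
def seconds_until_frame_slip(fps_a: float, fps_b: float, frames: float = 1.0) -> float:
    """Time for two free-running frame counters to diverge by `frames` frames."""
    rate_gap = abs(fps_a - fps_b)  # frames of divergence gained per second
    if rate_gap == 0:
        return float("inf")  # identical rates never drift apart
    return frames / rate_gap


def frame_gap_after(fps_a: float, fps_b: float, elapsed_s: float) -> float:
    """Divergence, in frames, between the two counters after `elapsed_s` seconds."""
    return abs(fps_a - fps_b) * elapsed_s


if __name__ == "__main__":
    # Hypothetical rates taken from the question above.
    print(seconds_until_frame_slip(29.8, 30.0))
    print(frame_gap_after(29.8, 30.0, 60.0))
```

Under these assumptions a 0.2 fps mismatch costs one whole frame about every five seconds, so a lag that becomes visible within a minute would be plausible.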
I did not test my two OAK-D-lite sensors in this way before I calibrated one of them. So this lag (or exposure time?) may have always been there. Or perhaps the calibration code in the lite-calibration branch changed some setting in that one OAK-D-Lite which is exposing this lag.
Thoughts? 🤔
I changed my code to use setQueueSize(1) and setBlocking(false) on all node inputs (nn, imagemanip, stereodepth, xout, etc.) per https://discord.com/channels/790680891252932659/924798503270625290/961835869352886292
No improvement. ☹ One OAK-D-Lite continues to drift drastically out of sync. I also noticed...
- Bad sync sensor sends seq num matched frames at almost 30fps
- Correct sync sensor sends seq num matched frames at ~25.5fps
Yet both sensors run the same codepath in memory with exactly the same api calls and settings.🤪
I think 3A might be somehow involved in this seq num lag problem. To test this....
I set mono exposure and sensitivity to 28000us 1000iso, 28000us 1200iso, and 28000us 1600iso with color always setAutoExposureEnable(); setAutoExposureLock(false).
All of those quickly demonstrated the sync/lag problems as in the video.
Alternatively, I set color to 28000us 1600iso with mono always setAutoExposureEnable(); setAutoExposureLock(false).
🤯 No lag or sync problems with 10 minutes of testing.
When StereoDepth::setDepthAlign(RECTIFIED_RIGHT) is set, is there any possibility that the firmware/hw needs or tracks color frames, dependent on their timing, etc.?
The video above suggests there is a 1->16 frame buffer/queue somewhere. Even though I have set queue=1/noblock, it seems that there is still a queue somewhere that is holding old frames and writing incorrect sequence numbers on them. The video doesn't seem to have dropped frames until my console errors start happening (which indicates my host-side queue is now dropping). Up until that point, the video shows progressive lag: the frames are all there, but each is progressively older. I don't visibly see out-of-order frames or stuttering frames.
AFAIK: all 3 cameras are started at the same time. The RGB camera frame time is adjusted periodically to do a soft synchronization against the stereo cameras, but only if the RGB FPS matches the stereo FPS. The stereo cameras are started through the broadcast address and kept in sync. The FPS of the mono cameras is determined by the FPS of the first MonoCamera node (a limitation in the current version). E.g. setting the first to 10 and the second to 40 means that both will run at 10. The sequence number is incremented monotonically when a new frame is produced by the camera.
Questions
- How does Experiment-2 fixing color camera exposure and sensitivity lead to depth frames not stamped with old sequence numbers?
RGB frame time is adjusted by at most 2ms each time in the direction of mono timestamp (either up if lagging behind, or down if it's ahead), when the difference between RGB and mono frame is more than 70 us. This 2 ms maximum could explain the slow/inconsistent convergence together with exposure adjustment which alters frame time.
- When StereoDepth::setDepthAlign(RECTIFIED_RIGHT) is set, is there any possibility that the firmware/hw needs or tracks color frames, dependent on their timing, etc.?
No. Stereo depth alignment doesn't depend at all on RGB camera, it depends only on RGB calibration, if RGB alignment is enabled.
- Two mono cameras feed into StereoDepth. If one of those two mono cameras doesn't give frames to StereoDepth at the same rate as its sibling, then how does StereoDepth behave?
They are forced to the same FPS as described above.
- Does it wait on matching seq numbers and discard newer until it finally gets the matching older frame?
There's an internal sync node based on timestamps, not sequence numbers.
- Does stereodepth process with a pair (one old, one new)? Will it substitute or duplicate inputs in any situation?
No, it always checks the left and right timestamps; if the difference between them is less than 16.666 ms (hardcoded in the current implementation), the pair is considered synced.
- And when (and if) StereoDepth processes any pair...which Sequence Number does it stamp on the depth and disparity frames? Newest, oldest, left, right? And if the two inputs are ever not exactly the same in seqnum and time, then the stamped seq number for output will be "incorrect" for a significant portion of situations.
If it's aligned to CENTER/RIGHT then right frame, else left frame. In case of RGB alignment it doesn't inherit the sequence number of RGB frame AFAIK.
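The pairing rule from the answers above can be modeled on the host as a minimal sketch. Only the 16.666 ms threshold and the "right frame if aligned to CENTER/RIGHT, else left frame" rule come from the replies; `FrameMeta`, `is_synced`, and `output_sequence_number` are hypothetical names for illustration and are not depthai API or the actual firmware code.

```python
from dataclasses import dataclass
from datetime import timedelta

# Threshold quoted in the reply above (hardcoded in the implementation at the time).
SYNC_THRESHOLD = timedelta(milliseconds=16.666)


@dataclass(frozen=True)
class FrameMeta:
    """Hypothetical stand-in for the metadata carried by a mono frame."""
    sequence_num: int
    timestamp: timedelta  # time since device boot


def is_synced(left: FrameMeta, right: FrameMeta,
              threshold: timedelta = SYNC_THRESHOLD) -> bool:
    """Pair is considered synced when the timestamps differ by less than the threshold."""
    return abs(left.timestamp - right.timestamp) < threshold


def output_sequence_number(left: FrameMeta, right: FrameMeta,
                           align_to_right: bool) -> int:
    """Depth output inherits the right frame's number for CENTER/RIGHT alignment, else the left's."""
    return right.sequence_num if align_to_right else left.sequence_num
```

Note that in this model two frames with different sequence numbers can still pair up as long as their timestamps are close, so the sequence number stamped on the depth output depends only on the alignment setting.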
@alex-luxonis can correct me on any of the above questions.
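The soft-sync rule described above (at most 2 ms per adjustment, only when the RGB/mono difference exceeds 70 us) can be sketched as a small simulation. This is an illustration under assumptions, not the firmware logic: how often an adjustment happens is not stated in the thread, and `disturbance_us` is a hypothetical stand-in for frame-time changes caused by exposure adjustment.

```python
from typing import Optional

MAX_STEP_US = 2_000   # "at most 2ms each time", from the reply above
DEADBAND_US = 70      # adjust only when the difference exceeds 70 us


def soft_sync_step(offset_us: int) -> int:
    """Return the RGB-minus-mono offset after one adjustment toward zero."""
    if abs(offset_us) <= DEADBAND_US:
        return offset_us  # close enough, no adjustment
    step = min(abs(offset_us), MAX_STEP_US)
    return offset_us - step if offset_us > 0 else offset_us + step


def adjustments_to_converge(offset_us: int, disturbance_us: int = 0,
                            limit: int = 10_000) -> Optional[int]:
    """Count adjustments until the offset is inside the deadband.

    `disturbance_us` (hypothetical) is added after every adjustment to mimic
    a frame time that keeps changing. Returns None if `limit` is reached.
    """
    count = 0
    while abs(offset_us) > DEADBAND_US:
        if count >= limit:
            return None
        offset_us = soft_sync_step(offset_us) + disturbance_us
        count += 1
    return count
```

In this toy model a 15 ms starting offset needs several adjustments to close, and a disturbance as large as the maximum step keeps it from ever converging, which is one way to picture the "slow/inconsistent convergence" mentioned in the reply.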
Hi. My Lite sensors continue to be problematic, and behave significantly differently than all my other OAK sensors. I can easily repro this time/lag issue with my app. This seems to be something complex or hw so I've been trying to simplify....now I have something. 🧐
I've modified gen2-syncing\host-frame-sync.py at https://github.com/diablodale/depthai-experiments/tree/repro-440 to reproduce a failure where the fps drops from its full rate of almost 30fps down to ~3fps.
This failure occurs 80% of the time on my OAK-D-Lite sensors.
It never occurs on my OAK-D.
Here is a video of the failure https://www.youtube.com/watch?v=NdiF_eFxCeg The stream of right-mono frames starts ok for maybe a second then quickly reduces to ~3fps and never recovers. I tried 3 different usb-c cables (2 were usb-a, 1 was usb-c both sides) and different ports on computer. Same failures on all.
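For anyone wanting to quantify the drop rather than eyeball it, here is a minimal sketch of a sliding-window fps meter. `FpsMeter` is a hypothetical helper, not part of the repro script; the assumption is that each received frame's timestamp (for example from the frame's `getTimestamp()`, converted to seconds) is passed to `add()`.

```python
from collections import deque


class FpsMeter:
    """Estimate frames per second from the most recent frame timestamps."""

    def __init__(self, window: int = 30) -> None:
        self._stamps = deque(maxlen=window)  # oldest entries fall off automatically

    def add(self, timestamp_s: float) -> None:
        """Record the timestamp (in seconds) of a newly received frame."""
        self._stamps.append(timestamp_s)

    def fps(self) -> float:
        """Intervals in the window divided by the time they span."""
        if len(self._stamps) < 2:
            return 0.0
        span = self._stamps[-1] - self._stamps[0]
        return (len(self._stamps) - 1) / span if span > 0 else 0.0
```

Using device timestamps rather than host arrival times should help separate a sensor that really produces ~3 fps from frames that are produced at full rate but delivered late.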
I don't see anything obvious in debug log. This was taken during this 3fps failure...
[2022-06-13 20:44:19.700] [debug] Python bindings - version: 2.15.5.0 from 2022-05-28 22:43:57 +0200 build: 2022-05-28 21:31:38 +0000
[2022-06-13 20:44:19.701] [debug] Library information - version: 2.15.5, commit: cedff3f6419e07dca5fd9c5a087432f665613db8 from 2022-05-28 22:40:44 +0200, build: 2022-05-28 21:31:23 +0000
[2022-06-13 20:44:19.703] [debug] Initialize - finished
[2022-06-13 20:44:19.722] [debug] Device - OpenVINO version: 2021.4
[2022-06-13 20:44:19.722] [debug] Device - BoardConfig: {"gpio":[],"network":{"mtu":0,"xlinkTcpNoDelay":true},"pcieInternalClock":null,"sysctl":[],"uart":[],"usb":{"flashBootedPid":63037,"flashBootedVid":999,"maxSpeed":4,"pid":63035,"vid":999},"usb3PhyInternalClock":null,"watchdogInitialDelayMs":null,"watchdogTimeoutMs":null}
libnop:
0000: b9 09 b9 05 81 e7 03 81 3b f6 81 e7 03 81 3d f6 04 b9 02 00 01 ba 00 be be bb 00 bb 00 be be
[2022-06-13 20:44:19.788] [debug] Resources - Archive 'depthai-bootloader-fwp-0.0.18+c555ac2fb184b801291c95f7f73d23bf4dd42cf1.tar.xz' open: 2ms, archive read: 83ms
[2022-06-13 20:44:20.059] [debug] Resources - Archive 'depthai-device-fwp-12ab42f2923f30cedbf129fbb765ae7ca973bdba.tar.xz' open: 2ms, archive read: 353ms
[1844301041CE6C1200] [10.245] [system] [info] Memory Usage - DDR: 0.12 / 341.19 MiB, CMX: 2.05 / 2.50 MiB, LeonOS Heap: 7.22 / 78.13 MiB, LeonRT Heap: 2.88 / 41.43 MiB
[1844301041CE6C1200] [10.245] [system] [info] Temperatures - Average: 35.59 °C, CSS: 36.30 °C, MSS 35.59 °C, UPA: 35.11 °C, DSS: 35.35 °C
[1844301041CE6C1200] [10.245] [system] [info] Cpu Usage - LeonOS 9.62%, LeonRT: 2.38%
[2022-06-13 20:44:22.615] [debug] Schema dump: {"connections":[{"node1Id":1,"node1Output":"out","node1OutputGroup":"","node2Id":0,"node2Input":"in","node2InputGroup":""}],"globalProperties":{"calibData":null,"cameraTuningBlobSize":null,"cameraTuningBlobUri":"","leonCssFrequencyHz":700000000.0,"leonMssFrequencyHz":700000000.0,"pipelineName":null,"pipelineVersion":null,"xlinkChunkSize":-1},"nodes":[[0,{"id":0,"ioInfo":[[["","in"],{"blocking":true,"group":"","name":"in","queueSize":8,"type":3,"waitForMessage":true}]],"name":"XLinkOut","properties":[185,3,136,0,0,128,191,189,5,114,105,103,104,116,0]}],[1,{"id":1,"ioInfo":[[["","inputControl"],{"blocking":true,"group":"","name":"inputControl","queueSize":8,"type":3,"waitForMessage":false}],[["","out"],{"blocking":false,"group":"","name":"out","queueSize":8,"type":0,"waitForMessage":false}],[["","raw"],{"blocking":false,"group":"","name":"raw","queueSize":8,"type":0,"waitForMessage":false}]],"name":"MonoCamera","properties":[185,5,185,20,0,3,0,185,3,0,0,0,185,5,0,0,0,0,0,185,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,255,3,136,0,0,240,65]}]]}
[2022-06-13 20:44:22.616] [debug] Asset map dump: {"map":{}}
[1844301041CE6C1200] [10.258] [MonoCamera(1)] [info] Using board socket: 2, id: 2
[1844301041CE6C1200] [10.258] [system] [info] SIPP (Signal Image Processing Pipeline) internal buffer size '16384'B
[1844301041CE6C1200] [10.293] [system] [info] MonoCamera allocated resources: no shaves; cmx slices: [13-15]
[1844301041CE6C1200] [11.246] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [11.246] [system] [info] Temperatures - Average: 36.00 °C, CSS: 37.01 °C, MSS 35.35 °C, UPA: 35.59 °C, DSS: 36.06 °C
[1844301041CE6C1200] [11.246] [system] [info] Cpu Usage - LeonOS 11.96%, LeonRT: 5.04%
[1844301041CE6C1200] [12.247] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [12.247] [system] [info] Temperatures - Average: 36.00 °C, CSS: 37.24 °C, MSS 35.59 °C, UPA: 35.83 °C, DSS: 35.35 °C
[1844301041CE6C1200] [12.247] [system] [info] Cpu Usage - LeonOS 0.40%, LeonRT: 0.18%
[1844301041CE6C1200] [13.248] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [13.248] [system] [info] Temperatures - Average: 35.53 °C, CSS: 35.83 °C, MSS 35.83 °C, UPA: 35.11 °C, DSS: 35.35 °C
[1844301041CE6C1200] [13.248] [system] [info] Cpu Usage - LeonOS 0.88%, LeonRT: 0.21%
[1844301041CE6C1200] [14.249] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [14.249] [system] [info] Temperatures - Average: 35.88 °C, CSS: 36.77 °C, MSS 35.35 °C, UPA: 35.59 °C, DSS: 35.83 °C
[1844301041CE6C1200] [14.249] [system] [info] Cpu Usage - LeonOS 0.40%, LeonRT: 0.19%
[1844301041CE6C1200] [15.250] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [15.250] [system] [info] Temperatures - Average: 35.82 °C, CSS: 37.48 °C, MSS 34.88 °C, UPA: 35.35 °C, DSS: 35.59 °C
[1844301041CE6C1200] [15.250] [system] [info] Cpu Usage - LeonOS 0.74%, LeonRT: 0.22%
[1844301041CE6C1200] [16.251] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [16.251] [system] [info] Temperatures - Average: 36.12 °C, CSS: 37.24 °C, MSS 35.59 °C, UPA: 35.59 °C, DSS: 36.06 °C
[1844301041CE6C1200] [16.251] [system] [info] Cpu Usage - LeonOS 0.31%, LeonRT: 0.18%
[1844301041CE6C1200] [17.252] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [17.252] [system] [info] Temperatures - Average: 36.24 °C, CSS: 37.48 °C, MSS 36.06 °C, UPA: 35.83 °C, DSS: 35.59 °C
[1844301041CE6C1200] [17.252] [system] [info] Cpu Usage - LeonOS 0.78%, LeonRT: 0.23%
[1844301041CE6C1200] [18.253] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [18.253] [system] [info] Temperatures - Average: 36.06 °C, CSS: 37.24 °C, MSS 35.59 °C, UPA: 35.83 °C, DSS: 35.59 °C
[1844301041CE6C1200] [18.253] [system] [info] Cpu Usage - LeonOS 0.31%, LeonRT: 0.18%
[1844301041CE6C1200] [19.254] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [19.254] [system] [info] Temperatures - Average: 36.24 °C, CSS: 37.24 °C, MSS 34.88 °C, UPA: 36.30 °C, DSS: 36.53 °C
[1844301041CE6C1200] [19.254] [system] [info] Cpu Usage - LeonOS 0.92%, LeonRT: 0.21%
[1844301041CE6C1200] [20.255] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [20.255] [system] [info] Temperatures - Average: 36.30 °C, CSS: 37.24 °C, MSS 36.06 °C, UPA: 36.06 °C, DSS: 35.83 °C
[1844301041CE6C1200] [20.255] [system] [info] Cpu Usage - LeonOS 0.49%, LeonRT: 0.19%
[1844301041CE6C1200] [21.256] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [21.256] [system] [info] Temperatures - Average: 36.12 °C, CSS: 37.48 °C, MSS 35.11 °C, UPA: 35.59 °C, DSS: 36.30 °C
[1844301041CE6C1200] [21.256] [system] [info] Cpu Usage - LeonOS 0.84%, LeonRT: 0.22%
[1844301041CE6C1200] [22.257] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [22.257] [system] [info] Temperatures - Average: 36.24 °C, CSS: 36.77 °C, MSS 36.30 °C, UPA: 35.59 °C, DSS: 36.30 °C
[1844301041CE6C1200] [22.257] [system] [info] Cpu Usage - LeonOS 0.36%, LeonRT: 0.18%
[1844301041CE6C1200] [23.258] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [23.258] [system] [info] Temperatures - Average: 36.30 °C, CSS: 36.77 °C, MSS 36.30 °C, UPA: 35.83 °C, DSS: 36.30 °C
[1844301041CE6C1200] [23.258] [system] [info] Cpu Usage - LeonOS 0.78%, LeonRT: 0.23%
[1844301041CE6C1200] [24.259] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [24.259] [system] [info] Temperatures - Average: 36.30 °C, CSS: 37.24 °C, MSS 35.59 °C, UPA: 36.06 °C, DSS: 36.30 °C
[1844301041CE6C1200] [24.259] [system] [info] Cpu Usage - LeonOS 0.31%, LeonRT: 0.18%
[1844301041CE6C1200] [25.260] [system] [info] Memory Usage - DDR: 2.12 / 341.19 MiB, CMX: 2.07 / 2.50 MiB, LeonOS Heap: 20.93 / 78.13 MiB, LeonRT Heap: 3.31 / 41.43 MiB
[1844301041CE6C1200] [25.260] [system] [info] Temperatures - Average: 36.30 °C, CSS: 36.53 °C, MSS 35.59 °C, UPA: 36.77 °C, DSS: 36.30 °C
[1844301041CE6C1200] [25.260] [system] [info] Cpu Usage - LeonOS 0.66%, LeonRT: 0.18%
[2022-06-13 20:44:37.927] [debug] Device about to be closed...
[2022-06-13 20:44:38.045] [debug] DataOutputQueue (right) closed
[2022-06-13 20:44:38.045] [debug] Log thread exception caught: Couldn't read data from stream: '__log' (X_LINK_ERROR)
[2022-06-13 20:44:38.045] [debug] Timesync thread exception caught: Couldn't read data from stream: '__timesync' (X_LINK_ERROR)
[2022-06-13 20:44:38.513] [debug] XLinkResetRemote of linkId: (0)
[2022-06-13 20:44:38.562] [debug] Device closed, 634
Traceback (most recent call last):
  File "C:\repos-nobackup\depthai-experiments\gen2-syncing\host-frame-sync.py", line 156, in <module>
    ps.add_packet(q[c].tryGet(), c)
KeyboardInterrupt
^C
Thanks for the heads up on this and the detailed report. I'm not sure if you intended to have a video linked as well. Just mentioning in case that was missed.
@alex-luxonis - do we need to enable hardware sync on OV7251 maybe?
Oops, added video link to above post.
Thanks. We're fighting the good fight on some other underlying things (namely some firmware flashing regressions) but we will be digging into this ASAP.
Any update on this @alex-luxonis? multi_cam_support?
I have evidence of one of my two OAK-D-Lite sensors sending frames of data to the host where the timestamps are progressively out-of-sync (even more than 1 second) yet have the same sequence number.
I received these two sensors in the same Kickstarter pledge. The only difference I know of is that the sensor with the progressive lag was calibrated with the script from the lite-calibration branch. My other D-Lite has its factory calibration.
Setup
Repro
It is in my C++ app, so the setup is complex. I have not been able to isolate this lag issue using python. Here is a general description.
Result
My 1st OAK-D-Lite that was calibrated has problems. The visual inspection of the depth and color frames starts in sync. But in less than a minute, the depth is visually lagging. The depth will visually progressively lag more and more behind the color. Video at https://www.youtube.com/watch?v=ITx32-ZfTak
The console output shows the time delta of the sync'd frames progressively increase. Yet these depth+color paired frames have the same sequence number.
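The kind of check behind that console output can be sketched as follows. This is not the code from my C++ app; `lag_report` and the 17 ms default threshold (roughly half a frame period at 30 fps) are hypothetical choices for illustration. The input is assumed to be tuples of sequence number and device timestamp for each paired depth and color frame.

```python
from datetime import timedelta
from typing import Iterable, List, Tuple

# (depth_seq, depth_ts, color_seq, color_ts)
Pair = Tuple[int, timedelta, int, timedelta]


def lag_report(pairs: Iterable[Pair],
               max_delta: timedelta = timedelta(milliseconds=17)
               ) -> List[Tuple[int, float]]:
    """Return (sequence number, delta in ms) for seq-matched pairs whose timestamps disagree.

    A negative delta means the depth frame is older than the color frame.
    """
    flagged = []
    for depth_seq, depth_ts, color_seq, color_ts in pairs:
        if depth_seq != color_seq:
            continue  # only pairs that claim to be the same frame are of interest
        delta = depth_ts - color_ts
        if abs(delta) > max_delta:
            flagged.append((depth_seq, delta / timedelta(milliseconds=1)))
    return flagged
```

A healthy sensor should produce an empty report; on the problem sensor the reported deltas would be expected to grow over time while the sequence numbers keep matching.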
My 2nd OAK-D-Lite which was not calibrated does not have problems. Visually the frames are synchronized and the time diff shows slight oscillation around zero. e.g.
Expected
No lag.
Notes