Open Wallbraker opened 2 years ago
@Wallbraker thanks for the report. Does this issue happen when the application exits or even before reading the first message? By the looks of it, the message reading part looks okay, only freeing the data after it was copied into the message.
In this case, however, it seems that XLink gets a disconnect request, which causes the freeing of the incoming data. (See "freed by thread T49" here:)
How do you manage the returned `Device` object? It encapsulates the connection to the device and must be kept alive for the duration of the communication between the host and the device.
@themarpe Thank you for your reply.
I looked into it and while we use a thread to get the images, we stop the thread before deleting the device.
When I add an explicit call to `close` on the queue just before deleting the device, the problem goes away. I would like to point out that none of your examples call `close` on the queue, so I would not expect this to be required.
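For readers following along, here is a minimal sketch of the shutdown order described above: close the queue so a blocked reader wakes up, join the reader thread, and only then let the owner go out of scope. `ClosableQueue` and `consumeUntilClosed` are hypothetical stand-ins written for illustration. They are not depthai's `DataOutputQueue` or `Device`, and the sketch does not claim to reproduce the library's internals.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical blocking queue whose close() wakes any waiting reader.
template <typename T>
class ClosableQueue {
public:
    // Adds an item unless the queue has already been closed.
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (closed_) return;
            items_.push_back(std::move(value));
        }
        cv_.notify_one();
    }

    // Blocks until an item is available or the queue is closed.
    // Returns false only when the queue is closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return closed_ || !items_.empty(); });
        assert(closed_ || !items_.empty());
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    // Marks the queue closed and wakes every blocked reader.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<T> items_;
    bool closed_ = false;
};

// Pushes `frames` items, then shuts down in a safe order:
// close the queue, join the reader, and only then destroy the queue.
int consumeUntilClosed(int frames) {
    ClosableQueue<int> queue;
    int consumed = 0;
    std::thread reader([&] {
        int frame = 0;
        while (queue.pop(frame)) ++consumed;
    });
    for (int i = 0; i < frames; ++i) queue.push(i);
    queue.close();   // reader drains what is left, then pop() returns false
    reader.join();   // no thread touches the queue after this point
    return consumed;
}
```

The point of the design is that nothing is freed while a reader can still be inside `pop()`; whether depthai needs the explicit close is exactly what this thread is trying to establish.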
@Wallbraker sorry for going around here - I think that this actually relates to some use case "bug" which was fixed in latest develop.
Can you retest without an explicit `close` on the latest `develop` branch? (We'll be releasing 2.11.0 soon as well.)
Hello. I've isolated scenarios where definitely `parseMessageToADatatype()` and possibly `parseMessage()` are given a non-null `streamPacketDesc_t* packet`, but the struct it points to is `{null, 0}`.
This causes `readIntLE()` to return nonsense values (in my case it always returns 171), which then incorrectly pass the "bad packet" test. The code then proceeds to index into unallocated memory and to do other things, such as trying to initialize a `std::vector` from a nullptr.
The fix is to validate parameters. At the top of both of those `parsexxxx` functions, put:

    if (!packet->data || !packet->length) {
        throw std::runtime_error("Bad packet, couldn't parse");
    }
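As an illustration of the validation suggested above, here is a small self-contained sketch. `PacketDesc`, `readU32LE`, and `readTrailingLength` are hypothetical names, and the layout (a 4-byte little-endian length field at the end of the buffer) is an assumption made for the example, not necessarily depthai's actual wire format. It goes slightly further than the snippet above by also rejecting a null descriptor and a buffer too short to hold the field.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a packet descriptor: a data pointer plus a length.
struct PacketDesc {
    const std::uint8_t* data;
    std::uint32_t length;
};

// Reads a 32-bit little-endian unsigned integer from four bytes.
inline std::uint32_t readU32LE(const std::uint8_t* p) {
    assert(p != nullptr);  // callers must validate before reading
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Validates the descriptor, then returns the trailing length field.
// Throws instead of reading through a null or undersized buffer.
std::uint32_t readTrailingLength(const PacketDesc* packet) {
    constexpr std::uint32_t kFieldSize = 4;
    if (packet == nullptr || packet->data == nullptr || packet->length < kFieldSize) {
        throw std::runtime_error("Bad packet, couldn't parse");
    }
    const std::uint32_t payloadSize = packet->length - kFieldSize;
    const std::uint32_t declared = readU32LE(packet->data + payloadSize);
    // The declared size must fit in the bytes that precede the field.
    if (declared > payloadSize) {
        throw std::runtime_error("Bad packet, couldn't parse");
    }
    return declared;
}
```

With a check like this, a `{null, 0}` descriptor is rejected before any byte is read, so the later "bad packet" test never sees a value computed from invalid memory.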
This I've isolated and reproduced in MSVC. And looking at the stack in the OP it matches there also.
@Wallbraker there are bugs in depthai regarding threads and ownership. I've found the bugs with `Connection`, and you are right to question `Device`. https://github.com/luxonis/depthai-core/issues/257
@themarpe I have so far not gotten this error on v2.13.3 after removing the explicit `queue->close();` call. I will do some more experiments to see if I can catch it, and close this issue if not.
Spoke too soon, just got one. Basically the same backtraces.
I'm running the camera at 118 FPS, if that might help you trigger it.
I've already found the cause of this: it is issue #257. The failure ASan lists in the OP, `#11 0x7fbc566b3f5f in dai::StreamMessageParser::parseMessageToADatatype`, is a direct result of the bugs in issue #257. There is no fix other than to resolve that. You can work around the problems using the branch I list in the issue, but it will only work on Windows, because Linux doesn't have a way to catch SEGFAULTs while also maintaining C++ stack unwinds.
Ah thank you for the information, I'm currently working around it by closing the queue.
I'm running into an intermittent issue that ASan found. I'm pretty sure I'm not at fault here; ASan doesn't complain about anything else. This issue seems to happen only with the RGB camera, not the mono cameras.
This is with 2.10.0 compiled by myself.