Open twvd opened 7 years ago
Looks like you may have some strange packets coming in..
Since your length field only includes the 0 at the beginning of the payload, I'm guessing the rest of the packet is uninitialized memory from previous uses of the same packet memory. Do the first bytes of the payload ([0x75, 0x61, 0xd0...]) look familiar?
Having a single 0 byte AD data structure is technically legal according to the BLE spec, but shouldn't be produced by any mesh device. Could this be some other BLE device in your vicinity?
I'll mark this issue as a bug, as those packets shouldn't make it all the way to your application anyway. The correct fix is changing transport_control.c line 372 to `if (p_mesh_adv_data != NULL && p_mesh_adv_data->adv_data_length >= MESH_PACKET_ADV_OVERHEAD)`, alternatively a similar fix here
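For readers following along, the guard described above can be sketched in isolation. This is a minimal sketch, not the code from transport_control.c: the struct `example_mesh_adv_data_t`, the function `example_adv_data_is_usable`, and the numeric value given to `MESH_PACKET_ADV_OVERHEAD` are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value, for illustration only; the real definition lives in the mesh headers. */
#define MESH_PACKET_ADV_OVERHEAD 7u

/* Hypothetical, simplified stand-in for the mesh advertisement data struct. */
typedef struct
{
    uint8_t adv_data_length; /* length field of the AD structure */
    uint8_t adv_data_type;   /* AD type byte */
    uint8_t payload[29];     /* remaining bytes of the AD structure */
} example_mesh_adv_data_t;

/* Returns true only when the AD structure exists and its length field is
 * large enough to cover the mesh header. Shorter packets are rejected
 * before any payload length is derived from the field. */
bool example_adv_data_is_usable(const example_mesh_adv_data_t *p_mesh_adv_data)
{
    return p_mesh_adv_data != NULL &&
           p_mesh_adv_data->adv_data_length >= MESH_PACKET_ADV_OVERHEAD;
}
```

With this check in place, a packet whose length field is 0 is dropped instead of being passed on to the application.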
Since we produce quite a few BLE devices in this building, it could very well be a non-mesh device producing this. I'll apply the fix you described.
I'm curious though - what happens to packets where the first byte does exceed MESH_PACKET_ADV_OVERHEAD? Is there any secondary error checking to confirm the packet is actually a valid mesh packet to prevent the bogus data propagating over the mesh?
The radio control module uses the hardware CRC check, as per the BLE specification, to remove any on-air bitflip errors, which are quite common. We also search for the correct AD type, with a Nordic-owned 16-bit service UUID, and ensure that we don't go beyond the max length (causing overflow) over in mesh_packet.c.
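The AD type, UUID, and length checks mentioned above can be illustrated roughly as follows. This is a hedged sketch, not the code in mesh_packet.c: the function name `example_find_service_data` and both macros are made up for this example, and the UUID is left as a caller-supplied parameter. It assumes the standard AD structure layout (length byte, type byte, data) and the "Service Data - 16-bit UUID" AD type 0x16.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_AD_TYPE_SERVICE_DATA_16BIT 0x16u /* "Service Data - 16-bit UUID" */
#define EXAMPLE_ADV_PAYLOAD_MAX_LEN        31u   /* legacy advertising payload limit */

/* Walks the AD structures in an advertising payload and returns a pointer to
 * the first Service Data structure that carries `uuid`, or NULL when the
 * payload is malformed or holds no such structure. */
const uint8_t *example_find_service_data(const uint8_t *p_adv, size_t adv_len, uint16_t uuid)
{
    if (p_adv == NULL || adv_len > EXAMPLE_ADV_PAYLOAD_MAX_LEN)
    {
        return NULL;
    }

    size_t i = 0;
    while (i < adv_len)
    {
        uint8_t ad_len = p_adv[i]; /* counts the type byte plus the data bytes */

        if (ad_len == 0)
        {
            return NULL; /* zero length: nothing more to parse */
        }
        if (i + 1u + ad_len > adv_len)
        {
            return NULL; /* structure would run past the end of the buffer */
        }
        if (ad_len >= 3u && p_adv[i + 1u] == EXAMPLE_AD_TYPE_SERVICE_DATA_16BIT)
        {
            /* The 16-bit UUID is transmitted little-endian. */
            uint16_t found = (uint16_t)(p_adv[i + 2u] | (p_adv[i + 3u] << 8));
            if (found == uuid)
            {
                return &p_adv[i];
            }
        }
        i += 1u + ad_len; /* advance to the next AD structure */
    }
    return NULL;
}
```

A payload that starts with a single 0 byte, like the one reported in this issue, returns NULL here regardless of what follows it in memory.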
I've been receiving packets from rbc_mesh_event_get (RBC_MESH_EVENT_TYPE_CONFLICTING_VAL) with a valid register number, but an invalid data length (249) and an unknown BLE source address. This usually occurs with a high volume of traffic on the mesh.
To illustrate this, see the following packet (captured in vh_rx):
Call stack: ![image](https://cloud.githubusercontent.com/assets/11619441/21457321/a9ac5f9c-c92e-11e6-81d9-33c5e3c0a962.png)
The BLE address (c7 ca 45 65 2a 74) is unknown in my test setup. It is always the same, even though my test setup has about 12 nodes. Apparently, this packet gets through all the error checks in the mesh code and ends up in the event queue with a data length of 249 because adv_data_length here is 0 and later the length is calculated like this in vh_rx:
Unfortunately, I haven't been able to trace this condition all the way back to radio_control (yet) to see whether this comes out of the radio like this or whether it is memory corruption somewhere, because I'm not sure how to break the code there on this condition. However, because I can reproduce this easily and the packet index differs every time, I suspect it comes out of the radio like this.
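As a side note on the arithmetic: a data length of 249 is what an 8-bit unsigned subtraction produces when a fixed overhead of 7 bytes is subtracted from an adv_data_length of 0. The sketch below only demonstrates that wraparound. The overhead value of 7 is an assumption inferred from the observed 249 (256 - 7), and `example_payload_length` is a hypothetical helper, not the actual code in vh_rx.

```c
#include <stdint.h>

/* Assumed overhead in bytes, inferred from the observed length of 249. */
#define EXAMPLE_ADV_OVERHEAD 7u

/* Derives a payload length by subtracting the fixed overhead from the AD
 * length field. Because the result is stored in a uint8_t, any input below
 * the overhead wraps around modulo 256 instead of going negative. */
uint8_t example_payload_length(uint8_t adv_data_length)
{
    return (uint8_t)(adv_data_length - EXAMPLE_ADV_OVERHEAD);
}
```

Under these assumptions, `example_payload_length(0)` yields 249, while `example_payload_length(10)` yields 3.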
So, my questions are: What could this packet be? What's up with the mysterious BLE address? How come the mesh puts this packet, seemingly invalid due to the miscalculated length, into the event queue to be processed like nothing is wrong? How is the error checking done?