NordicSemiconductor / Android-nRF-Mesh-Library

The Bluetooth Mesh Provisioner and Configurator library.
https://www.nordicsemi.com/
BSD 3-Clause "New" or "Revised" License
414 stars 177 forks source link

MeshParserUtils.getOpCode() flips the bytes of the vendor id. #455

Closed FluffyBunniesTasteTheBest closed 1 year ago

FluffyBunniesTasteTheBest commented 2 years ago

Describe the bug In NCS, when defining a custom opcode, for example BT_MESH_MODEL_OP_3(0x01, 0x22ee), the result is a three byte opcode with value 0xc122ee. When the library's onMeshMessageReceived() gets called, VendorModelMessageStatus.getOpCode() returns an opcode of 0xc1ee22, in which the two bytes of the vendor or id (0x22ee) are flipped.

The bug happens in MeshParserUtils.getOpCode(), which is not aware of that Java and Bluetooth use different byte orders:

To Reproduce Steps to reproduce the behavior:

  1. Create a custom model in NCS (The chat example should work well).
  2. Send a VendorModelMessageAcked message to the custom model.
  3. When onMeshMessageReceived() is called, VendorModelMessageStatus.getOpCode() returns an opcode that has the two bytes of the vendor id flipped.

Expected behavior OpCode should not change.

Platform details:

Note: The way it looks, the convert-24-bit-methods in MeshParserUtils are facing the same issue.

roshanrajaratnam commented 2 years ago

Hi, you are correct java stores multibyte values in big endian format. However IMHO the 3 byte opcode for a vendor model should be depicted according to the specification. Therefore when you call VendorModelMessageStatus.getOpCode() should return the opcode as mentioned in the last paragraph of section 3.7.3.1 in the Mesh profile specification 1.0.1.

FluffyBunniesTasteTheBest commented 2 years ago

Thanks, can you please provide a link to the specs?

But even without looking into the specs, I doubt that a value of 0xc122ee is supposed to become 0xc1ee22. See, right now you have to use two different opcodes: 0xc122ee when sending and 0xc1ee22 when receiving, but shouldn't the opcode be the same, regardless if you send or receive data?!?

roshanrajaratnam commented 2 years ago

Here's the link to the spec. https://www.bluetooth.com/specifications/specs/mesh-profile-1-0-1/

The behavior I mentioned is what's expected due to the company identifier being a 16-bit value where the Endianness gets applied. When sending/receiving also the opcode is set to 0xc1ee22 internally just to save the hassle of doing it yourself. You can log the access pdu to check this.

But you do have a point here, I will fix this for consistency purposes so that you could get the OpCode and the ModelIdentifier separately for status messages.

FluffyBunniesTasteTheBest commented 2 years ago

@roshanrajaratnam Thanks for looking into this. Allow me to explain in a little more detail:

From section 3.7.3.1 of the specs:

For example, when the manufacturer-specific opcode is equal to 0x23 and the company identifier is equal to 0x0136 [6], then the 3-octet opcode is equal to 0xE3 0x36 0x01.

(Please note: Only the 16-bit vendor id is little endian).

Now, when this buffer (containing 0xE3 0x36 0x01) gets handled in MeshParserUtils.getOpCode(), the following happens:

return unsignedByteToInt(accessPayload[1]) << 8
    | unsignedByteToInt(accessPayload[0]) << 16
    | unsignedByteToInt(accessPayload[2]);

It just copies the buffer into an integer using the same byte order as the buffer, causing the two bytes of the company id to get flipped because of the difference in endian-ness.

Ironically enough, MeshParserUtilsTest compensates for the bug:

 public void testGetOpCode_vendorOpCode() {
        byte[] payload = MeshParserUtils.toByteArray("C1F105");;

        int calculatedOpCode = getOpCode(payload, 3);

        assertEquals(0xC1F105, calculatedOpCode);
    }

(Note: assertEquals(0xC1F105, calculatedOpCode); should test for value 0xc105f1)

From a general point of view, the problem is that the byte order between the sending and the receiving end is different. When a buffer is parsed without taking the difference in byte order into account, multi byte values inside the buffer will be corrupted. The only way to fix this is by properly parsing the data that comes in/goes out.

roshanrajaratnam commented 2 years ago

So basically when both sending and receiving the order should be as follows 0xE3 0x36 0x01 and MeshParserUtils.getOpCode() will now return the opcode on its own without the company identifier in place. It's not sent properly the vendor model message will fail and will not get a respond which does not happen. The issue was that I was simply returning the full 3-octet opcode instead of the message opcode itself. Also the previously returned opcode maintained the correct endianness as it should.

I hope the latest fix works for you.

FluffyBunniesTasteTheBest commented 2 years ago

So basically when both sending and receiving the order should be as follows 0xE3 0x36 0x01

Yes, because this matches section 3.7.3.1 in the Mesh profile specification 1.0.1

MeshParserUtils.getOpCode() will now return the opcode on its own without the company identifier in place.

Please not, that would be wrong. The opcode must contain the company identifier, otherwise we're having difficulties matching a reply to a (vendor) model.

Let's ignore the implementation details for a moment and take a look at it from a simpler angle: Let's imagine the firmware uses an opcode with a decimal value of 14.890.497. You would expect that MeshParserUtils.getOpCode() return the same value, but right now, that's not the case.

It's not sent properly the vendor model message will fail and will not get a respond which does not happen.

Sorry, not talking about sending to the firmware, talking about receiving the reply. But in general, the byte order must be considered when exchanging data between systems with different endian-ness.