Closed ghost closed 1 year ago
Hi Rom4in thanks for the detailed report. I will try to take a look at this to see where the issue might be. In the mean time could you check if you are having the same problem on version 2.4.1?
Not sure for 2.4.1, but we are using 2.3.0 on our mainline app. And yes, no problems on this version
I tried to make code comparison between 3.1.5 and 2.3.0, but too much changes 😅 And I don't really know where to look in the codebase
Hello @roshanrajaratnam :) any updates ?
Hey sorry, been busy with some other tasks. Unfortunately i won't be back to work until Tuesday due to national holidays in Norway. I'll try to take a look at this next week!
Get Outlook for Androidhttps://aka.ms/AAb9ysg
From: R0m4in-dooz @.> Sent: Friday, May 14, 2021 12:27:58 PM To: NordicSemiconductor/Android-nRF-Mesh-Library @.> Cc: Rajaratnam, Roshan @.>; Mention @.> Subject: Re: [NordicSemiconductor/Android-nRF-Mesh-Library] Segmentation feedback of acknowledged msg is mostly failing when sent to non-proxy device (#426)
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FNordicSemiconductor%2FAndroid-nRF-Mesh-Library%2Fissues%2F426%23issuecomment-841158540&data=04%7C01%7Croshan.rajaratnam%40nordicsemi.no%7Cdb2f0299f99b4136b12e08d916c2f292%7C28e5afa2bf6f419a8cf6b31c6e9e5e8d%7C0%7C0%7C637565848823132309%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=k9v5vR%2Bq7ainPfdlb6EzudIOZYCYQsT9Qz7%2BjiKLRd0%3D&reserved=0, or unsubscribehttps://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FACCK6DOVN7VECYUVSXCTPW3TNT3K5ANCNFSM44KKHPZQ&data=04%7C01%7Croshan.rajaratnam%40nordicsemi.no%7Cdb2f0299f99b4136b12e08d916c2f292%7C28e5afa2bf6f419a8cf6b31c6e9e5e8d%7C0%7C0%7C637565848823132309%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=Ki%2F%2FGHMT7DDAG2twe4x4uGdbtdYZbKcSDTCJkDz%2FKw8%3D&reserved=0.
@R0m4in-dooz I have had some training programs yesterday and today so I have not had time yet to look in to your issue. I will try my best to have a look at this during this week.
Thanks @roshanrajaratnam, if I may help, with other logs or giving you our custom msg, feel free to ask :)
Could you also confirm if this is happening with the nRF SDK examples as well?
If you mean with the nrf example app, yes it has the same behavior: sometimes it's ok, sometimes it's ok after a retransmission, sometimes it never constructs the feedback
yes example app with the nRF SDK Mesh light/switch examples?
🤔 i only use our firmwares, based on nrf mesh sdk 4.0
And I reproduced with publication set msg, with provisioned nodes, no configuration
Hi @R0m4in-dooz finally found sometime to look at your issue. However looking at your logs and also after some debugging, I notice that "Message reassembly may not be completed yet!" is logged only when the message is not completed or when the reassembly might be in progress. If you notice when this is logged not all expected segments are received.
Yeh, sometimes I expect this log but:
in compositiondataget_KO.log --> after receiving the 2nd segment which was missing, I indeed get the status, but after that I again receive something. Can you explain ?
V/NetworkLayer(19824): TTL for received message: 99
V/NetworkLayer(19824): Dst: 0x0001
V/NetworkLayer(19824): Received a segmented access message from: 0002
V/LowerTransportLayer(19824): SEG O: 2
V/LowerTransportLayer(19824): SEG N: 5
V/LowerTransportLayer(19824): Last SeqAuth value 511
V/LowerTransportLayer(19824): Current SeqAuth value 511
V/LowerTransportLayer(19824): Received segment message count: 6
V/BlockAcknowledgementMessage(19824): Block ack value: 3f
V/BlockAcknowledgementMessage(19824): Block ack: 63
V/BlockAcknowledgementMessage(19824): bit count: 5
V/LowerTransportLayer(19824): Block acknowledgement payload: 07FC0000003F
V/LowerTransportLayer(19824): Creating unsegmented transport control
V/LowerTransportLayer(19824): Unsegmented Lower transport control PDU 0007FC0000003F
V/NetworkLayer(19824): Encryption key: 531F92A8F77688718BB6855F2E35F772
V/NetworkLayer(19824): Privacy key: 73F0C6B5D7B3A2FB34F31A738AF2AD15
V/NetworkLayer(19824): Sequence Number: 0001DF
V/NetworkLayer(19824): Encrypted Network payload: 186235E82A8E46B432454AB4044B4B8822
V/MeshMessageState(19824): Sending acknowledgement: 0019F6A7E69E6BAA186235E82A8E46B432454AB4044B4B8822
D/DoozMeshStatusCallbacks(19824): onBlockAcknowledgementProcessed
V/AccessLayer(19824): Received Access PDU 02005900000000002800070000000400000002000210010000000400021000100310011000000400021000100310011000000300021003100110000001000210
V/ConfigCompositionDataStatus(19824):
[...]
D/fr.dooz.nordic_nrf_mesh.DoozMeshManagerCallbacks(19824): onNetworkUpdated
D/DoozMeshStatusCallbacks(19824): onMeshMessageReceived
D/DoozMeshStatusCallbacks(19824): received a ConfigCompositionDataStatus
V/MeshManagerApi(19824): MeshNetwork pdu sent: 0x0019F6A7E69E6BAA186235E82A8E46B432454AB4044B4B8822
V/MeshManagerApi(19824): Received network pdu: 0x00198C03B0B32F69B8E2E2D9ADC72551ED3F4D82ABC77139EB
V/BaseMeshMessageHandler(19824): TTL for received message: 4
V/BaseMeshMessageHandler(19824): Sequence number of received access message: 479
V/NetworkLayer(19824): TTL for received message: 4
V/NetworkLayer(19824): Received a control message that was not directed to us, so we drop it
V/DefaultNoOperationMessageState(19824): Message reassembly may not be completed yet!
is it the answer from onBlockAcknowledgementProcessed
? then why it says not directed to us + msg not completed yet ?
in compositiondataget_KO_2.log --> i never have my status even if all segments have been logged at some point (0,1,3,4,5 and then 1,2,3,5)
in configmodelpublicationset_KO.log --> I have my status after this is logged : seg 0, seg 1, seg 1, seg 0 and still after the Status is constructed i have the same as compositiondataget_KO.log
in configmodelpublicationset_KO_2.log --> I have my status after this is logged : seg 1, seg 0, seg 0, seg 0 seg 1 and still after the Status is constructed i have the same as compositiondataget_KO.log
in configmodelpublicationset_full_KO.log --> i never have my status even if seg 0 and seg 1 are logged and in the end it says
V/MeshMessageState(23796): Sending acknowledgement: 00194768EF39053AB8E96F3BD1CF16713C329C54EA78F30997
D/DoozMeshStatusCallbacks(23796): onBlockAcknowledgementProcessed
V/MeshManagerApi(23796): Received network pdu: 0x00193237EADEAC3D2F3243ED98B1A236B06FF03963AF866DCB
V/BaseMeshMessageHandler(23796): TTL for received message: 4
V/BaseMeshMessageHandler(23796): Sequence number of received access message: 105
V/NetworkLayer(23796): TTL for received message: 4
I/flutter (23796): [NordicReport] received onDataReceived
V/DefaultNoOperationMessageState(23796): Acknowledgement payload: 00000003
V/BlockAcknowledgementMessage(23796): Segment 0 of 1 received by peer
V/BlockAcknowledgementMessage(23796): Segment 1 of 1 received by peer
D/DoozMeshStatusCallbacks(23796): onBlockAcknowledgementReceived
V/MeshManagerApi(23796): MeshNetwork pdu sent: 0x00194768EF39053AB8E96F3BD1CF16713C329C54EA78F30997
V/MeshManagerApi(23796): Received network pdu: 0x001991483AB5612B2FB31BE4E04AE0FD1E1B7FA0AA17BF6647
V/BaseMeshMessageHandler(23796): TTL for received message: 4
V/BaseMeshMessageHandler(23796): Sequence number of received access message: 106
V/NetworkLayer(23796): TTL for received message: 4
V/DefaultNoOperationMessageState(23796): Acknowledgement payload: 00000003
V/BlockAcknowledgementMessage(23796): Segment 0 of 1 received by peer
V/BlockAcknowledgementMessage(23796): Segment 1 of 1 received by peer
D/DoozMeshStatusCallbacks(23796): onBlockAcknowledgementReceived
logged all segments received by peer so why no Status ?
I notice that there are a lot of
E/DefaultNoOperationMessageState(23796): Decryption failed in NetworkLayer : mac check in CCM failed
D/DoozMeshStatusCallbacks(23796): onMessageDecryptionFailed
maybe it's the cause of the issue ?
In a mesh network you may receive messages that are not directed to you as they are being relayed, in the network layer we could find out if it was directed to us. There could be instances where you can receive such messages hence we drop it. So the segments you are referring are the once received by the device.
The decryption failure is not something I am able produce using our sample app at all. Have you done any changes? if the decryption fails yes the LowerTransportLayer will timeout eventually.
In a mesh network you may receive messages that are not directed to you as they are being relayed, in the network layer we could find out if it was directed to us. There could be instances where you can receive such messages hence we drop it. So the segments you are referring are the once received by the device.
Ok makes sense :)
The decryption failure is not something I am able produce using our sample app at all. Have you done any changes? if the decryption fails yes the LowerTransportLayer will timeout eventually.
I will try again with sample app and give you those logs if I have it again 👍🏼 but we only added our custom ApplicationMessage on top of 3.1.5 + new build flavor which fallback to your definition (specific flutter bug fix). So you don't have any issue to have ConfigModelPublicationSetStatus when it comes from non-proxy device ?
What about composition data log KO_2 and publication set log full_KO ? You never have this behavior ?
ok sounds good. I do not see any issues relating to CompositionDataGet or ConfigModelPublicationSet message sent via proxy or to proxy. Let me know how this goes for you.
Hello there. Sorry I didn't have time to look into this issue as we shipped force proxy as a temporary workaround on prod and I was busy with some other priorities! We may find some time to look for it. But first we will try out the new release 🕺
Hello ! After upgrading to latest release, we still have issues on segmented answers. It is behaving better tho ! I don't have anymore issues on CompositionDataGet, nor on our custom 2 segments ApplicationMessage. However, for ConfigModelPublicationSet, I still experience troubles to construct Status from segments. I think it has changed behavior and now, I always have these two logs when it fails:
D/DoozMeshStatusCallbacks(13381): onBlockAcknowledgementProcessed
E/DefaultNoOperationMessageState(13381): Decryption failed in NetworkLayer : mac check in CCM failed
D/DoozMeshStatusCallbacks(13381): onMessageDecryptionFailed
Even if later it has these two logs:
V/BlockAcknowledgementMessage(13381): Segment 0 of 1 received by peer
V/BlockAcknowledgementMessage(13381): Segment 1 of 1 received by peer
The success rate has increase so it is better, but for pubs it is still not enough and I can't revert our force proxy code workaround. I feel like our FIFO with auto-retry is messing with your "segment queue", because often, it is constructing the Status some seconds after our retries are done (3x publication set) when target device resend 1 of the segments.
It is still rly better, so thank for your work on low layers :)
hey again, no problem, does this happen on a custom firmware, could you share it? I am not able to reproduce this on our sdk.
I will ask our embedded team
@roshanrajaratnam is roshan.rajaratnam@nordicsemi.no a valid email to receive a Google Drive share link ?
android-devs@nordicsemi.no
I just sent the .hex file via Google Drive share
Thanks I just got it, I will take a look at it later during the day. What dk is this built for ?
This one :)
Hello ! Any update on the subject?
Hi, we're all on vacations right now. Please expect some delays in responses in July.
Hi there ! I see you are closing issues due to inactivity. This issue is still there, and I'm still awaiting support from you guys. I provided a .hex file in beginning of summer and still no feedback but this issue is labelled "awaiting user input" ?? Please notify me if you find any time to try to reproduce the issue with our firmware 😁
Hi, it was labelled awaiting user input long time ago before we got a response from you. I have been busy with some other projects. I will look in to yours ;)
@R0m4in-dooz could you share your fw again please? Can't seem to find it anymore!
indeed!
sent 👍
Hi couple of things here, I find the firmware a bit flaky. Sometimes it provisions and sometimes it doesn't if I disconnect after completing provisioning before going in to configuration. I see this happening only with the debug firmware that you had sent.
Also I noticed a decryption failure that you have pointed out earlier in our conversation which only seem to happen with your firmware. I have done multiple tests with both our example sdk and the debug firmware you sent and I don't see this happening against the Nordic sdk examples.
Although I notice there is an issue with the proxy node not sending the response/status to acknowledged messages. According to what I see on logs on my end, I get block acks that all segments are received and it stops there and I don't seem to receive the response over GATT via the proxy node. So this could be related to the issue you are facing. I would recommend creating a ticket on DevZone so that someone from the technical support could provide support on this case.
@R0m4in-dooz there was a fix made couple of days ago on #458, but this applies to all segmented messages relating to the sequence number getting reset. The decryption failure in your case has still been the same as pointed out earlier in our conversations. Please test it on your end as well just see if we are on the same page!
Hello ! Thank you for your time @roshanrajaratnam !
I find the firmware a bit flaky
It may be possible that the FW I sent is buggy, we made numerous updates since then 😅 (I sent you firmware from July)
Also I noticed a decryption failure that you have pointed out earlier in our conversation which only seem to happen with your firmware. I have done multiple tests with both our example sdk and the debug firmware you sent and I don't see this happening against the Nordic sdk examples.
🤔 ok then, if it is only with our firmware I will tell our embedded team to investigate
Although I notice there is an issue with the proxy node not sending the response/status to acknowledged messages.
Does this means that you indeed see a bug ? But you think it's only with our custom firmware ?
I would recommend creating a ticket on DevZone so that someone from the technical support could provide support on this case.
We recently upgraded the Mesh SDK and are in the process of validation now. I will test against these new firmwares and if still there I can indeed think to post a ticket in DevZone...
So for you, nothing is wrong with segmented messages handling in the current library version ? We are on 3.1.6 official release, maybe the dev branch is correcting some issues ? I should check commit history...
Please test it on your end as well just see if we are on the same page!
Sure, I'll try to find some time to make new tests against our updated firmwares and against dev
branch ASAP. Thanks again, I'll let you know how it goes 👍
@R0m4in-dooz was your embedded team able to resolve the issue of the proxy node not sending response messages? I am having the same issue using your flutter plugin for my mobile app and esp32's for nodes.
Hello @bslisowski ,
Please use our repository to post issues ! Well, unfortunately I am slowly forgotting our adventures on mesh network..if I remember correctly, no we never fixed that bug and could not figure out if it was a firmware problem or a mobile app problem.
Hi @roshanrajaratnam, i guess you can close this one, it was probably a problem on our side 👍
I was going to post an issue on your repository, but I don't think your plugin is the problem since it still persists when using the latest version of the nRF mesh app.
@bslisowski I would recommend creating an issue on d so that our tech support should be able to help you on this issue as this seems to be more or less related to the firmware!
I believe I found the issue - the placement of the bluetooth antenna on our PCB is interfering with the transmission. I was able to increase the transmission power and the problem is almost completely gone.
Awesome, I'll close this issue.
Describe the bug As title states, when sending segmented acknowledged messages to non-proxy devices, it is too often failing to construct back the Status.
To Reproduce Steps to reproduce the behavior:
CompositionDataGet
,ConfigModelPublicationSet
, or any msg that would result in segmented answer to the device that is not the proxyExpected behavior The library is able to construct the Status as soon as all feedback segments has been received (eg. SEG 0 == SEG N), and the non-proxy device should not have to retransmit at this point (?).
Platform details:
Logs / Screenshots I spent some time to build enough material for you. I hope it will be enough to spot any bug, or give us hint on why this behavior.
CompositionDataGet
: compositiondataget_OK.log compositiondataget_KO.log compositiondataget_KO_2.logConfigModelPublicationSet
: configmodelpublicationset_OK.log configmodelpublicationset_KO.log configmodelpublicationset_KO_2.log configmodelpublicationset_full_KO.logCustom
ApplicationMessage
akaMagicLevelSet
: custom_ApplicationMessage_OK.log custom_ApplicationMessage_KO.log custom_ApplicationMessage_KO_2.log custom_ApplicationMessage_full_KO.loglog files suffixed with
OK
--> expected behavior log files suffixed withKO
--> unexpected behavior: all segments received from device but no Status is being constructed. But we have it once the device is retransmitting some segments _log files suffixed withfull_KO
_ --> unexpected behavior: all segments received from device but no Status is being constructed. The device seems to have received every acknowledgements as it is not retransmittingUnfortunately, this behavior is even more problematic in our commercial application as we have implemented a FIFO for such msg with auto-retry mechanism. Meaning this kind of msg is in a FIFO, when sending a cmd, we wait for feedback and retry every 2.5 sec (retried 2 times). The success rate is below 15% because of this problem and it makes the app unusable for a lot of our features. If we increase the retry timer to 5-6sec, the success rate raises to ~60%, but it is still not enough, and is not acceptable for our customers. We may force proxy connection to raise success rate to 95%, but it is not acceptable for most features that need these segmented feedbacks, and moreover it's blocking the power of mesh n/w.