Closed danak6jq closed 2 years ago
These code parts are taken directly from Semtechs LoRaMAC stack V1.0.2, I did not change things there because I do not know much about it and why and how some things are handled there.
I had a look at Semtech reference LoRaMac v1.0.2 and v1.0.3 and those releases both go ahead and send an empty frame to flush MAC commands that make fOpts too large - I believe this must have been added by RAK then:
// cannot send return LMH_ERROR;
It has a different behavior, as described above.
In fact, it doesn't appear this behavior has changed in the reference since it was committed:
https://github.com/Lora-net/LoRaMac-node/blame/master/src/apps/LoRaMac/common/LmHandler/LmHandler.c
It does present a bit of a challenge for an application that assumes OK means the payload was sent. If an application wants to do re-send of the payload, it needs to know that an empty was sent rather than just "OK", and this is the change apparently made by RAK. It does mean that the app needs to send the empty payload itself, until it can successfully send the desired payload (sending an empty payload can still get MAC commands in return, so it can take more than one try)
Additional testing suggests that this LoRaWAN stack doesn't send the MAC commands when sending an empty, but does flush them. Will investigate further; there's clearly something amiss here.
The scenario is US915 DR0, with 2x LinkAddrAns MAC commands and 1x DevStatusAns MAC command were received on the last DL. An attempt to send an UL fails because the combined fOpts + payload are too large. This returns an error to the application code, which then sends an empty payload intending to send up the MAC commands.
However, this empty payload UL has no commands in fOpts or in the payload. Apparently the outstanding MAC commands have been discarded by the time I send the empty frame to flush them to the LNS.
Retrying the original UL succeeds, but the MAC commands are never sent.
I suspect this code in LoRaMacQueryTxPossible() is executed on the empty-frame send:
{
txInfo->MaxPossiblePayload = txInfo->CurrentPayloadSize;
// The fOpts don't fit into the maximum payload. Omit the MAC commands to
// ensure that another uplink is possible.
fOptLen = 0;
MacCommandsBufferIndex = 0;
MacCommandsBufferToRepeatIndex = 0;
}
which suggests MacCommandsBufferIndex + MacCommandsBufferToRepeatIndex is larger than the max payload size (11).
This code then returns OK:
// Verify if the fOpts and the payload fit into the maximum payload
if (ValidatePayloadLength(size, datarate, fOptLen) == false)
{
LOG_LIB("LM", "LoRaMacQueryTxPossible -> ValidatePayloadLength failed size = %d DR = %d", size, datarate);
return LORAMAC_STATUS_LENGTH_ERROR;
}
return LORAMAC_STATUS_OK;
because size and fOptlen are both zero. The MAC commands are discarded.
(I don't know how the call to ValidatePayloadLength() is ever valid, because fOptlen is always 0 at this call and neither of txInfo->MaxPossiblePayload or txInfo->CurrentPayloadSize is used, and size is just the application payload size. )
It doesn't seem like fOptLen here should be larger than the max payload size for DR0, but it is a sum of two different MAC command buffers, possibly there's duplication there.
Something is clearly not right here, and this code bears little similarity to the Semtech reference for 1.0.3
I strongly believe at no point should LoRaMacQueryTxPossible() discard MAC commands; subsequent ULs should send as much of the stored MAC commands as possible, which seems to be what the reference code does.
In more detail. Here's the first UL from a mote after JOIN:
Which the LNS responds to with:
Note there are 2 LinkAdrReqs (to set the channel mask and DR) and a DevStatusReq.
Two hours later, the mote wants to do a confirmed UL, but lmh_send() returns an error because payload + fOpts is too large, so the mote sends an empty payload UL (note that it's at DR0 as the earlier AdrReq commanded):
However, this UL contains no MAC commands, either in fOpts or in the payload (as expected). We expect this frame to have the DevStatusAns and Ack for LinkAdrReq, but they are missing. So the LNS not surprisingly repeats the MAC commands (DevStatuReq is requested once per hour for testing this scenario):
The mote attempts to retry the confirmed UL, but again lmh_send() fails because the LNS sent down the three MAC commands again, the combination of which is again too large, so another empty UL is sent, again without MAC commands:
LNS responds with yet another AdrReq commands, because none of them have been Ack'ed yet:
Because the DevStatusReq is absent, this time the retry of the Confirmed UL succeeds:
Which contains Acks for the LinkAdrReqs. Progress!
Note: I suspect the LoRaWAN stack can send only one LinkAdrAck instead of the two, which might avoid this issue in the first place.
The LoRaWAN stack is discarding MAC responses when there's payload size issue - it appears it may actually happen when the empty payload is sent.
So do you have actually a problem or are you just analyzing the LoRaWAN stack?
As if I take pleasure in analyzing the LoRaWAN stack.
Yes, I have a problem, though I suppose now that I understand it, I can work-around it for the most part. This probably ought to be documented somewhere.
However, motes discarding MAC commands means that gateways will unnecessarily repeat DLs and increase time spent in DLs (transmitting, not listening, and thus not listening for ULs). Given this is most likely to happen in DR0 just increases the amount of time that GWs will be transmitting and not receiving.
So this is a general issue and appears to be in LoRaMac code provided by RAK. Feel free to prioritize it as you see fit.
Not sure where you get all of this. But I think you are wrong in one point. If the payload is too big for the DR, there is no 0 packet sent.
if (LoRaMacQueryTxPossible(app_data->buffsize, &txInfo) != LORAMAC_STATUS_OK)
{
LOG_LIB("LMH", "lmh_send -> LoRaMacQueryTxPossible failed");
// Send empty frame in order to flush MAC commands
mcpsReq.Type = MCPS_UNCONFIRMED;
mcpsReq.Req.Unconfirmed.fBuffer = NULL;
mcpsReq.Req.Unconfirmed.fBufferSize = 0;
mcpsReq.Req.Unconfirmed.Datarate = m_param.tx_data_rate;
// cannot send
return LMH_ERROR;
}
returns with an error without initiating the sending. So there is no 0 length packet sent.
Not sure where you get all of this. But I think you are wrong in one point. If the payload is too big for the DR, there is no 0 packet sent.
Of course there is a 0 packet sent - see the ChirpStack logs above captured above.
if (LoRaMacQueryTxPossible(app_data->buffsize, &txInfo) != LORAMAC_STATUS_OK) { LOG_LIB("LMH", "lmh_send -> LoRaMacQueryTxPossible failed"); // Send empty frame in order to flush MAC commands mcpsReq.Type = MCPS_UNCONFIRMED; mcpsReq.Req.Unconfirmed.fBuffer = NULL; mcpsReq.Req.Unconfirmed.fBufferSize = 0; mcpsReq.Req.Unconfirmed.Datarate = m_param.tx_data_rate; // cannot send return LMH_ERROR; }
returns with an error without initiating the sending. So there is no 0 length packet sent.
Once I realized the too-big frame results in the error return you show here with no frame being sent, I added the attempted recovery to the application (explicitly send a 0 packet in hopes that will flush out the MAC commands).
Explicitly sending a 0 frame does send a 0 frame - but it discards the MAC commands before doing so. That's surely a bug - the reference stack, at least the 1.0.3 version, implicitly sends the 0 frame with the MAC commands intact, but returns no error from the send so the application doesn't appear to have any idea the intended payload was not sent. That bit of code above there that returns LMH_ERROR appears to an RAK change to the Semtech code.
Since I've already deployed a few dozen RAK4631-based motes and have 5 dozen more in the shop, perhaps I should contact someone at RAK - any suggestion who?
Any update on what a professional/volume deployment of RAK Wireless hardware should be using? Is RAK3172 the suggested alternative for actual product deployment?
inquiry@rakwireless.com brings you in contact with the sales team.
From LoRaMacHelper.cpp: lmh_send() :
This digresses from Semtech reference behavior, which would go ahead and send the empty frame to flush the MAC commands that make the frame too large to send (assuming fOpts isn't too large later on).
I believe I understand the issue here: the reference stack will return OK, so the calling code will believe it has sent the payload, while the stack will send an empty frame. The calling code has no idea.
My question: this change was intentional so calling code could see the error, attempt to send an empty frame, then send the desired payload?