TheThingsNetwork / lorawan-stack

The Things Stack, an Open Source LoRaWAN Network Server
https://www.thethingsindustries.com/stack/
Apache License 2.0
936 stars 302 forks source link

Send Tx events upstream #76

Closed johanstokking closed 3 years ago

johanstokking commented 5 years ago

Summary:

GS can report TX acknowledgement of a specific ttnpb.DownlinkMessage to the NS, for the purpose of improved observability.

What is already there? What do you see now?

TX acknowledgement now used to signal presence of JIT queue, nothing else.

What is missing? What do you want to see?

johanstokking commented 3 years ago

@neoaggelos @rvolosatovs I updated the original comment based on progressive insight and what we already implemented in the meantime.

neoaggelos commented 3 years ago

After looking a bit into this today, some considerations:

neoaggelos commented 3 years ago

Some initial work is done in https://github.com/TheThingsNetwork/lorawan-stack/tree/issue/76-send-txevents-upstream

johanstokking commented 3 years ago
  • Downlinks may be scheduled by the NS, or the PBA. Is there any use in forwarding TxAcknowledgements for downlinks scheduled by the PBA?

At the moment, no.

In any case, how can the GS recognize who scheduled the downlink? A simple solution I can think of is to extend the *ttnpb.DownlinkMessage with a ReportTxAcknowledgement boolean field, and have the NS set that to true when scheduling.

I don't understand, please clarify.

It is not mandatory for GS to report the TX acknowledgment back to NS. NS wouldn't also wait for it, it's just informative.

  • Currently, io.DownlinkTokens and *ttnpb.TxAcknowledgement only store the correlation IDs of the downlink message. I gather that they should be extended to include the *ttnpb.DownlinkMessage that was scheduled, right? In order to avoid sending the correlation IDs twice (as CorrelationIDs and DownlinkMessage.CorrelationIDs fields, can we deprecate the former since we most likely bumping minor for the API change anyway?

Yeah, I agree. However, ttnpb.TxAcknowledgment is gateway facing, and this breaks rule number 1 of the compatibility commitment, so you cannot change that API like you did in the feature branch. Also, gRPC and MQTT V3 gateways do not use downlink tokens like UDP and LBS, hence the correlation IDs are necessary to relate the TX ack to the downlink message. We cannot expect these gateways to the entire downlink message back.

So my suggestion would be to:

Note that we write "acknowledgment" (without "e"), as we use US English spelling throughout the codebase.

neoaggelos commented 3 years ago

In any case, how can the GS recognize who scheduled the downlink? A simple solution I can think of is to extend the *ttnpb.DownlinkMessage with a ReportTxAcknowledgement boolean field, and have the NS set that to true when scheduling.

I don't understand, please clarify.

So, GS receives a scheduling request, downlink is sent, and TxAck for that is received by the gateway. How does the GS know who scheduled the downlink (NS or PBA)? Is it enough to retrieve the Network Server peer with the device identifiers, and simply ignore the errors if no peer is found?

johanstokking commented 3 years ago

Ah, right. I don't think this belongs in DownlinkMessage but you might want to save state in GS, with the downlink token.

Even if we had a ReportTxAcknowledgment, and we might add support for this in PBA (which is not unlikely), we have a problem. We shouldn't make this too implicit like "PBA never sets TxAcknowledgment so it must be reported to NS"

neoaggelos commented 3 years ago

OK, I think the proposed solution here is not sufficient for NS to also handle #56, since NS omits fields from the DownlinkMessage that is sent to the Gateway Server (e.g. end device identifiers).

So, I think it is best to keep all this information on the NS side instead: Have NS store scheduled downlinks temporarily. When GS reports a TxAcknowledgment, NS checks if there is a matching scheduled downlink. If there is no match, skip (maybe print a warning). Otherwise, retrieve the complete downlink message, merge the TxSettings reported by the GS and then take care of forwarding to the AS.

As far as the list of scheduled downlinks stored by the NS:

This also solves the problem of figuring out the source that scheduled the downlink, we can simply rely on the GS forwarding table.

rvolosatovs commented 3 years ago
* We can match using (a) the correlation IDs (b) an atomically increasing counter (stored in Redis) or (c) a hash generated from the message. I prefer (b), (c) or something equivalent.

Can you clarify how such a counter would work? How is it communicated from NS to GS and to NS from GS? I'd much rather see if we can use the available information before adding new things. Overall payload hash combined with correlation IDs seems to be sufficient to reliably match the downlink.

neoaggelos commented 3 years ago

Closed with #3773