adriansmares closed this issue 3 years ago.
> - During the creation of application packages frontend, create a subscription for each package.
> [...]
> The advantage of this approach is that one application package cannot make others miss traffic. It also parallelizes the processing.

Then there should be parallelization in the package too, right? Otherwise one application performing long DMS calls can still stall other applications using DMS too.

> - We fan out the uplink to each application package subscription.

If the fanout is a channel send, is that waited upon with a timeout? And are all channel sends performed in parallel and awaited, to avoid a goroutine blow-up?
> Then there should be parallelization in the package too, right? Otherwise one application performing long DMS calls can still stall other applications using DMS too.

We can do that indeed. The reason I didn't mention it is that packages can modify the state of the device/application, which would mean they race by default. Nonetheless, we don't have any packages that do this, so it should be fine.

We can also make the distribution stable (i.e. have a subscription per worker, and always hash the same end device UID to the same worker), as sketched below.
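A minimal sketch of such a stable distribution, assuming a plain FNV hash over the device UID; the UID strings and worker-pool shape are illustrative, not the actual implementation:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// workerIndex hashes an end device UID to a fixed worker index, so that
// uplinks of the same device are always dispatched to the same worker.
func workerIndex(devUID string, workers int) int {
	h := fnv.New32a()
	h.Write([]byte(devUID)) // Write on an FNV hash never returns an error
	return int(h.Sum32() % uint32(workers))
}

func main() {
	const workers = 4
	for _, uid := range []string{"app-1.dev-1", "app-1.dev-2", "app-2.dev-1"} {
		fmt.Printf("%s -> worker %d\n", uid, workerIndex(uid, workers))
	}
}
```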
> If the fanout is a channel send, is that waited upon with a timeout? And are all channel sends performed in parallel and awaited, to avoid a goroutine blow-up?

`PublishUp` for `io.Subscription` does not wait: if the channel is full, it just returns an error. So pushing a message always takes constant time.
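For illustration, a non-blocking publish over a buffered channel behaves roughly like this; the `subscription` type and error below are stand-ins, not the actual `io.Subscription` code:

```go
package main

import (
	"errors"
	"fmt"
)

var errBufferFull = errors.New("buffer full")

// subscription is a stand-in for io.Subscription: a buffered channel of uplinks.
type subscription struct {
	upCh chan string // the real channel would carry contextual uplink messages
}

// Publish enqueues without waiting: if the buffer is full, it returns an
// error immediately, so pushing a message is always constant time.
func (s *subscription) Publish(up string) error {
	select {
	case s.upCh <- up:
		return nil
	default:
		return errBufferFull
	}
}

func main() {
	s := &subscription{upCh: make(chan string, 1)}
	fmt.Println(s.Publish("up-1")) // <nil>
	fmt.Println(s.Publish("up-2")) // buffer full
}
```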
Summary
The current application packages traffic processing pipeline is, strictly speaking, serial, and this can lead to lost traffic on stalls.
Why do we need this?
In order to avoid losing traffic when certain packages stall the pipeline.
What is already there? What do you see now?
The current pipeline works as follows: the `applicationpackages` frontend receives all traffic through a single local subscription, from which the package handlers are invoked one after another.
What is missing? What do you want to see?
If a certain handler stalls (for example, one that performs HTTP requests which may time out during a network partition), the subscription channel fills up and stops accepting new uplinks. The consequence is that traffic is lost for all packages if a single one stalls.
Environment
`v3.11.1`, but the pipeline has been working like this since the beginning.
How do you propose to implement this?
I propose that the pipeline is parallelized in the following manner:
Initialization:
- During the creation of application packages frontend, create a subscription for each package.

Runtime:
- We fan out the uplink to each application package subscription. Each one remains a local `applicationpackages` subscription (as it was before); a sketch of the fan-out follows below.

The advantage of this approach is that one application package cannot make others miss traffic. It also parallelizes the processing.
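A rough sketch of the proposed fan-out, using plain strings and one buffered channel per package in place of the real subscription types:

```go
package main

import "fmt"

// fanOut forwards an uplink to the buffered channel of every package
// subscription. The sends never block: a stalled package only fills up
// its own buffer, while the other packages keep receiving traffic.
func fanOut(up string, subs map[string]chan string) {
	for name, ch := range subs {
		select {
		case ch <- up:
		default:
			fmt.Printf("buffer of %q is full, dropping uplink\n", name)
		}
	}
}

func main() {
	subs := map[string]chan string{
		"lora-cloud-dms": make(chan string, 1), // stalled: nothing consumes it
		"healthy":        make(chan string, 4),
	}
	fanOut("up-1", subs)
	fanOut("up-2", subs) // only the DMS buffer overflows; "healthy" still receives
}
```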
One detail that we have to take into account is that the packages themselves receive more than just the `io.ContextualApplicationUp` message; they also receive the current state of the association. My proposal would be to change `io.Subscription` to work with (i.e. to use in the channel) an uplink-message interface.
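The interface itself was a code block in the original issue and was not preserved in this export; a plausible reconstruction, assuming only what the surrounding text states, could look like:

```go
package main

// Stub standing in for io.ContextualApplicationUp.
type ContextualApplicationUp struct{}

// UplinkMessage is a hypothetical reconstruction of the proposed
// interface: anything carried by the subscription channel exposes
// the contextual uplink itself.
type UplinkMessage interface {
	ApplicationUp() *ContextualApplicationUp
}

// The existing message type can satisfy the interface by returning
// itself, which keeps the change to the other frontends minimal.
func (up *ContextualApplicationUp) ApplicationUp() *ContextualApplicationUp { return up }

// packageUplink is what the applicationpackages frontend could send
// instead: the same uplink plus the package association state.
type packageUplink struct {
	*ContextualApplicationUp
	state interface{} // hypothetical per-association package state
}

func main() {
	// Both message kinds can travel over the same channel type.
	var _ UplinkMessage = &ContextualApplicationUp{}
	var _ UplinkMessage = &packageUplink{}
}
```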
`io.ContextualApplicationUp` can implement this interface by default, making the required change minimal. But the `applicationpackages` frontend can use an extended message type that also contains the package state, to be delivered to each package handler.
How do you propose to test this?
Set the timeout to infinite in the LoRaCloud DMS `http.Client`, set the remote server to a slow loris, and start pumping traffic. If the `buffer_full` error occurs only for the LoRaCloud DMS frontend, while the other packages still process and receive their traffic, everything is fine.
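A sketch of how the slow-loris part could be stood up in Go; the handler and client below are stand-ins, not the actual DMS package code:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

func main() {
	// Slow-loris stand-in: accept the request, then never respond, so
	// every client call stalls for as long as the client is willing to wait.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		<-r.Context().Done() // hold the connection open until the client gives up
	}))
	defer srv.Close()

	// Point the client at the stalled server with an infinite timeout,
	// then pump traffic and watch which frontends report buffer_full.
	client := &http.Client{Timeout: 0} // 0 disables the timeout
	go func() {
		resp, err := client.Get(srv.URL)
		fmt.Println(resp, err) // never reached while the server stalls
	}()
	time.Sleep(100 * time.Millisecond)
	fmt.Println("request is stalled; only the DMS package buffer should fill up")
}
```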
Can you do this yourself and submit a Pull Request?
Yes. cc @johanstokking
This issue does not introduce breaking changes and may be delivered in a patch release as well. I'm just choosing `v3.12` here since I can't pick something like `v3.11.4` or `.5`.