Open notthetup opened 8 months ago
@ettersi tagging you since you have brought this up a few times.
Thanks for bringing this up, @notthetup!
For context, off the top of my head I remember this particular instance where message staleness was an issue.
In an older version of the SWIS UI, we used FileListChangeNtf
s to monitor whether a preview creation request has completed. I assume this must have worked fine at some point, but by the time I started getting involved in SWIS, whatever it was that cleared the FileListChangeNtf
queue before we started listening for preview completion must have been removed and instead we had lots of stale FileListChangeNtf
s which triggered the preview completion monitor prematurely.
Providing a convenient API for listening for notifications that came in only after the receive
was submitted would have made this bug somewhat easier to fix, but it wouldn't necessarily have prevented it because for that the original programmer would have had to be very careful to use exactly the right API and if they got it wrong then the feedback would have come in only after they had moved on to making other changes. Therefore, while I agree that there's some discussion to be had around triggering on stale messages, to me the much bigger issue is that the Fjåge message queue represents global mutable state.
Without any API change, one could easily filter based on sentAt
field, if the platforms can be assumed to be on the same hardware or time synchronized.
t0 = currentTimeMillis()
msg = receive({ msg -> msg.sentAt >= t0 })
platforms can be assumed to be on the same hardware or time-synchronized
I think that's not the case with the example that @ettersi talked about.
But this makes me think of a useful way to solve this to have a receivedAt
timestamp on a message which is set by the Gateway or Agent when the message is received?
I was thinking along the same lines...
We can add a message-specific flush:
function flush(gw::Gateway, ::Type{T}) where {T <: Message}
while true
if isnothing(receive(gw, T))
break
end
end
end
Whenever we're receiving a notification, we're already potentially stealing it from someone else, so it's probably fine to steal all of them while we're at it...
The various Agent/Gatway
receive
APIs have a default semantic, that if a message which matches the filter is already in the receive queue, then it's returned immediately.While this is a great default in light of all the async and co-operative multi-threading-based processing that is done in Fjåge, in certain situations it may lead to requiring lots of complicated code to be able to receive a new message that matches a given filter.
For example, a common use case could be to send a
Req
to an Agent and wait for aNtf
which the Agent sends out after some internal processing. Just doing aagentid.receive(MyNtf)
may give a stale notification which might have been in the receive queue.. Doing aagent.queue.flush()
(supported in some Gateways) is 1. not a standard and 2. can cause other behaviours or parts of the Agent to lose some messages.So the question is do we need some mechanism to simply tell the
receive
API call that we are only interested in the newer message received since we invoked the API.@ettersi tagging you since you have brought this up a few times.