Closed DMRobertson closed 2 years ago
I saw this pop up in my e-mails.
It doesn't make sense for someone to send a push notification to zero pushkeys but I guess someone has accidentally done that anyway.
It would probably be sensible to catch this case and either 200 it without doing anything (OK! I've done nothing as you requested) or 400 Bad Request it.
Technically the spec doesn't say that there should be at least 1 device, but the call doesn't make any sense if you don't specify a device, so either option seems reasonable to me (though perhaps 400 Bad Request is a clearer error for when an implementation gets it wrong and is trying to figure out why push doesn't work).
Assuming that the pushkey is part of the HTTP request, I agree that 400 "You messed up" is the right response here.
But having taken another look I'm not sure how we actually end up in this situation. My reading: for each device d in notif.devices
we check that exactly one pushkin exists matching its the devices's app_id
.
We then dispatch to the single device d
via dispatch_notification
and _dispatch_notification_unlimited
. By the time we're in the latter function, I think it should always be the case that the device
argument to _dispatch_notification_unlimited
should always satisfy device.app_id == self.name
. And so I think this list comprehension should never be empty?
If we are dispatching to just a single device, I don't understand why we are even looking at n.devices
in the first place (and I couldn't see other pushkins doing something similar). Can't we just use the single pushkey device.pushkey
, made into a list of length 1 if needbe?
By the time we're in the latter function, I think it should always be the case that the device argument to _dispatch_notification_unlimited should always satisfy device.app_id == self.name.
Hmm. Is this assumption correct? I suppose it's possible that the pushkin's name contained a *
, and so so the appid wasn't an exact match but a glob/regex match:
If we are dispatching to just a single device, I don't understand why we are even looking at
n.devices
in the first place (and I couldn't see other pushkins doing something similar). Can't we just use the single pushkeydevice.pushkey
, made into a list of length 1 if needbe?
I believe the protocol allows you to send multiple devices at once to the push server, but Synapse doesn't do this. Sygnal seems to support it (or try to, it probably doesn't get much testing since Synapse doesn't support it)!
I think you've spotted something inconsistent here.
For context, the GCM pushkin used to (still does?) pull out all devices that match its app ID, and it would deliver them to GCM in one go.
It looks like we have since introduced app ID globbing. (I was not aware that this had been added.) It looks like that may be the cause of this problem (because it interacts badly with the GCM pushkin's attempt to deliver all at once, assuming we use it (and somehow didn't notice that it doesn't work — ?).
In this particular case, the device's app ID is com.im.vector.app.android
, while the pushkin's app ID is im.vector.app.android
.
There are a few things that aren't quite right here:
find_pushkins
treats .
s in pushkin app IDs as wildcards instead of literal .
s. sygnal.utils.glob_to_regex
looks handy.find_pushkins
matches any substring of the device app ID, rather than the whole app ID
com.im.vector.app.android
app ID wouldn't work?
com.im.vector.app.android
app ID currently don't work anyway!com.im.vector.app.android
come from? It's im.vector.app.android
in the current Element Android app.GcmPushkin._dispatch_notification_unlimited
is not glob-aware, as @reivilibre points out. sygnal.utils.glob_to_regex
looks handy.GcmPushkin
tries to do a batch send from within a non-batch API seems quite hacky? Perhaps some API redesign/refactoring is needed?Fixed by #281, #269 and #270.
(internal users only:) https://sentry.matrix.org/sentry/sygnal-matrixorg/issues/201296/?query=is%3Aunresolved
The list comprehension that builds
pushkeys
might produce an empty list, in which case we get the index error as reported on sentry.This logic seems to have been here since eeea1a71a1aa70e629a05f5d85e6f7d7c3406c7a in Fed 2015. The first sentry report is from Mar 6, 2021.
Questions: