signalapp / Signal-Android

A private messenger for Android.
https://signal.org
GNU Affero General Public License v3.0

Delayed/missing push messages #970

Closed by fd0 8 years ago

fd0 commented 10 years ago

When messaging a friend of mine via push, most messages are delayed by a few hours. Background data is enabled. Sometimes messages are not delivered at all, or arrive in a different order than they were sent. This also happens with messages other people send him, although I can exchange messages with these people without problems (delivered within a few seconds). He runs a Nexus 5 with the latest stock Google Android and is perfectly reachable via Hangouts the whole time.

Any idea how to debug this?

lablans commented 10 years ago

I can confirm this problem, see #937. Please correct me if I'm wrong: GCM works on a "best effort" basis. Messages can arrive at any time in any order. And it seems that this is not just an academic problem, it's actually happening.

We need some "server notices that a message has not been delivered and resends it" mechanism. Maybe this is possible with the delivery reports (#957)? Resending means, however, that the client would need a mechanism to deal with the resulting duplicate messages (those would be two GCM messages, so it would need to detect the duplicate content). But the client needs that anyway, see #937.
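
A minimal sketch of the client-side duplicate detection mentioned above, in plain Java. Everything here is an assumption for illustration: the `DuplicateFilter` class, its method names, and the idea that each message carries a sender-assigned timestamp are hypothetical and not taken from the TextSecure code. Fingerprinting sender, sent timestamp, and body together (rather than the body alone) avoids dropping a legitimate second "ok" sent later.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical receiver-side filter that drops messages it has already seen. */
final class DuplicateFilter {
    private final Map<String, Boolean> seen;

    DuplicateFilter(final int capacity) {
        // Insertion-ordered map that forgets the oldest fingerprint once full.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns true only the first time a (sender, sentAtMillis, body) triple is seen. */
    synchronized boolean isNew(String sender, long sentAtMillis, String body) {
        return seen.put(fingerprint(sender, sentAtMillis, body), Boolean.TRUE) == null;
    }

    /** SHA-256 over the three fields, hex-encoded. */
    static String fingerprint(String sender, long sentAtMillis, String body) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(sender.getBytes(StandardCharsets.UTF_8));
            digest.update((byte) 0); // separator so field boundaries stay unambiguous
            digest.update(Long.toString(sentAtMillis).getBytes(StandardCharsets.UTF_8));
            digest.update((byte) 0);
            digest.update(body.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

With something like this in place, a server-side resend would only need to reuse the original sender timestamp for the second copy to be dropped on arrival.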

fd0 commented 10 years ago

Issue #1029 may also be caused by this.

Certhas commented 10 years ago

Edit: This was a user error. Leaving the following up for reference.

I am having this issue as well. Alice's messages go through immediately to Bob, but Bob's messages get delayed or don't arrive (I've been testing for two days; delays are a few hours), though they are shown as sent on Bob's phone. This happens with both phones on WiFi, and with both phones on mobile data.

This also caused a confusion of encrypted sessions that are ended "simultaneously" (during the delay), which led to lots of "bad encrypted message" errors, and the inability to reestablish a secure session.

This could only be solved by reinstalling the app. After reinstalling the delayed messages were delivered immediately (but as "bad encrypted messages" obviously).

Devices are Nexus 4 and Nexus 5 with stock Android.

This makes TextSecure push completely broken. Multi hour delays/delivery failures, while both parties are online, without the sender being notified is just the worst kind of broken for a communication program.

rbieb commented 10 years ago

#1038 might be related.

joshproehl commented 10 years ago

Having exactly the same problem as Certhas, for multiple days now. Everything works fine when forced to SMS-only, so the problem appears to be in the transport, rather than the app or the phone. Happy to provide any debugging, logs, etc if needed.

behrmann commented 10 years ago

I too can confirm this, running on an HTC Desire Z with Android 2.3.3 and TextSecure 2.0.4.

Messages arrive instantaneously from Alice to Bob, but do not seem to arrive when sent from Bob to Alice, even though they are shown as sent. The initial key exchange works fine, and the exchange of encrypted messages via SMS also works fine.

I would very much like to see this problem solved, since it keeps me from spreading TextSecure among my friends.

rbieb commented 10 years ago

Is the background of your messages blue as well? It seems to be green if the message was actually received, but mine all have a blue background.

behrmann commented 10 years ago

@Nordic89: Green messages are sent via SMS and blue messages are sent via push. As I described, everything works for SMS text messages, but push messages (the blue ones) only arrive immediately when sent from Alice to Bob and not when sent from Bob to Alice.

Certhas commented 10 years ago

So in my case it turned out to be a user error. Alice had "Restrict background data" checked for the Google services app (the tricky bit being that it still transfers some data and thus doesn't actually display as restricted in the list of apps). Alice had not been using any chat applications but was relying on SMS before.

While this is a user error, it's fairly non-obvious. Some online tutorials suggest doing this for privacy/data/battery reasons (as Google services has been known to be buggy at times). Alice presumably did this without being aware that Google services would have anything to do with delivering messages in a superficially unrelated SMS replacement app.

I don't know if this is technically possible, but it would be great if TextSecure could pop up a warning box if it detects that background services are restricted.

rbieb commented 10 years ago

@behrmann Ah, that makes sense. My green messages always arrive and the blue ones don't, so it's definitely a problem with push messages.

moxie0 commented 10 years ago

@Certhas I'm not sure if that's possible to detect, but it'd be cool if you did some research to see whether there's an API for that and reported back.

wickedshimmy commented 10 years ago

The recommendation from ICS onward seems to be to use something like the following if you're running in the foreground or the background, but that's not really helpful unless it's TS code running. AFAICT it will take into account the global "Restrict background data" state, but I can't find any way to query it for a specific other app (like the @Certhas case of Google services being explicitly disallowed from using background data).

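import android.content.Context;
import android.net.ConnectivityManager;
import android.net.NetworkInfo;

// `context` is any Context (Activity, Service, ...); needs the ACCESS_NETWORK_STATE permission.
ConnectivityManager connectivityService = (ConnectivityManager) context.getSystemService(Context.CONNECTIVITY_SERVICE);
NetworkInfo network = connectivityService.getActiveNetworkInfo();
if (network != null && network.isAvailable()) { // or network.isConnected() here
    // background data should not be restricted
}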

I wonder if this is another thing that can just be addressed by better messaging around the push transport (you need to have Play services/GCM running and able to access the network in the background, etc.).

fd0 commented 10 years ago

Today it happened to me again: yesterday evening I sent a push message to a colleague while his phone had no data connection. It was delivered about 10 hours later, this morning.

I'm curious now: What service level does GCM offer? For example, does it guarantee delivery? What about caching, why does it take so long to deliver a message after a device regained network connectivity?

tinloaf commented 10 years ago

@wickedshimmy The problem is that I don't see how to differentiate between "no internet connection" and "background data restricted" with that code. Do you know if any of these checks explicitly only fails if background data is restricted?

wickedshimmy commented 10 years ago

No, that was sort of my point -- since ICS you can't. If you query the network state, you can get yes/no, but not why. Even the detailed info doesn't expose that setting on its own. That would work for either foreground or background services (if I can't send because the network isn't available, alert). The problem here, though, is that the background Play services process isn't running any TS code, so there's no place to make that check and warn about what happened. And in cases like the one above, availability can differ from app to app even with the same network info and permissions, so checking this ahead of time or from TS is also no good. There might be another API to use, but I don't think this one will work. Sorry for being unclear!

behrmann commented 10 years ago

Is it confirmed now that the problem is with restricted background data? I just wonder, because I have not restricted background data and have been both the party that does not receive push messages and the party that sends push messages that are not received (with two different communication partners).

joshproehl commented 10 years ago

I have confirmed that neither of the devices I'm having trouble with has restricted background data enabled. The one that is having the most trouble sending/receiving does wander through areas of reduced cell service, but messages created once it's back in a good service area still are not sent until much later, all with no indication to the sending user that the message has not been delivered.

rbieb commented 10 years ago

I don't have background data restricted, yet still can't send any messages.

JavaJens commented 10 years ago

@Nordic89 can you not send messages at all, or are they heavily delayed?

tinloaf commented 10 years ago

By the way: Restricting background data should never interfere with sending messages. If you can't send messages, something else is wrong...

rbieb commented 10 years ago

@JavaJens SMS works just fine, but I can't send any messages whatsoever using data. They are not just delayed, they never arrive. I receive messages just fine.

behrmann commented 10 years ago

@JavaJens: Sending never seems to be the problem (at least not in the cases I witnessed), but very late delivery or no delivery of messages is, and it always seems to be a one-sided problem.

JavaJens commented 10 years ago

But in his comment @Nordic89 clearly said "I can't send any messages", that's why I was asking, because these should be unrelated. The idea behind my question was: is something wrong with Nordic's setup, e.g. do you get any errors leading to your assumption that sending doesn't work, or are all of your recipients just not receiving? @tinloaf hence my question :) Maybe I didn't express myself clearly enough, sorry.

rbieb commented 10 years ago

@JavaJens That's right. I can not send any messages (via push, that is). I'm registered at the server and I don't personally see anything wrong when I try to send a message, but they just don't ever arrive.

JavaJens commented 10 years ago

Could you provide a debug log? Maybe we can see an error message in the logs.

rbieb commented 10 years ago

Don't know if that helps. http://hastebin.com/losawogiji I just tried to send another message, but I don't exactly know what the log logs, so it might not be included. If you need anything else just tell me what to do.

fd0 commented 10 years ago

For the record: the people who experience this behaviour with messages sent by me all have background data enabled and have had a data connection for hours before the message is delivered.

Certhas commented 10 years ago

@fd0, does Hangouts chat behave the same way, or is this specific to TextSecure?

phrag commented 10 years ago

I am also experiencing this issue frequently, to the point of not using the app any more because I cannot guarantee my messages are received on time. Messages are sometimes delivered hours later. I have only experienced this with one recipient, but it has been happening for the past 5 days. I tried deleting the conversation, restarting encryption, and refreshing the push directory.

Update: problem was with my ROM's GCM service not TextSecure. Apologies for the confusion.

tinloaf commented 10 years ago

Did this recipient verify that other GCM apps work fine? Google Hangouts or the like?

rbieb commented 10 years ago

My contacts and I have been using Hangouts just fine. The error only happens with TextSecure. WhatsApp works as well.

fd0 commented 10 years ago

I can confirm that Hangouts works very well for this specific contact; it is his main communication channel at the moment.

moxie0 commented 10 years ago

@phrag FWIW this is an issue with GCM, so nothing you do in the TextSecure client will improve the situation. For some reason, GCM messages aren't getting to your device. We need to figure out why.

bungabunga commented 10 years ago

exactly the same issue here - a few hours delay of push messages.

phrag commented 10 years ago

@moxie0 indeed this seems like an issue with my GCM service and not with TS... I was not using any other GCM service, so I could not identify it easily. Apologies for getting the two mixed up. A delivery verification feature would be great to combat this issue. For the record, I was running a CM-10.1 snapshot that seemed to have an issue with connections on WiFi and 4G. I've since rolled back and that appears to have fixed the problem. Thanks for the reply and feedback, much appreciated =)

hughobrien commented 10 years ago

I've also seen some delays of a few hours, though I think it was due to the Google connection being disconnected for an unknown reason. Could TextSecure provide a means of alerting a user that it hasn't been in contact with GCM for a period of time? A 'force resync' or regular service tick perhaps?
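
A minimal sketch of such a "no contact for a while" check, in plain Java. The `PushLivenessMonitor` class and its method names are hypothetical, and the sketch assumes the server sends a regular tick over the push channel; without one, a long silence could simply mean nobody wrote. Timestamps are passed in explicitly so the logic is independent of any particular clock or Android API.

```java
/** Hypothetical watchdog: flags the push channel as stale after too long a silence. */
final class PushLivenessMonitor {
    private final long maxSilenceMillis;
    private long lastContactMillis;

    PushLivenessMonitor(long maxSilenceMillis, long startedAtMillis) {
        if (maxSilenceMillis <= 0) {
            throw new IllegalArgumentException("maxSilenceMillis must be positive");
        }
        this.maxSilenceMillis = maxSilenceMillis;
        this.lastContactMillis = startedAtMillis;
    }

    /** Record any push arrival: a real message or a server-sent tick. */
    synchronized void onPushReceived(long nowMillis) {
        lastContactMillis = Math.max(lastContactMillis, nowMillis);
    }

    /** True when nothing has arrived for longer than the allowed silence. */
    synchronized boolean isStale(long nowMillis) {
        return nowMillis - lastContactMillis > maxSilenceMillis;
    }
}
```

The app could call `isStale` from a periodic job and show a warning or trigger a manual resync when it returns true.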

lablans commented 10 years ago

Just an observation of one cause of delayed push messages: Android seems to turn off GCM when the free space on /data is below a certain threshold. This seems to be a safeguard not to run out of memory. You also won't receive Gmail, SMS, etc. Once you make room, TextSecure receives the messages within a few seconds.

Please note: "Storage space" refers to the (small) data partition, not the (large) SD card. This is easy to confuse on older devices. So you can't "make room" by deleting MP3s or photos, but have to uninstall apps.

moxie0 commented 10 years ago

Can anyone experiencing this problem try installing an app like this on the device that is receiving delayed messages: https://play.google.com/store/apps/details?id=com.andqlimax.pushfixer.noroot

We can replicate its functionality if it helps with the problem.

bungabunga commented 10 years ago

great! will send this link to a contact i have push problems with and do some testing.

eikowagenknecht commented 10 years ago

I have kind of the same problem here. Just had a conversation with someone over TS where almost every second message did not arrive or arrived way later (~10min?). This happened both ways, so I did not receive his messages in time and he did not receive mine in time, which became very confusing after a while when half of the messages that arrived were from about ten minutes ago.

@moxie0 I installed the app you mentioned and it seems like it helps a bit and reduces the time until the previously not transmitted messages do arrive.

What still is very confusing, though, is that the messages appear in an order that can only be described as random and that the timestamp shows the time when the message is received, so I don't even know which messages are late.

I think a "message delivered" checkmark and a timestamp when the message was sent would be good countermeasures, along with some means to display delayed messages correctly in the UI (between the other ones, maybe with an additional highlight to show that the messages came in way too late).
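
A minimal sketch of the display logic suggested above, in plain Java: sort the conversation by the time each message was sent and flag the ones that arrived late. The `ConversationOrdering` and `Message` names and the threshold parameter are hypothetical. The sketch also assumes the sender's timestamp travels with the message and that sender and receiver clocks roughly agree, which is not guaranteed in practice.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper for showing messages in sent order and marking late arrivals. */
final class ConversationOrdering {

    static final class Message {
        final String body;
        final long sentAtMillis;     // assigned by the sender
        final long receivedAtMillis; // assigned on arrival

        Message(String body, long sentAtMillis, long receivedAtMillis) {
            this.body = body;
            this.sentAtMillis = sentAtMillis;
            this.receivedAtMillis = receivedAtMillis;
        }

        /** True when the message spent longer than the threshold in transit. */
        boolean isDelayed(long thresholdMillis) {
            return receivedAtMillis - sentAtMillis > thresholdMillis;
        }
    }

    /** Returns a copy sorted by sent time, using arrival time to break ties. */
    static List<Message> inSentOrder(List<Message> messages) {
        List<Message> sorted = new ArrayList<>(messages);
        sorted.sort(Comparator
                .comparingLong((Message m) -> m.sentAtMillis)
                .thenComparingLong(m -> m.receivedAtMillis));
        return sorted;
    }
}
```

A UI could then render `inSentOrder(...)` and add a highlight to every message for which `isDelayed(...)` is true.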

lablans commented 10 years ago

@phenx-de please see #937.

joshproehl commented 10 years ago

Confirmed that other services (Hangouts) experienced no delays in messaging. However, after the last TextSecure update I re-enabled TextSecure push notifications on both devices and have not experienced any delayed messages since.

gartenriese2 commented 10 years ago

I can't receive any push messages either. SMS work just fine and I can send push messages, too. WhatsApp works, too, so it shouldn't be a problem with my phone?

gartenriese2 commented 10 years ago

Never mind, it was because of a Titanium Backup. See https://github.com/WhisperSystems/TextSecure/issues/1167

bungabunga commented 10 years ago

@moxie0, we tested the solution you proposed earlier (pushfixer app) with a contact of mine who had problems with delayed messages. She seems to have received push messages reliably over the last 11 days (since installing the app). However, I cannot do really heavy testing with this contact, so test reports from others would be welcome too.

tinloaf commented 10 years ago

What does the pushfixer do? I guess it somehow triggers some server to send an empty GCM message to itself at given intervals, right? Implementing this ourselves would of course drain the battery a bit (depending on the interval), and would also require some changes in the TS servers...

rbieb commented 10 years ago

The whole point of using GCM is that TextSecure doesn't have to deal with stuff like that. It really should work without any changes.

tinloaf commented 10 years ago

It should, but if it does not, there's no use saying "it should". ;) Also, sending a heartbeat (as we all know, crypto-software loves heartbeats) to all the devices that enabled the "fix my push" option every 30 seconds or so is a lot easier than building something like GCM yourself. GCM can actually broadcast messages to a lot of recipients with a single request to Google's servers, so this should not be too resource-hungry on the TS server side.

behrmann commented 10 years ago

@moxie0: I tested the pushfixer app (so far only with the default heartbeat interval of that app) and for me it does not seem to work. A friend of mine and I both installed the push fixer and he sent me two messages, neither of which I have received yet, after waiting for more than 24 hours.

ykarm commented 10 years ago

How about using a similar method to what Threema uses (I know, I know, I'm sure you're all fed up of hearing "Threema", but isn't this the benefit of having competition)?

Instead of sending the entire message by GCM, how about sending the message to the TS server, and sending via GCM merely a notice to the receiver's device that there is a new message? Using this method, the message will only be delayed by the amount of time that the user has not connected to the TS server. If the user receives no GCM notice for one particular message, the message will be received the next time the device asks the TS server whether any new messages have arrived, which may be the next time they open the app, or when some later notice via GCM arrives, etc.

Clearly GCM is a good option in most cases, but it seems to be fairly unreliable in others. Surely this is a good compromise?

Also: this way, it would also be possible to send messages via data completely independently from GCM, as a user could decide to "refresh" manually, or set up their own "heartbeat" (eg every 30 mins) as you call it, in the options menu.
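
A minimal sketch of that notify-then-fetch idea, in plain Java. All names (`PendingMessageSource`, `InboxSync`, `InMemoryQueue`) are hypothetical and the "server" is an in-memory stand-in, so this shows the control flow only, not how the TS server or GCM actually behave. The point is that a push notice, an app open, and a manual or timed refresh all trigger the same fetch, so a lost notice only delays a message until the next fetch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical server API: returns queued messages and removes them from the queue. */
interface PendingMessageSource {
    List<String> drainPending();
}

/** Client-side inbox where a push is only a hint to go and fetch. */
final class InboxSync {
    private final PendingMessageSource server;
    private final List<String> inbox = new ArrayList<>();

    InboxSync(PendingMessageSource server) {
        this.server = server;
    }

    /** Called when a payload-free push notice arrives. */
    void onPushNotice() {
        sync();
    }

    /** Called on app open, manual refresh, or a user-configured timer. */
    void onRefresh() {
        sync();
    }

    /** Snapshot of everything fetched so far, in fetch order. */
    synchronized List<String> messages() {
        return Collections.unmodifiableList(new ArrayList<>(inbox));
    }

    private synchronized void sync() {
        inbox.addAll(server.drainPending());
    }
}

/** In-memory stand-in for the server queue, for illustration only. */
final class InMemoryQueue implements PendingMessageSource {
    private final List<String> queued = new ArrayList<>();

    synchronized void enqueue(String message) {
        queued.add(message);
    }

    @Override
    public synchronized List<String> drainPending() {
        List<String> out = new ArrayList<>(queued);
        queued.clear();
        return out;
    }
}
```

For example, if two messages are enqueued and only the second notice reaches the device, a single `onPushNotice()` call still delivers both, because the fetch drains the whole queue.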

Is there a reason why the whole messages are sent via GCM? It seems to place a lot of trust in the service...