signalapp / Signal-Android

A private messenger for Android.
https://signal.org
GNU Affero General Public License v3.0
25.59k stars 6.14k forks source link

Unable-to-connect error when re-registering after factory reset #8148

Closed five-c-d closed 6 years ago

five-c-d commented 6 years ago

Bug description

Having difficulty re-registering signalapp with a phone that was recently factory-reset.

Steps to reproduce

Actual result: Mixture of "unable to connect" errors, or simply failing to receive the SMS-or-robocall with the confirmation code. Sometimes get prompted to input the six-digit-code by signalapp, sometimes do not get that far in signalapp UI/UX.

Expected result: Should not get "unable to connect" error message, should receive SMS conf-code, should make it all the way through the steps, registration should succeed properly.

Screenshots

n/a... might be able to provide them if really necessary

Device info

Device: Samsung S9+ (Model SM-965F) Android version: 8.0 stock ROM with Samsung Experience v9.0 and Android security patch level 1st August 2018 (latest available for handset) Signal version: 4.24.8

Link to debug log

n/a because the handset cannot register with signal-server ... MIGHT be able to try the adb logcat thing if REALLY necessary?

Additional potentially-relevant info

From searching, these issue-reports might be relevant:

These issue-reports are NOT likely to be relevant to this specific case:

greyson-signal commented 6 years ago

Unfortunately this is a case where a log would be super helpful (I feel like we should make log submission available somehow during registration...). If you could get the adb logcat logs after a registration failure, we should be able to diagnose what kind of error is happening here. Thanks!

five-c-d commented 6 years ago

@greyson-signal , thanks for your reply... and yes, it is pretty silly not to be able to submit debug-logs from the portion of the process where signalapp is relying heavily upon third party providers (GCM and Twilio or whatever it is now) without going through the effort to get USB debugging configured and capture the info with a programmer's tool from a laptop. Special software required, non-phone hardware required, painfully difficult steps required, no helpdocs oriented towards the masses?

Most endusers are going to simply abandon the registration process when they hit a registration-bug like this... and there is no way to distinguish actual bugs from people that just changed their mind once they found out signalapp was demanding their phone number so it could robocall them.

Can you provide me a pointer to a super-straightforward step-by-step tutorial, aimed at the masses of everyday users out there, which would walk them through how to install adb and capture a logcat? Maybe on the signal wiki somewhere? This is the best I could find...

Even taking the advice of Moxie0, there was not much helpful to the Typical WhatsApp Convert that signal.org sees as their target market, out there on the interwebz:

Outside the question of the tutorial-helpdoc, can you give me an idea how long it is necessary to wait between re-registration-attempts? There was an attempt Friday, and another attempt today, if a third attempt is made eight hours later (with adb logcat supposedly active) will that be rate-limited? If the attempted logcat capture was unsuccessful for whatever reason, and a fourth re-registration attempt is made thirty minutes after the third, will THAT be rate-limited?

greyson-signal commented 6 years ago

Yeah, often times we're simply fortunate enough that many of the bug reporters have a technical background, but I agree that we need something more user-friendly for this case. I'll look into what it would take to get the debug log submission accessible during registration, as this isn't the first time it's caused problems.

Regarding a tutorial, I think if you follow the lifehacker article, then type "adb logcat" into a command prompt / terminal, you should be good. You can just copy-paste the output here. However, be sure to check those logs and make sure they don't reveal any personal information. If the idea of following the tutorial or cleaning your logs makes you nervous, don't worry about getting the logs. Better safe than sorry.

can you give me an idea how long it is necessary to wait between re-registration-attempts

You should be able to make several registration requests (some number less than 10, but I'm not confident enough to give an exact number) back-to-back without getting limited. I often have to do this when testing things that involve fresh install. If you do get rate limited, you usually have to wait around 24 hours.

Given your description, it's unlikely that you're experiencing rate-limiting, unless there's some other bug.

five-c-d commented 6 years ago

This is likely one of those times you are going to be fortunate ;-)

My point was mostly to encourage you, or someone you know, to write the code that will allow us to click a button to generate debug-logs from the registration screen. There is already a button once registered, so I don't expect cloning THAT button and making it visible during the registration process, is going to be especially hard. (But have not written a pull request sorry)

That said, long-term there might also need to be a nice here-is-your-adb-logcat-tutorial, since installation itself does not always go smoothly, as opposed to just registration when the signalapp is already up and functional on the device.

Will let you know in the next couple of days if logcat acquisition is successful, appreciate your assistance

five-c-d commented 6 years ago

Forgot to specify that the carrier is Orange (Belgium / EU), so I wouldn't expect this to be a carrier-specific issue, as they are widely utilized. Perhaps the problem is somehow phone-model-specific or SIM-specific, in addition to having some possible carrier-related idiosyncrasy?

Tried putting the SIM-card from another handset into the problematic phone-device, which "worked"... tried putting the SIM-card from the problematic phone into another handset, which did NOT work. So the mixture of unable-to-connect and never-receiving-sms-nor-robocall-with-conf-code, does seem to be specific to this number+plan+carrier+something combination.

p.s. No debug-logs yet via adb logcat, need to "borrow" admin-permissions from IT folks to be able to install the tools on target laptop, but with luck that will get us over the hump.

five-c-d commented 6 years ago

Okay, adb logcat was a success. There is nothing jumping out that seems to be the root cause of the trouble... captured three unsuccessful registration-attempts...

09-04 12:54:11.587  7547 29455 W Conscrypt: Could not set socket write timeout: java.lang.reflect.InvocationTargetException
09-04 12:54:11.739 20998 14917 W PushServiceSocket: Opening URL: https://textsecure-service.whispersystems.org/v1/accounts/sms/code/+32XXXXXXXXX
09-04 12:55:18.134 20998 15281 W PushServiceSocket: Opening URL: https://textsecure-service.whispersystems.org/v1/accounts/voice/code/+32XXXXXXXXX

09-04 12:56:35.240 20998 15548 W PushServiceSocket: Opening URL: https://textsecure-service.whispersystems.org/v1/accounts/sms/code/+32XXXXXXXXX

09-04 12:57:45.760 20998 17161 W RegistrationActivity: java.io.IOException: SERVICE_NOT_AVAILABLE
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at com.google.android.gms.gcm.GoogleCloudMessaging.zza(Unknown Source:6)
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at com.google.android.gms.gcm.GoogleCloudMessaging.register(Unknown Source:67)
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at org.thoughtcrime.securesms.RegistrationActivity$3.doInBackground(RegistrationActivity.java:479)
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at org.thoughtcrime.securesms.RegistrationActivity$3.doInBackground(RegistrationActivity.java:468)
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at android.os.AsyncTask$2.call(AsyncTask.java:333)
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at java.lang.Thread.run(Thread.java:764)
09-04 12:57:53.564  7547 29649 W Conscrypt: Could not set socket write timeout: java.lang.reflect.InvocationTargetException
09-04 12:57:53.707  7547 29649 W Conscrypt: Could not set socket write timeout: java.lang.reflect.InvocationTargetException
09-04 12:57:53.792 20998 17923 W PushServiceSocket: Opening URL: https://textsecure-service.whispersystems.org/v1/accounts/sms/code/+32XXXXXXXXX
09-04 12:58:17.167 20998 18158 W PushServiceSocket: Opening URL: https://textsecure-service.whispersystems.org/v1/accounts/sms/code/+32XXXXXXXXX

Re: "If the idea of following the tutorial or cleaning your logs makes you nervous" ... well, not so much nervous, as TIRED. In capturing those 3 registration-attempts, the logcat textfile ended up with 50k lines aka 7 megabytes. There are a lot of things in there which probably should be scrubbed/redacted, e.g. "facebook video id" and similar, which are presumably unrelated to signalapp. To avoid the worry that I missed something... since I believe I already have your direct contact-info, is it permissible to send this to you @greyson-signal rather than post it here on world-visible github? I understand this is not the usual approach, and will not be offended, in the slightest, if you prefer to politely decline :-)

p.s. I've changed my mind, and now believe this might be related: #7230 == "Unable to complete registration [when phone has PlaySvcs but is not linked to gmail acct], closed-by-bot without getting resolved." Similar behavior with unable-to-connect error, similar logcat with conscrypt-unable-to-set-timeout and java.lang.IOException (albeit not identical), and there ARE some gmail-related-error-messages in the 7MB logcat. What does one do, to verify that one has 'authenticated in Google Play Services' properly? Maybe try installing Push Notifications Fixer from the Google Play Store, https://play.google.com/store/apps/details?id=com.andqlimax.pushfixer.noroot , which supposedly tests whether GCM is functional? Or is that not the same thing?

five-c-d commented 6 years ago

Relatively early on, there were also several of this kind of error-message in the logcat:

 
09-04 12:52:44.056  6319  6517 E InputDispatcher: channel ~ Channel is unrecoverably broken and will be disposed!
09-04 12:52:44.581  6319  6517 E InputDispatcher: channel ~ Channel is unrecoverably broken and will be disposed!
09-04 12:52:44.753  6319  6517 E InputDispatcher: channel ~ Channel is unrecoverably broken and will be disposed!
09-04 12:52:45.081  6319  6517 E InputDispatcher: channel ~ Channel is unrecoverably broken and will be disposed!
09-04 12:52:45.302  6319  6517 E InputDispatcher: channel ~ Channel is unrecoverably broken and will be disposed!
09-04 12:52:45.302  6319  6517 E InputDispatcher: channel ~ Channel is unrecoverably broken and will be disposed!
09-04 12:52:45.561  6319  6517 E InputDispatcher: channel ~ Channel is unrecoverably broken and will be disposed!
09-04 12:52:45.902  6319  6517 E InputDispatcher: channel ~ Channel is unrecoverably broken and will be disposed!
09-04 12:52:46.273  6319  6517 E InputDispatcher: channel ~ Channel is unrecoverably broken and will be disposed!

As shown in the previous comment, the first call to the whispersystems signal-servers happened under 2 seconds later in the sequence:

09-04 12:54:11.739 20998 14917 W PushServiceSocket: Opening URL: https://textsecure-service.whispersystems.org/v1/accounts/sms/code/+32XXXXXXXXX

However, logcat doesn't show any unrecoverably-broken-and-will-be-disposed errors prior to re-registration attempt#2 at 12:56:35, nor prior to re-registration attempt#3 at 12:57:45 either. Conceivably this is because the same broken channel was used during all three attempts... but that would contradict the will-be-disposed-[of] statement in the error-message. There are a couple of places where unrecoverably-broken appears in the Signal-Android issues list, both of them signalapp crashes rather than just signalapp silently getting stuck, or throwing an unable-to-connect error however:

Perhaps-interestingly, adb logcat is showing just a tilde in the textfile dump from windows 10 connected to the samsung s9+ handset, instead of the channel-owner that was visible in the 7866 and 7219 issues linked to above. Is that because the adb logcat behavior has changed, or because the channel in this scenario was unowned, or possibly because there is a setting needed to get adb logcat to display more-verbose output that includes the channel-owner?

greyson-signal commented 6 years ago

Thank you very much for the in-depth information!

This bit of the log stands out to me:

09-04 12:57:45.760 20998 17161 W RegistrationActivity: java.io.IOException: SERVICE_NOT_AVAILABLE
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at com.google.android.gms.gcm.GoogleCloudMessaging.zza(Unknown Source:6)
09-04 12:57:45.760 20998 17161 W RegistrationActivity:  at com.google.android.gms.gcm.GoogleCloudMessaging.register(Unknown Source:67)

Some research implies that these could be potential causes:

That being said, given your description, none of those seem to match. Unfortunately GCM registration can be a bit of a black box :/

If you still want to send me the full log, you can DM me on the forum. Could be useful to look it over for other leads.

five-c-d commented 6 years ago

(Direct message sent, let me know what you need me to do on this end.)

Yes, that IOException happened just prior to the third re-registration attempt... but it did NOT get thrown in the prior two re-registration attempts... thus, very unclear to me that this can be the root cause of the overall unable-to-re-register difficulty.

Like yourself though, I am suspicious some kind of GCM-or-google-related difficulty might be happening here... maybe versioning-related thanks to the factory-reset. Although prior to posting this bug-report we tried to rule out whether the device had a properly-connected-and-confirmed-goog-account, now I'm no longer sure that was fully verified. If the play store allows one to install google-drive app, and one can login to it from the samsung phone, does that guarantee googAcct/googPlay/GCM are 100% okay? Or is there more to it?

Can pretty well rule out the network-and-firewall thing, there were attempts from a lot of different locations/networks prior to the logcat-session (which I believe WAS at a location where the firewall was especially strict... but again only one IOException appeared during the three sequential re-registration attempts from behind that same firewall the whole time).

I don't think it was the clock-settings, but if it was, wouldn't we see something to that effect inside adb logcat that would indicate the difficulty... java.lang.IOException: YOU_HAVE_VIOLATED_PHYSICS_AND_TRAVELLED_THROUGH_TIME or somesuch?

But yeah, this is a pretty maddening issue. My most likely culprit would be some kind of carrier-specific difficulty... but Orange Belgium/EU subscribers with samsungs, that is a pretty large number of people! We should be seeing more complaints than just one, if it were true that nobody with that carrier was able to register signalapp anymore with their +32.xxx.xxx.xxx telephone number, I would think.

five-c-d commented 6 years ago

Latest troubleshooting attempts:

  1. purchased new SIM-card to eliminate possibility it was some kind of low-level electrical difficulty, did not help
  2. when visiting a friend physically, borrowed their handset temporarily (different model samsung), inserted new SIM-card into their handset, tried re-registering, did not help
  3. visited phone-repair store to complain that carrier was not supporting signalapp properly, convinced two different phone-store employees to install & register signalapp on their devices as a test-case, one of them was unable to register but the other one WAS able to install&register successfully

So almost certainly the problem is carrier-related (since phone-store employee#1 was also unable to register), but just as certainly the problem is intermittent or somehow handset-configuration-dependent, since phone-store employee#2 was able to get it to work.

Is there some kind of super-verbose-tracelog-APK-version of signalapp that can be sideloaded via adb, or whatever, that will help track down this annoying failure more fully? Doubtful that the phone-store employees will be willing to do that, of course, so we probably cannot capture a tracefile of the successful registration, but mayhap a more detailed tracefile of the unsuccessful re-registration attempt will lead to some progress narrowing down the root cause.

five-c-d commented 6 years ago

Okay, resolution of some of the SMS-nor-robocall never arrives difficulties (albeit this does not explain unable-to-connect errormessages which also intermittently would occur).

  1. finally got OrangeBelgium tech-support rep on the phone (calling them unencrypted-cellular via the troublesome s9+ device), and while they were on the line, once again attempted to re-register signalapp thereupon.

  2. OrangeBelgium was able to see that the sms-conf-code generated on signal-server-backend WAS arriving into their network.

  3. however, the SMS was never delivered to s9+ because on the s9+ device at some earlier point, there was a reply saying 'STOP' sent upstream to some sort of SMS adverts-spam (not signalapp-related SMS but something else entirely!) which had been arriving via CM Telecom... and OrangeBelgium therefore no longer delivered any signal-server-generated SMS to that s9+ device, nor incoming robocalls apparently... despite those presumably not actually coming from the same SMS-sender that was 'STOPped'?? Don't understand fully. Possibly the issue was that signal-server was utilizing twilio on the backend, and twilio was in turn using CM telecom to subcontract their SMS-blasts-into-benelux (hypothesis based on "verification SMSes are from German phone numbers used by twilio for German registrations" comment), and blocking one CM telecom sms-advert-thing blocked EVERY communication from CM telecom (and as a side-effect from twilio-and-signal-server?). OrangeBelgium tech advised filing a support-request with https://help.cmtelecom.com/ but that advice has not produced any feedback as yet

  4. the immediate s9+ problem thus was 'solved' by the OrangeBelgium tech-support person intercepting the unencrypted SMS-conf-code on the carrier-side, and then verbalizing the digits over the live unencrypted voice-call, after which the s9+ was able to successfully re-register for signalapp. (It was not necessary to unSTOP the CM Telecom number which had been previously STOPped, because of this workaround.) So at present, signalapp is working on the s9+ for now... and should remain working as long as re-registration is not needed again at some point in the future, with any luck.

If there are more details that ought to be supplied here, please let me know. But it sounds like the fundamental issue is not internal to signal4android, nor even to signal-server codebase, but is more of a difficulty with the way one fifth-party carrier (orangeBelgium) and/or one fourth-party competitor to Twilio (cmTelecom) were treating the STOP reply to an unrelated bunch of SMS alerts... which ended up getting applied over-broadly, and causing not just those SMS alerts to get undelivered, but also the auto-generated signal-server conf-codes sent via third-party twilio from the USA to Belgium.

Dunno how widespread this problem is, or whether it would even be visible in the adb logcat tracefiles, but according to a bit of internet-sleuthing CM Telecom is a Netherlands-based firm... with customers in their own country, Belgium, Luxembourg, GlobalMessaging subsidiary in the UK, branch office in France, some stuff in South Africa, et cetera -- https://en.wikipedia.org/wiki/CM_Telecom plus a few more. Depending on whether this particular STOP issue was a rare quirk that was dependent upon the combination of s9+ device && orangeBelgium carrier && cmTelecom STOP reply, it might be a tiny thing... or it might be a small-but-steady-drain-like-a-vampire on successful signalapp registrations and re-registrations in half a dozen countries. Not sure there is even a way to TELL whether this (specific scenario) is happening or not, since from signal-server's viewpoint it will look like the SMS-conf-code went out but the recipient "changed their mind" about installing signalapp. Unless perhaps there is a blocked-or-stopped-or-something type of status message, that was getting returned from the s9+ or from orangeBelgium or from cmTelecom, which could be processed on the twilio-backend somehow?

On the assumption that this is likely to be a carrier-backend problem that signalapp nor signal-server can even detect reliably (let alone DO anything about), closing this for now as it is 'not a bug' in signal4android, although it is definitely a bug in SOME telephony-configuration-server that the registration flow depends upon, in at least one carrier-plan in one country. My suspicion/hypothesis motivating the close-issue here, is that there was some kind of bug -- perhaps purposefully put there but a bug nonetheless methinks -- in the OrangeBelgium router-and-network configuration where they misinterpreted 'STOP' of one series of CM Telecom texts, to mean not merely stop-sending-texts-about-this-one-subject, but rather the far broader stop-sending-me-calls-or-texts-from-a-bunch-of-services-including-ones-like-signalServer-in-the-USA.

It WAS still possible to register WhatsApp onto the s9+ device, despite the STOP reply which was inadvertently preventing signalapp re-registration, so the problem was not that the entire set of all USA-originating SMSes were 'STOPped' but rather something narrower. Feel free to re-open, if there IS anything that can be done to mitigate or eliminate the difficulty, and allow signalapp in Belgium to compete with WhatsApp when it comes to getting all the way through the registration-flow (despite whatever SMS-dependency-related headaches might occur).

p.s. On that note... I've also posted a feature-request into a new forum-thread which might prevent this situation (and similar kinds of able-to-dial-outbound-but-unable-to-receive-inbound scenarios). In addition to the automated voice which robocalls the recipient's device and verbalizes the conf-code to them audibly, it might make sense to permit the recipient to alternatively dial up something like +1-555-555-5555 extension 55, which can detect the number they are calling from using ANI/CID, and then have the IVR system which picks up the call verbalize the conf-code. This has the downside that (from what I understand) it is easier for a malicious adversary to spoof ANI/CID than it is for a malicious adversary to hijack an inbound-SMS-conf-code, but for the especially-security-conscious endusers I believe the registration-pin-lock would help mitigate this worry.

greyson-signal commented 6 years ago

@five-c-d Thank you for spending so much time digging into this! I really appreciate the effort, and we can hopefully use this as a jumping-off point for an investigation towards a long-term fix.

ghost commented 5 years ago

My cousin in India is on Airtel and has not received either OTP or call when trying to register for Signal.

five-c-d commented 5 years ago

@manasb , can you please have your cousin post something (or ask them to provide you the info so you can post on their behalf), over in this forum-thread which is specifically about Airtel problems == https://community.signalusers.org/t/registering-in-india-airtel/4684

You can login to the signalUsers site with your github credentials, if you would like. There are several people in that thread who speak Hindi, also, if that is helpful. Although I am not one of them :-)

The most helpful thing that your cousin can provide, if they are trying to register with signal-for-android, is to capture a debuglog about 90 seconds after trying the Call Me Instead option. The steps are as follows:

There are a limited number of attempts which are possible, before the signal server will rate-limit the repeated registration attempts (with error HTTP 413). Similarly, to properly register you need to enter an E.164-format international telephone number, not the local India-specific calling format. And of course, if you have a typo in your phone-number when you enter it, you will never receive the SMS-nor-robocall :-) So before capturing a debuglog, it is best to

Most of the Airtel problems were fixed on the signal-server and Twilio backend (a dependency of signal-server) back in October/November, so I am curious why your cousin is having difficulties. The debuglog should tell us what is going awry, I hope.

See also, the India-related github issue #7980 -- this issue 8148 we are talking within, is mostly about registration-problems in Belgium instead. It is okay that you posted here, because I happened to notice your comment, but usually it is best to either create a new github issue rather than comment on a closed issue (unless you want to "reopen and comment") to make sure that your bug-report is actually noticed by somebody.