Open kieranlane opened 1 year ago
For reference this is the rageshake logs: https://github.com/matrix-org/element-android-rageshakes/issues/50380
Most recent logs following patch: logcat4.log
Hi @bmarty,
could you have a look at this issue today? It would be great if we could offer the client a fix until Monday because I am meeting the client on Monday in person. Thanks. I appreciate your work on this issue.
Best wishes,
Moritz
@kieranlane I cannot find the words "Fail to setPusher" in the file logcat4.log.
We suspect the Worker AddPusherWorker never starts because it has a constraint to start only when the network is available.
Other Workers have this constraint, and so can fail:
What can be done to try to unblock quickly is to remove the Network constraint, at least to check that this is the source of the issue you are trying to fix. Here is a patch with this: noNetworkConstraintOnAddPusherWorker.patch
For a long-term solution, more work would be necessary to fix all the identified issues.
Thanks @bmarty, I've tested this on three devices (Pixel 6 / 2 * Pixel 2XLs) and the fix works, the endpoint is now successfully registered. However there does appear to be a quirk to the notifications actually coming in:
Presumably this is caused by one of the other things that can fail, most likely "initial sync request, if the first one fails because of network error." and "background sync when a Push is received".
I can confirm that attempting to send an attachment just sits at 'Waiting' forever, as expected.
@bmarty - just adding that another airgapped customer has now run into this, specifically with media not being sent via the app within their airgapped environment. (Believe that is likely due to the Network constraint) - perhaps this issue should be renamed to be more generalised now the source of the problem has been identified?
Perhaps something like Network Constraint: On Add Pusher Worker - breaks core functionality within Airgapped environments
By default, Android uses these endpoints to check connectivity:
You can set up your DNS and a web server to respond with 204 to one of these (HTTP) requests; your devices will then behave as if the network is available, and Element will work properly.
Thanks @p1gp1g , on our side we had identified and tested:
So maybe worth also testing https://www.google.com/generate_204
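The 204 workaround described above can be sketched as a tiny web server that answers every request with an empty HTTP 204. This is only an illustration, not the setup used in this thread: the helper name `make_server` and the bind address are hypothetical, the DNS override that points the probe hostname at this machine is not shown, and an HTTPS probe would additionally need a certificate the device trusts.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class Generate204Handler(BaseHTTPRequestHandler):
    """Reply to any GET or HEAD with an empty 204 No Content response."""

    def _reply_204(self):
        self.send_response(204)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # The path (for example /generate_204) is ignored in this sketch.
        self._reply_204()

    def do_HEAD(self):
        self._reply_204()

    def log_message(self, format, *args):
        # Silence the default per-request logging.
        pass


def make_server(host="0.0.0.0", port=80):
    """Build (but do not start) the server; binding port 80 usually needs elevated privileges."""
    return HTTPServer((host, port), Generate204Handler)
```

Calling `make_server().serve_forever()` on the host that DNS resolves the probe hostname to would start it.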
CC @giomfo @simaddis
Steps to reproduce
Issue History so far:
Initial Query
Hi, I'm trying to confirm the process for setting up NTFY within an airgapped environment with the app. I think I'm almost there. I believe the issue is with having the NTFY push gateway registering within the app. (Details within thread)
Setup is as follows: (Note all using wildcard certs issued by CA on Server1 - Android app has CA.pem installed)
(192.168.0.1)
(192.168.0.2 as DNS)
(192.168.0.2)
example.com / synapse.example.com / element.example.com -> 192.168.0.2
ntfy.example.com -> 192.168.0.4
https://example.com/.well-known/matrix/server sets m.server -> synapse.example.com:443
192.168.0.1 -> 192.168.0.4 in IP whitelist
(192.168.0.4)
(https://ntfy.example.com)
(192.168.0.2)
ntfy
https://ntfy.example.com
im.vector.app
I believe the issue is that there are still no registered push gateways listed within the Element app. (Advanced Settings -> Notification Targets says "No registered push gateways".) Going to Notifications -> Troubleshoot Notifications I get the following:
FYI Running the troubleshooter and immediately going to the home screen (so it's no longer the active app) causes the Test Push to fail. (NTFY doesn't receive any notification / etc.)
NTFY application is happily connected to the created topic (and a test one, which does correctly receive notifications)
Troubleshooting with @MatMaul
Server 1 is the DNS server, and pings to ntfy.example.com work successfully (it actually also self-hosts an NTFY web client so can be sure it's working from Server 1)
Initially trying to troubleshoot in the on-prem support, was told "It is the responsibility of Matrix clients to register "pushers" with their homeserver. The pusher contains the URL to the push GW to use" - So I guess I could try registering a pusher using the api manually, see what comes back
Those are the logs (I triggered the troubleshooter at 13:36) - I guess the synapse.storage.databases.main.event_push_actions are the things related to the registration? I'm not seeing anything immediately obvious as to why it fails.
Sure, will also clear the cache / storage of both NTFY and Element apps and try from the beginning for good measure.
That's the logs from the point of first login -- most is just registering the new device etc. -- nothing matching /_matrix/client/v3/pushers/set though. So I guess that means the server isn't seeing such a request come in? Checking manually, going to https://synapse.example.com/_matrix/client/v3/pushers/set works at least from Server 1 / 2 and the phone.
Can access that from all - actually tried again to double check and get the following:
It's doing /_matrix/client/r0/pushers, not /_matrix/client/v3/pushers, but it's an actual request in that direction so probably closer to what you were expecting to see.
Went on a tangent here about NTFY server version / apk package name, which were ultimately ruled out as potential issues in the process.
Rageshake Issue / Logcat tests
Thanks for all the help on this so far!! I've gone through and done a test and filtered the logs from Synapse, NTFY and the logcat to between 19:22 -> 19:23 (where I ran through all the steps). I'm collating everything within https://github.com/matrix-org/element-android-rageshakes/issues/50380. Hopefully there's something in there that explains what's going on. Also going to dig around with the Admin API to list the pushers etc. to see if there's anything at all listed for the user.
I'm hoping the above has enough information within to help figure out what's going wrong with the registration. I manually merged all the logs into a combined log ordered chronologically -> https://github.com/matrix-org/element-android-rageshakes/files/10902597/combined_and_ordered.log
Troubleshooting with @bmarty
I filtered the logcat based on the package name - hopefully that's okay. logcat.log
So for reference, I adb shell'd on, then did logcat | grep im.vector.app
Perhaps I was too hasty ending the logcat? logcat2.log
That one I doublechecked it appeared in NTFY afterwards too - which it did, and left a little longer after. Process followed is:
To cover all bases (and attempt to have a complete logcat that's not light-years long), I set up AOSP on an old Pixel 2 XL (so hopefully with as little else going on in the background as possible, unlike my personal phone) and did a run-through from scratch. Below is that logcat. I didn't filter on application name or anything and I followed these steps:
logcat.log
That's how I was previously changing when troubleshooting with MatMaul - but can give it a try like that and send over the logcat - see if it throws up anything different / useful.
Just to confirm: the problem is that the HS does not notify the Push gateway, right?
Erm I'm not sure on the specifics, possibly, the only concrete error message I have is the troubleshoot flow stating that "Failed to register endpoint token to homeserver: Unknown Error"
I can at least confirm from this https://unifiedpush.org/users/troubleshooting/ that the issue isn't with the distributor (i.e. NTFY) as I successfully receive a Unified Push notification using the UP-example app
Clicking the 'Reset Notification Method' button that appears - just presents the change notification method prompt. Clicking NTFY again and re-running tests has the same error unfortunately
Confirmation of Synapse -> NTFY -> Phone flow works via Client-Server API
So fairly sure now that the Home Server and NTFY are working OK and it's something the app isn't doing right.
Using the ClientServer API I have registered a new pusher with this cURL command:
I then sent a message from computer user to notify user. NTFY receives the notification, and publishes to https://ntfy.example.com/example (the push key). I have manually subscribed to the ntfy.example.com/example topic and receive on the phone the JSON notification string.
This method above obviously doesn't use the UnifiedPush "format", so what I get on the phone is the raw notification JSON, but as far as the overall flow goes (registering a pusher to the homeserver, the homeserver pushing successfully to the gateway, and the gateway sending to a device), it works.
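For readers who want to repeat the manual registration, here is a minimal sketch of what a set-pusher request to the Client-Server API can look like. It is not the exact cURL command used above. The homeserver URL and push key are taken from this thread; the access token, the `app_id`, the display names, and the helper name `build_set_pusher_request` are hypothetical, and the gateway URL assumes NTFY serves the Matrix push gateway path `/_matrix/push/v1/notify`.

```python
import json
import urllib.request


def build_set_pusher_request(homeserver, access_token, pushkey, gateway_url, app_id):
    """Build (but do not send) a POST to /_matrix/client/v3/pushers/set for an HTTP pusher."""
    body = {
        "kind": "http",
        "app_id": app_id,                               # hypothetical value, pick your own
        "pushkey": pushkey,                             # here: the NTFY topic URL
        "app_display_name": "Manual test pusher",       # hypothetical label
        "device_display_name": "Manual test device",    # hypothetical label
        "lang": "en",
        "data": {"url": gateway_url},                   # where the homeserver sends pushes
    }
    return urllib.request.Request(
        url=f"{homeserver}/_matrix/client/v3/pushers/set",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_set_pusher_request(
    homeserver="https://synapse.example.com",
    access_token="ACCESS_TOKEN_PLACEHOLDER",            # placeholder, not a real token
    pushkey="https://ntfy.example.com/example",
    gateway_url="https://ntfy.example.com/_matrix/push/v1/notify",  # assumed gateway path
    app_id="com.example.manualtest",                    # hypothetical
)
```

Passing `request` to `urllib.request.urlopen` would actually send it. In an airgapped setup with a private CA, an SSL context that trusts that CA would also be needed.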
Patch file sent for more detailed logs on custom built apk
Logs attached! I will also make a proper issue on the android github and collate all these threads / msgs
logcat4.log
Outcome
What did you expect?
Phone to successfully register endpoint token with homeserver so that notifications are served successfully to the phone when the application isn't in focus via NTFY.
What happened instead?
No notification is ever received by the phone due to the failure to successfully register the pusher with synapse.
Your phone model
Pixel 6 & Pixel 2 XL
Operating system version
Android 13
Application version and app store
1.5.28-dev [40105280] (G-448374fc) feature/bma/certList 448374fc (both with and without Patch applied)
Homeserver
On-Premise Installer 02.01 (Believe that's Synapse version 1.72.0)
Will you send logs?
Yes
Are you willing to provide a PR?
Yes