celzero / rethink-app

DNS over HTTPS / DNS over Tor / DNSCrypt client, WireGuard proxifier, firewall, and connection tracker for Android.
https://rethinkfirewall.com/
Apache License 2.0

No WG connection when waking up from sleep unless i toggle 'Lockdown Mode' #1367

Closed nomisma-qt closed 6 months ago

nomisma-qt commented 7 months ago

Hello, after trying to manage without a smartphone the last 10 years or so, the bank demanded i use a banking app. I now finally have a Redmi Note 11 with crDroid flashed. It's afaik the latest LineageOS and Android 14 (i think).

Edit: and latest Rethink 0.5.5e from F-droid

Edit2: I have tested with DoT to a custom DNS server, Rethink Odoh, and Rethink RDNS, same issue with them all. In Wireguard settings i had to remove the ipv6 settings for DNS that were set when i set up using QR code, and replace with generic 1.1.1.1. This is a known bug afaik.

I have high hopes for Rethink, especially as it allows me to exclude the banking apps from being routed through my WG VPN.

I hope opening this issue here with so little information or logs is not a big issue, as getting logs uploaded to Desktop etc is a problem atm, and i suspect the error i'm having is pretty common, but i haven't found any answer yet searching here..

The thing is when i wake up the phone from sleep, for ex. i have a weather app, and i tap the refresh button, or open a browser, both routed through WG using Rethink, i get no connection, unless i go to Rethink settings, and toggle WG Lockdown mode off, and on again, or on, and off again.

Sometimes after that the browser says 'connection was interrupted, network changed' or something, and then the connection works again.

I'm having a hard time understanding what 'Lockdown mode' really does.. The explanation says:

"Selected apps will only be routed through this Wireguard VPN, regardless of whether the VPN is failing or connected"

As opposed to what, if lockdown mode is disabled, and the WG is "failing" (another oddity, more about that later), then what would the selected apps be routed through?

And what is the relationship between 'lockdown mode' and the native android 14 setting 'Always-on VPN' in Android 14 Settings - Network & Internet - VPN - Rethink - Settings?

What does the weird 'Failing', 'Starting', 'Active' and 'Idle' modes mean in WG? With Wireguard, there's either a handshake or not. And one can display a timer that counts the amount of seconds since last handshake. There is no other mode in WG. 'Failing' or 'Starting' makes no sense.

I have a few other big annoyances, but trying to keep this post only about the 'lockdown' issue, it's highly annoying. One thing that might be related, In Android 14 Apps options, should Rethink be 'Optimized' or 'Unrestricted' in Android 14 Settings - Apps - Rethink - App battery usage ?

I have tried both, and it doesn't seem to affect the issue..

Hope this post is not in the wrong forum.. I guess i could get some logs, but i still have issues connecting it to the desktop, and i'm old and have a hard time working on phone with small screens etc.. Regards.

luckygitt commented 7 months ago

My first question is why don't you want the banking app to use your Wireguard tunnel?

Anyhow, couldn't agree more with most of the rest of your post re. Wireguard states and Lockdown, Always on etc. - I am totally baffled too. When I set an active Wireguard tunnel I ALWAYS want all traffic to use it! Granted, you can "split-tunnel" and use more than one Wireguard config, but for me, all traffic has to go through a tunnel.

As an aside, why are all apps shown in the Wireguard setting - wouldn't it be better to only show Allowed apps?

nomisma-qt commented 7 months ago

> My first question is why don't you want the banking app to use your Wireguard tunnel?

I want my access to my domestic bank to come from my domestic IP registered to my name. Lately i have heard of local banks even blocking traffic from known VPN providers, but even if my bank didn't do that, i'd want my official banking business come from a official domestic ISP ip, that can link that IP to my name and address.

When you log into your local bank from your VPN, well that actually decreases the security you get from that VPN, because your bank will be logging all IP's related to your bank logins, and thus your VPN will be known to your local financial and police authorities easily.

> As an aside, why are all apps shown in the Wireguard setting - wouldn't it be better to only show Allowed apps?

That's one other annoyance for me, when i install a new app, it should by default be going through the tunnel, ie be selected. Now i have to wait for it to appear in the app selection list, that takes awhile, and include it, before i can start using it.

AND there is no way to list the unchecked apps! So to be sure i didn't forget to include some app i installed, i have to manually scroll through the entire list, looking for unchecked boxes. Very annoying.

Yet, i do love RethinkDNS, and i know it's version 0.x, and i have high hopes, i think version 1.0 will be great.

ignoramous commented 7 months ago

> In Wireguard settings i had to remove the ipv6 settings for DNS that were set when i set up using QR code, and replace with generic 1.1.1.1. This is a known bug afaik.

This isn't a known bug. What exactly is happening here? Do you mean, if WireGuard is imported using QR code, then DNS setting is always 1.1.1.1?

> The thing is when i wake up the phone from sleep ... i get no connection

Can you see if the workarounds discussed in #1368 (setting KeepAlive between 30s and 3600s for the Peer) work?

> As opposed to what, if lockdown mode is disabled, and the WG is "failing" (another oddity, more about that later), then what would the selected apps be routed through?

If WireGuard is not in Lockdown mode, apps will be routed through the underlying network (wifi or mobile) when WireGuard is disabled.

> And what is the relationship between 'lockdown mode' and the native android 14 setting 'Always-on VPN'

Why assume there's any relation at all? There isn't any.

> What does the weird 'Failing', 'Starting', 'Active' and 'Idle' modes mean in WG

They mean exactly that?

> 'Failing' or 'Starting' makes no sense.

You can switch to other languages from Configure -> Settings -> Change language

> And one can display a timer that counts the amount of seconds since last handshake

Handshakes always happen every 2mins. When they don't, the WireGuard connection to that Peer is probably broken, and may recover at a later time (or not). We keep track of this, but don't show it in the UI, currently. In the next Rethink version v055f, we'll show this.

> it's highly annoying

Good thing the app is free?

nomisma-qt commented 7 months ago

> > In Wireguard settings i had to remove the ipv6 settings for DNS that were set when i set up using QR code, and replace with generic 1.1.1.1. This is a known bug afaik.

> This isn't a known bug. What exactly is happening here? Do you mean, if WireGuard is imported using QR code, then DNS setting is always 1.1.1.1?

I run my own WG peer on a VPS (trailofbits/algo project.) It provides a QR code for setup. That QR code includes the default DNS server configured in the settings on the VPS server, with both ipv4 and ipv6 addresses. If i use that QR code to set up my android phone to use WG for example with the "WG Tunnel" app by Zane Schepke, found on F-Droid, then the VPN tunnel works fine.

In rethink, DNS does not work by default. In 'wg3' edit mode, where i edit the name, private key, public key, peer addresses and DNS servers, there are two addresses for the DNS, ipv4 and ipv6.

I have to remove the ipv6 address after the comma. This is a known issue.

So instead i set a standard 1.1.1.1 single address in that field, and then enabled oDoH / rdns etc separately, and that works (tested with dnsleaktest.com)
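The workaround described above (keeping only the IPv4 entry of a comma-separated DNS field) can be sketched in a few lines of Python. This is only an illustration, not how Rethink parses WireGuard configs; the function name `ipv4_dns_only` and the sample addresses are made up for the example.

```python
import ipaddress
from typing import List

def ipv4_dns_only(dns_field: str) -> List[str]:
    """Return only the IPv4 addresses from a comma-separated DNS field."""
    kept = []
    for entry in dns_field.split(","):
        entry = entry.strip()
        if not entry:
            continue
        try:
            addr = ipaddress.ip_address(entry)
        except ValueError:
            # Not a literal IP address (e.g. a hostname); skip it.
            continue
        if addr.version == 4:
            kept.append(str(addr))
    return kept

# Hypothetical DNS line as it might come out of a QR-code import.
print(ipv4_dns_only("172.16.0.1, fd00::1"))
```

The IPv6 entry is dropped and the IPv4 one is kept, which matches the manual edit of removing everything after the comma.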

> > The thing is when i wake up the phone from sleep ... i get no connection

> Can you see if the workarounds discussed in #1368 (setting KeepAlive between 30s and 3600s for the Peer) work?

Yes i will test, but would like to know how that affects sleep behaviour and battery life? When phone is sleeping, and some app like for ex. a weather app tries to update every 3 hours.. Would that require a keepalive.. But i'll test, also wait for new version.

Btw, i recently updated crDroid, it has a new kernel, and this issue might be a bit better, but still not sure.. Testing is slow because sometimes just double tapping the screen to see lock screen, and there tapping the weather refresh button, sometimes it works, sometimes not. But when it doesn't, there seems to be a 'Network change'. If i quickly unlock, and try to refresh a browser page while looking at the send receive arrows and network speed in the status bar.. it goes to all zero, then wakes up again, and browser page with error something like 'error network changed, connection was interrupted' ... will try to take screenshot if i see it again.

> > As opposed to what, if lockdown mode is disabled, and the WG is "failing" (another oddity, more about that later), then what would the selected apps be routed through?

> If WireGuard is not in Lockdown mode, apps will be routed through the underlying network (wifi or mobile) when WireGuard is disabled.

Ah ok, this explains it better. Then i would suggest changing the explanation from:

"regardless of whether the VPN is failing or connected"

to:

"regardless of whether the RethinkDNS is Started or Stopped"

or:

"regardless of whether the RethinkDNS is Enabled or Disabled"

> > And what is the relationship between 'lockdown mode' and the native android 14 setting 'Always-on VPN'

> Why assume there's any relation at all? There isn't any.

Well if native android 14 setting for Rethink 'Always-on VPN' is disabled and Rethink is NOT in 'Lockdown Mode', or IN 'lockdown mode', or any combination of above, how is traffic routed?

You understand this results in 4 possible combinations, and it's confusing.

> > What does the weird 'Failing', 'Starting', 'Active' and 'Idle' modes mean in WG

> They mean exactly that?

> > 'Failing' or 'Starting' makes no sense.

There is nothing wrong with my WG peer, it accepts traffic with the correct public key. How is that 'Failing'? Like an old car sputtering, just before it breaks down?

What is going on when it is 'Starting'? It makes no sense at all.

> You can switch to other languages from Configure -> Settings -> Change language

> > And one can display a timer that counts the amount of seconds since last handshake

> Handshakes always happen every 2mins. When they don't, the WireGuard connection to that Peer is probably broken, and may recover at a later time (or not). We keep track of this, but don't show it in the UI, currently. In the next Rethink version v055f, we'll show this.

That's good to know. How is that related to the KeepAlive setting? Does it mean the default KeepAlive is 120 seconds?

> > it's highly annoying

> Good thing the app is free?

Very good :) Don't get me wrong, i'm very thankful for your work, and hope i can help to improve Rethink in any way i can, with zero developer and coding skills, just a happy end user.

EDIT: As an additional comment, just like the topic says, when i wake up the phone from sleep, and there is no connection.. Then i go to Rethink, click 'Proxy', click 'wg3', and toggle 'Lockdown mode', which i have enabled, off, and then immediately on again, then the internet connection immediately works.

If i go back to a browser immediately after this, i sometimes get the 'connection interrupted, network change' thing, and a browser page reload then works.

ignoramous commented 7 months ago

> Yes i will test, but would like to know how that affects sleep behaviour and battery life?

I haven't profiled it, but I imagine it does affect battery. And so, you may want to experiment with the values and settle on the highest that works for your server / network setup. This shouldn't be needed in v055f, where we're adding an "auto-recovery" mode for seemingly dead WireGuard connections.
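For a rough feel of the trade-off when choosing a KeepAlive value, one can count how many keepalive packets an otherwise idle tunnel would send per hour. This is back-of-the-envelope arithmetic only (the helper name `keepalives_per_hour` is invented for the example) and it says nothing about actual battery drain, which depends on the radio and the OS.

```python
def keepalives_per_hour(interval_seconds: int) -> int:
    """Number of keepalive packets sent per hour on an idle tunnel."""
    if interval_seconds <= 0:
        raise ValueError("interval must be a positive number of seconds")
    return 3600 // interval_seconds

# Values picked from the 30s..3600s range suggested as a workaround.
for interval in (30, 300, 3600):
    print(f"{interval:>5}s -> {keepalives_per_hour(interval)} keepalives/hour")
```

A higher interval means fewer wake-ups, which is why settling on the highest value that still keeps the connection usable is the sensible direction to experiment in.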

"regardless of whether the RethinkDNS is Enabled or Disabled"

Thanks. Will consider changing it.

> You understand this results in 4 possible combinations, and it's confusing.

Android's Always-on and Lockdown VPN modes are documented here: https://support.google.com/work/android/answer/9213914?hl=en

Rethink's use of those terms for WireGuard is independent of those.

> How is that 'Failing'? Like an old car sputtering, just before it breaks down? What is going on when it is 'Starting'? It makes no sense at all.

Failing could happen for any number of reasons (connectivity loss, server down, connection down, protocol mismatch, handshake failures, configuration mismatch etc).

Starting is the client waiting for the remote Peer to acknowledge the connection.

> That's good to know. How is that related to the KeepAlive setting? Does it mean the default KeepAlive is 120 seconds?

No. Handshakes are not KeepAlives. If they were, there wouldn't be 2 names for it :D Handshakes must happen every 2mins (if not, connection between WireGuard Peers may or may not irrecoverably break down) and this is independent of how often the KeepAlives happen.
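As a rough illustration of why the handshake age is a useful signal independent of KeepAlive, here is a minimal Python sketch that labels a peer by the time since its last handshake. The 120s and 180s thresholds are the WireGuard protocol's rekey and reject timers; the function name and the labels are hypothetical and are not Rethink's actual status logic.

```python
from typing import Optional

REKEY_AFTER_TIME = 120   # seconds; WireGuard protocol timer for starting a new handshake
REJECT_AFTER_TIME = 180  # seconds; WireGuard protocol timer after which session keys are rejected

def handshake_health(seconds_since_handshake: Optional[float]) -> str:
    """Label a peer by the age of its most recent handshake (illustrative labels)."""
    if seconds_since_handshake is None:
        return "no handshake yet"
    if seconds_since_handshake <= REKEY_AFTER_TIME:
        return "fresh"
    if seconds_since_handshake <= REJECT_AFTER_TIME:
        return "rekey due"
    return "stale"

for age in (None, 45, 150, 600):
    print(age, "->", handshake_health(age))
```

One caveat: WireGuard starts a new handshake only when there is traffic to send, so an idle tunnel without KeepAlive can show an old handshake without being broken.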

> i can help to improve Rethink in any way i can, with zero developer and coding skills, just a happy end user.

Please, by all means keep feature requests and bug reports coming. For instance, this detailed issue you filed made us write that extra code we've been avoiding to implement "auto-recovery" for broken yet on-going WireGuard Peer connections.

> Then i go to Rethink, click 'Proxy', click 'wg3', and toggle 'Lockdown mode', which i have enabled, off, and then immediately on again, then the internet connection immediately works.

We're adding auto-recovery, so you'd not need this dance from the next version onwards (hopefully).

This is indeed strange. Setting / unsetting Lockdown for WireGuard shouldn't really change anything. I'll see what's going on. Can you also see if tapping on the Refresh icon at the top right-hand corner in Configure -> Proxy UI also makes it work?

> If i go back to a browser immediately after this, i sometimes get the 'connection interrupted, network change' thing, and a browser page reload then works.

Rethink already goes to great lengths to avoid such broken connections (due to its firewall, DNS, and SOCKS/HTTP proxies), but WireGuard is new (and is very different too) and so such bugs are expected, and hopefully we iron the annoying ones out as they get reported.

nomisma-qt commented 7 months ago

> This is indeed strange. Setting / unsetting Lockdown for WireGuard shouldn't really change anything. I'll see what's going on. Can you also see if tapping on the Refresh icon at the top right-hand corner in Configure -> Proxy UI also makes it work?

Yes, already tried, and that also makes it work.

nomisma-qt commented 7 months ago

I'd also add, that maybe the lockdown tapping, or the refresh button tapping isn't the issue at all, but instead after phone has been asleep, and one simply switches to, or starts RethinkDNS, that actually somehow brings Rethink back from being in background mode. That it's just bringing the Rethink screen up that 'wakes up' the connection again.

EDIT: wow. Actually just tried this theory, and yes.

After i unlock phone with fingerprint, immediately go to browser, and reload a page. No connection. Then i just from recent apps switch to Rethink, ie bring up RethinkDNS to main focus, and then immediately go back to browser, sure enough, the 'Network change was detected' and connection resumes.

screenshot chrome: ![Screenshot_20240417-155154_Vivaldi](https://github.com/celzero/rethink-app/assets/6612638/27b2116f-4e32-4fab-b13b-20a06756e9c4)

nomisma-qt commented 7 months ago

This might be related to how Android 14 handles Battery Usage. The behaviour above happens with Android settings - Battery - Battery Usage (view by apps) - ReThink = "Allow background usage" enabled, and within "Allow battery usage" settings, 'Optimized' enabled.

I will continue testing with "Allow background usage" disabled, OR with 'Unrestricted' Enabled etc..

Testing is slow because after i lock phone, i really have to wait an unknown time, before the behaviour repeats. If i just lock and unlock right away, the connection is still working.

ignoramous commented 7 months ago

> I will continue testing with "Allow background usage" disabled, OR with 'Unrestricted' Enabled etc..

Yeah, that makes sense. Android (in this case, GrapheneOS?) may be hibernating Rethink and so the connections are indeed going nowhere.

nomisma-qt commented 7 months ago

> > I will continue testing with "Allow background usage" disabled, OR with 'Unrestricted' Enabled etc..

> Yeah, that makes sense. Android (in this case, GrapheneOS?) may be hibernating Rethink and so the connections are indeed going nowhere.

This is crDroid 10 for Xiaomi Redmi Note 11

Preliminary testing suggests that disabling 'Allow background usage' in Android Settings - Battery usage - Rethink might be the issue. Ever since i disabled it, i've had a connection immediately on wakeup.

Although it is a bit counter intuitive, as one would think Rethink specifically needs to be able to work in background.

I'll continue testing, and new phone, so soon i'll be installing an email client, that's gonna have to work in background through WG, and alert on incoming mails.

Also as long as i had this phone/ROM, Rethink has always been on the top of that list, with high screen time, and high background time compared to other apps.

I don't know about battery usage since i really have no reference, i guess i could play with betterbatterystats or similar, but i'd rather keep all kinds of gimmicks to a minimum, and let Android 14 do its magic.

Will know for sure tomorrow if this is the culprit.

nomisma-qt commented 7 months ago

So.. pretty dramatic morning.. I have this little weather app on lock screen with a refresh button, i use to test the connection quickly, just by double-tapping the locked phone, and hitting refresh. With the issue at hand, that always failed, as there was no connection after unknown time of sleep.

And so after disabling 'Allow background usage' yesterday, all through the evening everything looked great, i tested regularly using the lock screen weather app, and it got updates fast. This morning i test the weather app, updates immediately, then unlock phone - browser, and reload my duckduckgo page with query 'what is my IP?'

And no VPN. Traffic comes from my ISP ip. I switch to Rethink from recent apps, and it's enabled, with the large STOP button. I go to configure - proxy and hit refresh upper right corner, it refreshes, back to browser, still no VPN ip. Didn't remember what it said on proxy status, i think it was 'failing' maybe..

But Rethink is active, not allowed in background mode, lockdown mode enabled, and still lets traffic through bypassing the proxy.

Hitting STOP, then START, and traffic is again proxied.. But then after about 5 mins locked, trying to refresh the duckduckgo page is slow.. like issue before..

But clearly a serious event, and all through yesterday evening when i simply tested with the weather lockscreen app, i don't know if proxy even worked.. didn't think to test since it never happened before..

nomisma-qt commented 7 months ago

I'm able to connect phone to desktop, and might be able to provide logs, if you provide instruction on how..

BTW Configure - settings - Enable on-device logging has been disabled through this test.. But i'd imagine devs would first like to replicate the issue on their own phones.... just ask for any details and i'll answer through the day

Basically this is crDroid 10.4. And Rethink v0.5.5e (from F-Droid), Always-on VPN enabled in Android settings - Network & Internet - VPN - Rethink.

In Rethink configure - settings, "auto-start on power-up" is enabled. Everything disabled in configure - network. configure - proxy - wireguard in advanced mode, lockdown enabled. browser and weather app selected in add/remove.

nomisma-qt commented 7 months ago

Further testing indicates original issue is caused by 'Allow in background + Optimized' setting (the default in Android 14). Been running with 'Allow in background + Unrestricted', and in that mode i don't have to bring Rethink to front from recent apps to get a connection, but it still takes many seconds for a connection to be established when coming out of sleep.

Wireguard is so easy on CPU that it should not be a problem keeping a tunnel alive in background compared to normal TCP/UDP traffic.

But again, disabling 'Allow in background' resulted in a catastrophic data leak where after overnight sleep, data just ignored the proxy tunnel and went out cleartext from ISP gateway.

These issues seem relevant to latest android 14 / latest lineageOS. Understandably it might be difficult to optimize Rethink for earlier android versions as well as 14..

ignoramous commented 7 months ago

> disabling 'Allow in background' resulted in a catastrophic data leak where after overnight sleep, data just ignored the proxy tunnel and went out cleartext from ISP gateway

Is Rethink leaking connections or the OS?

LineageOS once had a history of mucking with VPN APIs, iirc.

> Further testing indicates original issue is caused by 'Allow in background + Optimized' setting (the default in Android 14)

Thanks. We'll add a dialog box that warns users about this in v055f.

nomisma-qt commented 7 months ago

> > disabling 'Allow in background' resulted in a catastrophic data leak where after overnight sleep, data just ignored the proxy tunnel and went out cleartext from ISP gateway

> Is Rethink leaking connections or the OS?

I don't know, all i know is reloading the duckduckgo page with query 'what is my ip' (which shows your IP) showed that traffic was coming from ISP ip. I had to stop and start Rethink to get it to route the packets through WG again, even as it was running with lockdown enabled.

I should start logging more, but pretty busy, and waiting for v055f..

> LineageOS once had a history of mucking with VPN APIs, iirc.

> > Further testing indicates original issue is caused by 'Allow in background + Optimized' setting (the default in Android 14)

> Thanks. We'll add a dialog box that warns users about this in v055f.

I tested a GPS tracking app (Geo Tracker), that when installed noticed it was set to 'Allow in background + Optimized', and prompted user to change that to 'Allow in background + Unrestricted'

So that would indicate there is a way to code your app to check this at first run, and prompt user. Of course it would be great to get Rethink to run in background using as little battery as possible. Right now it's constantly at the top of list of apps using most battery in Android 14. Hope you will find more ways to optimize it in the future.

Still with WG, as the packets are sent to the peer with the correct key, the peer just accepts and forwards them, so there should be extremely little delay vs. a non-tunneled connection. When waking up there is still several seconds before traffic starts moving with current version..

nomisma-qt commented 7 months ago

on my OPNsense router there is a 'persistent keepalive: every 25 seconds' default setting for the WG peer, and there is a timer since last handshake..

But i'm not sure if this is really needed on a phone where battery life matters. Since WG has no sessions, and just pushes UDP packets around.. But i'm no expert..

edit: i'm sure you're aware of the MSS size issues with WG though.. my speeds are pretty erratic, and i don't know, but i do know that in OPNsense it's important to set the correct MSS (1420) for the WG interface, otherwise connection will be slow, tons of retransmissions, some webpages load, some don't etc etc..

But i'm sure devs know this.. just thought i'd mention

edit2: once the next version is released and i have more time, i might start testing the connection with iperf3 or something.. I don't know what you guys are testing WG with, but if you're just using a VPN provider that supports WG, then i'd suggest running your own WG server on a VPS.. trailofbits/algo project is excellent for that..

I could run wireshark on my peer as well, but i don't have the skills or knowledge to know what to look for.. If you need such testing i can assist as much as is possible here.. I have a unmetered 50 mbps 4G connection on my phone, and root access to the peer, that is running on ubuntu 22 something

ignoramous commented 7 months ago

> Right now it's constantly at the top of list of apps using most battery in Android 14.

Shouldn't exceed 15%. On my Androids, Rethink's around 5% to 10%. How much is it on your device?

> i might start testing the connection with iperf3 or something

We haven't yet enabled/imported the latest WireGuard bandwidth-related optimizations that Tailscale contributed (ex) (it is in our pipeline to do so), but it isn't trivial.

> it's important to set the correct MSS (1420) for the WG interface

MTU? Yeah, we attempt to auto-correct MTU to an extent; and have added some more code around it in v055f.

> WG has no sessions

There are sessions, but WG can "roam" about without requiring a "reconnection", yes.

> So that would indicate there is a way to code your app to check this at first run, and prompt user.

We'll probably prompt every time the user is on Rethink's homescreen (when it is actively running as a VPN).

nomisma-qt commented 7 months ago

> > Right now it's constantly at the top of list of apps using most battery in Android 14.

> Shouldn't exceed 15%. On my Androids, Rethink's around 5% to 10%. How much is it on your device?

I just charged my phone, so the counter was reset, and to be honest, been playing around with rethink a lot, so i don't have a reliable average yet..

> > i might start testing the connection with iperf3 or something

> We haven't yet enabled/imported the latest WireGuard bandwidth-related optimizations that Tailscale contributed (ex) (it is in our pipeline to do so), but it isn't trivial.

I actually spent the last 60 mins or so running iperf3 between WG peer and my phone, testing with and without tunneling termux through Rethink WG, and i saw no problems. Speed and retransmissions were roughly the same, so no issues.

> > it's important to set the correct MSS (1420) for the WG interface

> MTU? Yeah, we attempt to auto-correct MTU to an extent; and have added some more code around it in v055f.

Well, on my WG peer, MTU on the wg interface is set to 1420, but on my OPNsense router (client) MSS is set to 1420, MTU is default 1500. I don't think MTU should be 1420 on both interfaces, only the peer, and then the client should 'clamp MSS' or something..

(screenshot: 2024-04-22 04 30 52 192 168 1 1 b2c6138cd189)

I don't remember why, as it was a long time since i configured this, according to trailofbits/algo documentation.. but i don't think this is an issue, as iperf3 tests went well, and there would be much bigger problems if the MSS/MTU setting in Rethink were wrong..
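For reference, the arithmetic behind the 1420 figure and the MTU / MSS distinction discussed above can be sketched as follows. This assumes a 1500-byte underlying link, no IP or TCP options, and the usual WireGuard data-packet overhead; the helper names are made up for the example, and nothing here reflects how Rethink or OPNsense compute these values internally.

```python
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8
TCP_HEADER = 20
WG_DATA_OVERHEAD = 32  # 16-byte data message header + 16-byte authentication tag

def tunnel_mtu(link_mtu: int, outer_ipv6: bool = True) -> int:
    """Largest inner packet that fits in one outer packet on the link."""
    outer_ip = IPV6_HEADER if outer_ipv6 else IPV4_HEADER
    return link_mtu - outer_ip - UDP_HEADER - WG_DATA_OVERHEAD

def tcp_mss(mtu: int, inner_ipv6: bool = False) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    inner_ip = IPV6_HEADER if inner_ipv6 else IPV4_HEADER
    return mtu - inner_ip - TCP_HEADER

print(tunnel_mtu(1500))                    # worst case: outer packet is IPv6
print(tunnel_mtu(1500, outer_ipv6=False))  # outer packet is IPv4
print(tcp_mss(1420))                       # TCP over IPv4 inside the tunnel
print(tcp_mss(1420, inner_ipv6=True))      # TCP over IPv6 inside the tunnel
```

Under these assumptions a 1420 tunnel MTU is what remains of 1500 bytes when the outer packet is IPv6, and the TCP payload that fits inside such a tunnel is smaller again (1380 bytes over IPv4), so MTU and MSS are related but are not the same number.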

> > WG has no sessions

> There are sessions, but WG can "roam" about without requiring a "reconnection", yes.

> > So that would indicate there is a way to code your app to check this at first run, and prompt user.

> We'll probably prompt every time the user is on Rethink's homescreen (when it is actively running as a VPN).

When i implemented trailofbits/algo, i think tailscale wasn't even a thing yet, i need to start reading up on it, maybe someday implement it on my VPS, as it's the big project now..

Anyways to me basic iperf3 tests looked just fine.. thanks for replies, and excited to test v055f

ignoramous commented 6 months ago

v055i has been out on F-Droid and Website for a while now. Please test and feel free to reopen in case any of the issues discussed here haven't been resolved. Thanks.