mozilla-mobile / mozilla-vpn-client

A fast, secure and easy to use VPN. Built by the makers of Firefox.
https://vpn.mozilla.org

MMS Messaging Broken on iPhone #4842

Closed todddevice closed 1 year ago

todddevice commented 1 year ago

iOS 16 on 2.10 can't send or receive MMS messages when VPN is on. Message send/receive failure seems to occur whether or not phone is connected to WiFi, but combined with #4823, it is difficult to test further. Non-MMS messaging works without issue.

┆Issue is synchronized with this Jira Bug

data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

I was able to replicate this over the weekend. SMS works fine, but MMS (sending a photo, or a group SMS message) fails.

Tested on an iPhone SE (2nd gen) running iOS 16.1, while on WiFi. Cell carrier is Mint Mobile.

data-sync-user commented 1 year ago

➤ Sarah Bird commented:

Is this still a problem on 2.11.1?

data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

Just confirmed: yes, this is still a problem on 2.11.1.

data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

Short version: This has something to do w/ IPv6 DNS. If we only allow DNS via IPv4, this works just fine. I’m going to talk w/ others about what this means.

Dump of my day follows:

- [ ] They’re packet based. Will the IPv6 fix also fix this?
    - [ ] Tried latest main branch, and nope.
- [ ] Also confirmed: This bug exists on both Wifi and cellular.
- [ ] Baku suggested looking at `includeAllNetworks`. Turning it to `false`, and testing.
    - [ ] Still doesn’t work.
- [ ] Quick google search
    - [ ] “Wireguard-apple” “mms” -> one result, useless
    - [ ] “Vpn” “mms” “iOS”
        - [ ] Okay, this seems like yet another iOS 16 issue, identical reports on ExpressVPN: https://discussions.apple.com/thread/254198717 
        - [ ] Older one, and not 100% sure I agree b/c I believe (but not confident) this worked fine on iOS 15: https://discussions.apple.com/thread/7427204 
        - [ ] This basically just confirms that it’s an iOS 16 issue: https://www.reddit.com/r/ios/comments/y6g466/ios_16_has_big_issues_when_a_vpn_is_enabled/ 
        - [ ] This one too: https://www.reddit.com/r/Express_VPN/comments/y9eyvw/express_vpn_preventing_my_iphone_from_sending/ 
- [ ] From that apple forum thread above, look at the bottom - what is that ExpressVPN option that allowed the user to solve it
    - [ ] Ok, so ExpressVPN has an option that says “if VPN drops, kill all network traffic”. And disabling that allows MMS to work for their users.
- [ ] Let’s look at some other VPN’s forks of `wireguard-apple`, to see if they’ve had a similar thing they’ve worked around
    - [ ] Express VPN (b/c above) - not open source
    - [ ] Mullvad - nothing in there
        - [ ] But their app hasn’t been updated in 7 months, according to App Store? What?
    - [ ] ProtonVPN - nothing in there, but they also don’t seem to have fixed the big iOS 16 bug on their fork, which seems weird
        - [ ] But according to their version history, they’ve fixed it. Maybe their public repo isn’t up to date?
    - [ ] NordVPN - not open source
- [ ] So does Mullvad not have the problem (and thus it’s in our code and not wireguard), or do they?
    - [ ] With these versions mentioned above, not sure this will be helpful. Will come back to this if needed.
- [ ] Let’s look at iOS logs
    - [ ] Took logs of successful and unsuccessful group message (MMS) send
    - [ ] Searching for “MMS http response”, you can see the success/failure.
    - [ ] It fails with these lines:
        - [ ] error 11:35:50.840573-0800    CommCenter  Task <6A799FF4-053A-4119-BC08-7B5460E7A452>.<1> finished with error [-1001] Error Domain=NSURLErrorDomain Code=-1001 "The request timed out." UserInfo={_kCFStreamErrorCodeKey=-2102, NSUnderlyingError=0xa5ce66d00 {Error Domain=kCFErrorDomainCFNetwork Code=-1001 UserInfo={_kCFStreamErrorCodeKey=-2102, _kCFStreamErrorDomainKey=4}}, _NSURLErrorFailingURLSessionTaskErrorKey=<private>, _NSURLErrorRelatedURLSessionTaskErrorKey=<private>, NSLocalizedDescription=The request timed out., NSErrorFailingURLStringKey=<private>, NSErrorFailingURLKey=<private>, _kCFStreamErrorDomainKey=4}
        - [ ] default   11:35:50.842087-0800    CommCenter  #I Http response received...
        - [ ] default   11:35:50.842628-0800    CommCenter  #I <private> derived response : 0
        - [ ] default   11:35:50.843271-0800    CommCenter  #I MMS http response:
        - [ ] default   11:35:50.843900-0800    CommCenter  #I <private>
        - [ ] default   11:35:50.844512-0800    CommCenter  #I <private> http response: -1 NSUrlError: -1001(<private>)
    - [ ] So should search before these
    - [ ] The logs don’t have much from our network extension, frustratingly
    - [ ] Is it because these packets shouldn’t go through the VPN? Probably. This was our hunch.
        - [ ] Evidence towards this (which was the hunch all along): In the failure logs, we see this: `sending to <IPv6:BBjIoxpi> failed: [65: No route to host]`
        - [ ] So the question then becomes: how do we recognize MMS packets/routes, and not send them through the VPN
        - [ ] WAIT: Is it a packet issue, or DNS issue?
            - [ ] Try: No VPN, but diff DNS
                - [ ] Had to turn off cellular as well, but unless it leaks somehow in this scenario: Setup wifi manually to use 1.1.1.1, turned off vpn, had cellular off - message went through
    - [ ] Deep in Apple API documentation, nothing jumps out
    - [ ] Debug network extension - nope, that’s impossible, need to use logs
    - [ ] Looking at logs again, DNS only has IPv4 avail normally, and both IPv4 and IPv6 available when on our VPN. Let’s cut to just IPv4 DNS in the VPN, and see what happens
        - [ ] AND THAT DOES IT
        - [ ] So now the question is: Why? And what do we lose if we cut this off? Going to talk w/ rest of team.
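
The log triage described above can be sketched as a quick marker count. This is a minimal sketch, assuming the device logs have been exported to a plain-text file (the notes do not say how the logs were captured); `mms_log_markers` and the file name in the usage line are hypothetical. The three markers are strings quoted in the notes.

```shell
# Print how often each MMS failure marker from the notes appears in a log export.
# Output is one "count<TAB>marker" line per marker.
mms_log_markers() {
  log_file=$1
  for marker in 'MMS http response' 'NSUrlError: -1001' 'No route to host'; do
    # -c counts matching lines, -F treats the marker as a literal string.
    count=$(grep -c -F -- "$marker" "$log_file")
    printf '%s\t%s\n' "$count" "$marker"
  done
}
```

Usage would look like `mms_log_markers failed-send.log`, run once on a successful capture and once on a failing one so the counts can be compared.
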
data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

I didn’t talk to others; instead I dug in a bit deeper. The problem is with the `networkSettings.mtu = NSNumber(value: 1280)` line in PacketTunnelSettingsGenerator. If you change that number to something lower, it works. (It worked for a bit for me at 0, then stopped working. Then it worked when I set it to 80. I need to dig in here to see what exactly the trick is.)

To be clear: This is another problem within the wireguard-apple library, which increasingly seems to be abandoned.

data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

From end of yesterday: It’s not MTU all of the time…

data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

I feel fairly confident that this issue doesn’t come from our side, given the inconsistency. That said: It’s possible we could fix it on our side, if we knew more about it.

Tomorrow’s first things:

data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

I think I’m getting really, really close here.

Here is my working theory:

Where I’m at: A quick attempt at excluding this range on our VPN failed for me. Mullvad VPN doesn’t seem to have this issue, and they say they exclude this relevant range ( https://github.com/mullvad/mullvadvpn-app/blob/master/docs/security.md ) if “exclude local network” is turned on. However, that option doesn’t exist on iOS, so are they just excluding it by default?

Next step: See if I can get it to work on our VPN by successfully excluding this address range. And/or continue looking at Mullvad’s iOS code to see if/where they’re excluding this range on iOS, so I confirm this hunch.

Full notes from today:

    - [ ] Got old iOS 15 device from my partner. Screen is very broken, but working enough for our purposes. Got Mint mobile 7 day test plan.
    - [ ] Regular test
        - [ ] Mint mobile, Wifi, iOS 15.3.1, VPN 2.12 (current App Store)
            - [ ] Seattle
            - [ ] Denver
            - [ ] NYC 
            - [ ] Seattle again
            - [ ] Seattle, flick on and off
            - [ ] ALL THESE WORK
        - [ ] Mint mobile, Wifi, iOS 16.2, VPN 2.12 (current App Store)
            - [ ] Denver - NOPE
            - [ ] Seattle  - NOPE
            - [ ] NYC - NOPE
        - [ ] Trying iOS 15 device again, to confirm it’s probably 16:
            - [ ] Denver, Seattle, NYC all send
            - [ ] But even more worrisome, the iOS 16 device doesn’t receive when it’s on VPN - they deliver after VPN turns off
        - [ ] Trying iOS 16 device w/ Adblock DNS
            - [ ] Still fails
    - [ ] OK, so this feels like an iOS 16 issue - could they have changed MMS DNS in iOS 16???
        - [ ] Can I turn VPN off, but mess w/ DNS settings?
        - [ ] Trying to be on wifi, but putting in the DNS as Adblock (100.64.0.1)
            - [ ] Umm, internet is dead on my phone. WHY???
                - [ ] Turned off “private wifi address” and “limit IP address tracking”. Doesn’t help.
                - [ ] Turn off custom VPN, and it works. Hmmm…..
                - [ ] Oh of course, that’s not the actual IP address, that’s just how we hit it on their servers
            - [ ] I’m going to try resolving it from the command line
                - [ ] Can use nslookup
                - [ ] Working on resolving a sample domain using my default DNS, and 1.1.1.1. But not 194.242.2.2 (mullvad’s: https://mullvad.net/en/help/dns-over-https-and-dns-over-tls/)
                    - [ ] Does nslookup handle DoH? No, but can use dig
                    - [ ] We get very inconsistent results based on DNS resolver we use:
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 1.1.1.1
                        - [ ] mms.msg.eng.t-mobile.com. 5 IN A 10.188.239.144
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 1.1.1.1
                        - [ ] mms.msg.eng.t-mobile.com. 4 IN A 10.188.239.144
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 1.1.1.1
                        - [ ] mms.msg.eng.t-mobile.com. 3 IN A 10.188.239.144
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 1.1.1.1    
                        - [ ] mms.msg.eng.t-mobile.com. 5 IN A 10.175.85.144
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 1.1.1.1
                        - [ ] mms.msg.eng.t-mobile.com. 4 IN A 10.175.85.144
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 194.242.2.2
                        - [ ] mms.msg.eng.t-mobile.com. 5 IN A 10.168.127.20
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 194.242.2.2
                        - [ ] mms.msg.eng.t-mobile.com. 5 IN A 10.168.127.16
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 194.242.2.2
                        - [ ] mms.msg.eng.t-mobile.com. 2 IN A 10.168.127.16
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 194.242.2.2
                        - [ ] mms.msg.eng.t-mobile.com. 5 IN A 10.188.239.145
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 194.242.2.3
                        - [ ] mms.msg.eng.t-mobile.com. 4 IN A 10.188.239.145
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 194.242.2.3
                        - [ ] mms.msg.eng.t-mobile.com. 2 IN A 10.188.239.145
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 194.242.2.3
                        - [ ] mms.msg.eng.t-mobile.com. 2 IN A 10.188.239.145
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 194.242.2.3
                        - [ ] mms.msg.eng.t-mobile.com. 1 IN A 10.188.239.145
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 194.242.2.3
                        - [ ] mms.msg.eng.t-mobile.com. 5 IN A 10.175.85.143
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 8.8.8.8
                        - [ ] mms.msg.eng.t-mobile.com. 2 IN A 10.168.127.15
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 8.8.8.8
                        - [ ] mms.msg.eng.t-mobile.com. 1 IN A 10.168.127.15
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 8.8.8.8
                        - [ ] mms.msg.eng.t-mobile.com. 5 IN A 10.175.85.156
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 8.8.8.8
                        - [ ] mms.msg.eng.t-mobile.com. 4 IN A 10.175.85.156
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 8.8.8.8
                        - [ ] mms.msg.eng.t-mobile.com. 3 IN A 10.175.85.156
                        - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % dig +noall +answer +multiline mms.msg.eng.t-mobile.com 8.8.8.8
                        - [ ] mms.msg.eng.t-mobile.com. 5 IN A 10.168.127.24
            - [ ] Trying on my device
                - [ ] not getting 194.242.2 or .3 to work as custom DNS on wifi
                - [ ] Can reverse my way into it - can I send w/ VPN on, using VPN DNS 1.1.1.1 on iOS 16?
                    - [ ] OK, VPN on its normal DNS, and it fails sending.
                    - [ ] OK, VPN on 1.1.1.1 custom DNS in NYC, and it successfully sends
                        - [ ] Same thing, but Seattle - it fails
                        - [ ] Same thing, but in Los Angeles - it successfully sends
                        - [ ] Same thing, in San Jose - it sends
                        - [ ] Seattle again - it sends
                    - [ ] Turning back to normal DNS, on iOS 16.2
                        - [ ] In seattle - failed
                        - [ ] Los Angeles - failed
                        - [ ] NYC - failed
                        - [ ] San Jose - it successfully sent 
                        - [ ] Seattle again - it successfully sent
                    - [ ] THEN TRY WITH 8.8.8.8
                        - [ ] Seattle - it failed
                        - [ ] San Jose - it failed
                        - [ ] NYC - it failed
                        - [ ] Los Angeles - it worked
                        - [ ] Seattle - it worked
                    - [ ] THEN TRY WITH 1.1.1.1 - ALL FOUR OF THESE WORKED GREAT
                        - [ ] Seattle
                        - [ ] San Jose
                        - [ ] NYC
                        - [ ] Los Angeles
            - [ ] But could iOS route all MMS to cellular?  What happens when I turn off cellular? That turns off cell data - can’t turn off SMS/phone it seems, so MMS is unknown.
            - [ ] DECLARATION TIME: With 95% confidence, this is a DNS issue. 
                - [ ] Many DNS servers don’t have proper resolution for this domain.
                - [ ] Get in touch w/ them and ask why.
            - [ ] Looking at MMS in logs - some key lines in both
                - [ ] DataConnectionAgentInterfaceObserver doUpdateInterface:plainParameters:domainParameters: 
                    - [ ] looks like it’s sending via cellular, using interface pdp_ip2
                - [ ] default   11:42:50.286809-0800    CommCenter  #I MMS Sending MMS PDU of length 540 URL:-<private>
                - [ ] default   11:42:50.286854-0800    CommCenter  #I MMS UAProf = http://iphonemms.apple.com/iphone/uaprof-2MB.rdf
                - [ ] MMS http response:
                - [ ] MMS send
                - [ ]  computeMTU_sync: adjusted for XLAT464
            - [ ] What is default DNS on mullvad vpn?
            - [ ] But also, this is new in iOS 16. Did iOS not formerly use the VPN’s DNS (or the VPN at all) for MMS?
            - [ ] Oh no, new information I realized
                - [ ] Wait… these IPs are all in the LAN range (https://en.wikipedia.org/wiki/Private_network), so will only matter when you’re on t-mobile’s network.
                - [ ] Hey, T-Mobile is a Tier 1 network (https://en.wikipedia.org/wiki/Tier_1_network), so that could be causing weird behavior in this situation, possibly
                - [ ] But why would the DNS server determine which network we’re on??? Is there something else going on here?
                    - [ ] OK, just messed w/ 1.1.1.1 as DNS again. And I got a failure when on Phoenix. So it’s not consistent.
            - [ ] OK, here is my working theory: 
                - [ ] T-Mobile’s MMS server is mms.msg.eng.t-mobile.com. DNS always resolves it to an internal network address. A few different IP addresses, but all in the 10.x.x.x range, which is reserved for internal networks.
                - [ ] MMS is always sent on cellular.
                - [ ] Our VPN is throwing these packets in a tunnel before throwing it onto cellular, and so it’s not on the T-mobile cell network.
                    - [ ] [this doesn’t make much sense - can’t the VPN only be active on wifi OR on cellular?]
                - [ ] Sometimes, we end up on networks that t-mobile controls, and it all works fine. Sometimes, we don’t. Hence the inconsistency on whether it works.
                - [ ] Prior to iOS 16, MMS wasn’t sent via the data network - it was always sent via cellular, and not through the VPN
            - [ ] Looking at logs, given this theory
                - [ ] From successful run
                    - [ ] default   11:42:49.793837-0800    CommCenter  #I -[DataConnectionAgentInterfaceObserver doUpdateInterface:plainParameters:domainParameters:]: NWAgent F57F425E-B5B4-4636-8E47-68A42FA55EA0 DataConnectionAgentInterfaceObserver path domain = Cellular, type = MMS: updating interface to pdp_ip2 (pdp_ip2)
                    - [ ] default   11:42:49.793962-0800    CommCenter  #I -[DataConnectionAgentInterfaceObserver doUpdateInterface:plainParameters:domainParameters:]: NWAgent F57F425E-B5B4-4636-8E47-68A42FA55EA0 DataConnectionAgentInterfaceObserver agent domain = Cellular, type = MMS: updating interface to pdp_ip2 (pdp_ip2)
                    - [ ] default   11:42:50.295236-0800    CommCenter  [C148 4C9BA8FC-6407-4AE0-A48A-BEA66B6BF60E Hostname#24af403e:80 tcp, bundle id: com.apple.datausage.telephony.mms, url hash: e069e359, definite, attribution: developer, context: com.apple.CFNetwork.NSURLSession.{5A2824D7-2716-4784-98D1-E22A0FFCFD3F}{(null)}{Y}{2} (private), proc: 01DF72A1-ACFA-33E9-9852-031CF0B3760D, effective proc: 569F2700-CAF9-483D-B19D-7AF406DCE15E, no proxy, account id: 0c8b3050, required netagent domains: Cellular, required netagent types: MMS] start
                    - [ ] NonCellular
                - [ ] Filtering on `DNS service (`, the successful run is IPv6 only at lower levels, when unsuccessful is both IPv4 and IPv6 - could this somehow be part of the issue?
            - [ ] Next research steps
                - [ ] Does MMS always go over cellular data, never wifi?
                    - [ ] According to sketchy internet sources, yes:
                        - [ ] https://piunikaweb.com/2020/10/24/heres-why-sending-receiving-mms-on-wifi-doesnt-work-for-many-android-users-in-us-as-well-as-some-workarounds/
                        - [ ] https://www.quora.com/Can-I-send-MMS-messages-if-wifi-and-mobile-data-is-on
                - [ ] 2. look at logs of outgoing messages on iOS 15 - that would prove if it uses VPN or not!!!!
                    - [ ] Very different log style than iOS 16, very limited info
            - [ ] Next tests
                - [ ] Can we exclude the IP range / local networks? And would it consistently work then?
                    - [ ] Trying excludeLocalNetworks
                        - [ ] Well, that failed. Bummer. 
                    - [ ] Would be good to be able to get more granular
                        - [ ] Excluding the 10.x.x.x range in our VPN. Still fails on VPN and wifi
                        - [ ] What about VPN and cellular?
                            - [ ] Tried one, It failed
                - [ ] Why does this never happen on mullvad? Are they excluding things?
                    - [ ] 3 out of 3 (in different US locations) went through on mullvad
                    - [ ] So what is their setup like?
                        - [ ] Well, this could be it - https://github.com/mullvad/mullvadvpn-app/blob/master/docs/security.md
                        - [ ] But I want to find where in their code this happens
                - [ ] What happens if we go IPv6 only in our tunnel?
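
One caveat about the `dig` runs recorded above: as far as I know, `dig` only sends a query to a specific resolver when the address is prefixed with `@`; a bare `1.1.1.1` argument is treated as another name to look up. If so, those lookups may all have gone to the machine's default resolver, and the differences between resolvers would need re-checking. A minimal sketch of a per-resolver comparison, with hypothetical helper names:

```shell
# Query one specific resolver ($1) for the A records of a name ($2).
# The "@" prefix is what selects the resolver in dig.
resolve_via() {
  dig +short "@$1" "$2" A
}

# Print the answers from several resolvers for the same name, one line each.
compare_resolvers() {
  name=$1
  shift
  for resolver in "$@"; do
    printf '%s: ' "$resolver"
    # Join multiple A records onto one line for easier comparison.
    resolve_via "$resolver" "$name" | tr '\n' ' '
    printf '\n'
  done
}
```

For example, `compare_resolvers mms.msg.eng.t-mobile.com 1.1.1.1 8.8.8.8 194.242.2.2` would print one line of answers per resolver. Because the records in the notes have TTLs of a few seconds, several runs per resolver are probably needed before drawing conclusions.
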
data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

Short version from Wednesday: I can see exactly where DNS is failing in the iOS system logs. I’m not 100% sure why yet, though. And it does seem to be a difference between iOS 15 and iOS 16, so this is likely new.

Never posted yesterday’s notes:

    - [ ] Looking at logs in thread view for a successful send
    - [ ] On VPN, got a successful one
    - [ ] 5 mms connections were made
        - [ ] C216 is a failure
        - [ ] Which one was good? 
    - [ ] OK, so connections in the logs come in the style of “Connection 148” or “C148”, which have children. C148.1 is a DNS lookup, and C148.1.1 is presumably the actual data request.
        - [ ] Successful ones have C148 and C148.1 with something like `Hostname#24af403e:80` in the log line, then C148.1.1, which has `IPv6#facf4a8d`. I’m presuming that Hostname is our VPN, and IPv6 is possibly the cell network.
        - [ ] Failing ones have C148 and C148.1, and there is an “event: resolver:start_dns” as we’d expect, but there is not an “event: resolver:receive_dns” like we have in our successful ones.
        - [ ] When failing DNS, we get interesting DNS logs
            - [ ] 14:31:55.653131-0800  dnssd_server    com.apple.mdns  [R26595] getaddrinfo start -- flags: 0xC000D000, ifindex: 5, protocols: 0, hostname: <mask.hash: 'KDfbuH1u/7Q5YDuVt91Lvw=='>, options: 0x8 {use-failover}, client pid: 94 (CommCenter), delegator uuid: 569F2700-CAF9-483D-B19D-7AF406DCE15E    mDNSResponder   0xf5ce1 143 mDNSResponder
            - [ ] 14:31:55.653196-0800  Default com.apple.mDNSResponder [R26595->Q25825] Question for <mask.hash: 'H/O5SjtxXCvArkzb0BSJFA=='> (HTTPS) assigned DNS service 2676 mDNSResponder   0xf5ce1 143 mDNSResponder
            - [ ] 14:31:55.653249-0800  Default com.apple.mDNSResponder [R26595->Q19899] Question for <mask.hash: 'H/O5SjtxXCvArkzb0BSJFA=='> (AAAA) assigned DNS service 2676  mDNSResponder   0xf5ce1 143 mDNSResponder
            - [ ] 14:31:55.653299-0800  Default com.apple.mDNSResponder [R26595->Q42484] Question for <mask.hash: 'H/O5SjtxXCvArkzb0BSJFA=='> (Addr) assigned DNS service 2676  mDNSResponder   0xf5ce1 143 mDNSResponder
            - [ ] 14:31:55.653348-0800  Default com.apple.mDNSResponder [Q42484] ShouldSuppressUnicastQuery: Query suppressed for <mask.hash: 'H/O5SjtxXCvArkzb0BSJFA=='> Addr (A records are unusable) mDNSResponder   0xf5ce1 143 mDNSResponder
            - [ ] 14:31:55.653409-0800  Default com.apple.mDNSResponder [R26595->Q42484] GenerateNegativeResponse: Generating negative response for question <mask.hash: 'H/O5SjtxXCvArkzb0BSJFA=='> (Addr) mDNSResponder   0x99e   143 mDNSResponder
            - [ ] 14:31:55.653707-0800  resolver    com.apple.mdns  sending to <IPv6:BBMAalbc> failed: [65: No route to host]   mDNSResponder   0xf5ce1 143 mDNSResponder
            - [ ] 14:31:55.653788-0800  resolver    com.apple.mdns  [Q25825] Sent 39-byte query #1 to <IPv6:BBMAalbc> over UDP via pdp_ip2/5 -- id: 0x9357 (37719), flags: 0x0100 (Q/Query, RD, NoError), counts: 1/0/0/0, BBjBDeuK IN HTTPS?   mDNSResponder   0xf5ce1 143 mDNSResponder
            - [ ] 14:31:55.653949-0800  resolver    com.apple.mdns  sending to <IPv6:BBMAalbc> failed: [65: No route to host]   mDNSResponder   0xf5ce1 143 mDNSResponder
            - [ ] 14:31:55.654026-0800  resolver    com.apple.mdns  [Q19899] Sent 39-byte query #1 to <IPv6:BBMAalbc> over UDP via pdp_ip2/5 -- id: 0x1CD0 (7376), flags: 0x0100 (Q/Query, RD, NoError), counts: 1/0/0/0, BBjBDeuK IN AAAA? mDNSResponder   0xf5ce1 143 mDNSResponder
            - [ ] 14:31:55.709527-0800  resolver    com.apple.mdns  [Q6335] Received acceptable 87-byte response from <IPv6:BBMAalbc> over UDP via pdp_ip2/5 -- id: 0x64FD (25853), flags: 0x8180 (R/Query, RD, RA, NoError), counts: 1/2/0/0, ipv4only.arpa. IN AAAA?, 1604 IN AAAA BBivRoOA, 1604 IN AAAA BBUTWzib    mDNSResponder   0xf5ce1 143 mDNSResponder
            - [ ] 14:31:55.709667-0800  Default com.apple.mDNSResponder [Q6335] Handling concluded querier: ipv4only.arpa. AAAA IN  mDNSResponder   0xf5e04 143 mDNSResponder
            - [ ] 14:31:55.709789-0800  Default com.apple.mDNSResponder [R26594->Q6335] DNSServiceGetAddrInfo(<mask.hash: 'Pp9pweoPsCQb2qMeO3m+Lg=='>, AAAA) RESULT add interface 0: (mortal, DNSSEC Indeterminate)<mask.hash: 'dM1lzvTm1zyBMTWAhAHkQg=='>  mDNSResponder   0xf5e04 143 mDNSResponder
        - [ ] More DNS stuff from deep logs
            - [ ] I see things in logs like `IPCONFIGURATION_INTERFACE_TYPE_CELLULAR`, so it’s trying to go to cellular as we’d expect
            - [ ] “ipv6ServiceChanged” in both successful and failure logs, so hmmmm
            - [ ] “PLAT discovery complete” lines seem relevant, but identical in each
            - [ ] In failing logs but not successful, but currently I’m thinking it’s a red herring, but I do want to note: Setting interface pdp_ip2 expensive cost flag to true, active SIM kTwo
            - [ ] This seems interesting (from failing one, and successful one had similar but different): updateStateCache_sync: Context 2: familyActive=kDataProtocolFamilyIPv6(2), familyAvailable=kDataProtocolFamilyIPv6(2), active=true
            - [ ] DIFFERENT ON SUCCESS AND FAILURE LOGS, AS FAILURE ALSO HAS IPV6: [corewifi] SCNetworkConfiguration event: keys=(
            - [ ] OK, how we do DNS is definitely different
                - [ ] success: 
                    - [ ]   configd network changed: v4(en0:192.168.0.208, pdp_ip0) DNS* Proxy
                    - [ ] {IP-} IPv4 Primary interface is en0 and IPv6 Primary Interface is (null)
                    - [ ] *** Network Configuration Change *** SC key: State:/Network/Interface/pdp_ip2/IPv4
                - [ ] Failure
                    - [ ]   configd network changed: v4(utun6:10.104.28.47, en0, pdp_ip0) v6(utun6:fc00:bbbb:bbbb:bb01::29:1c2e, pdp_ip0) DNS* Proxy
                    - [ ] IPv4 Primary interface is utun6 and IPv6 Primary Interface is utun6
                    - [ ] *** Network Configuration Change *** SC key: State:/Network/Interface/pdp_ip2/IPv6
                - [ ] FWIW, I can send a message on cellular (which is IPv6 only)
                - [ ] OK, major differences within mDNSResponder - for our good one, lock on R71937, and for our bad one, lock on R71858, and all of their Q subprocesses
                    - [ ] For bad one
                        - [ ] Q62917
                            - [ ] Query failed, was asking for HTTPS
                        - [ ] Q52583
                            - [ ] Query failed, was asking for AAAA
                        - [ ] Q24618
                            - [ ] Query suppressed, as was asking for A record and A record are unusable (it knows no IPv4)
                        - [ ] Q16341
                            - [ ] I’m a little unclear on this one - seems to have maybe worked, but with A records
                        - [ ] Q48769
                            - [ ] Query suppressed, as was asking for A record and A record are unusable (it knows no IPv4)
                    - [ ] For good one, just one Q: 
                        - [ ] Q7188
                            - [ ] Query succeeded, was asking for AAAA
                    - [ ] These Qs end either with `GenerateNegativeResponse` or `Handling concluded querier` - and both show up in the iOS 15 logs
                - [ ] Locking on the line that starts with “question for” and includes “interface properties” to see which interface it’s using
                    - [ ] Nothing useful was here
                                - [ ] Theory: Could it be that the DNS simply fails in some cases? Should the DNS call have been sent not over our VPN? Possibly.
            - [ ] BIG THEORY, LITTLE EVIDENCE - WHAT IF THERE IS NO IPv6 DNS lookup for our address
            - [ ] Things to research:
                - [ ] What does it look like on iOS 15?
                    - [ ] See work above, need to continue
                - [ ] Test our VPN w/ only IPv6 DNS - does that fix it? What do those logs look like in the pertinent area
                    - [ ] Change line to 
                        - [ ] interface.dns = [DNSServer(address: ipv6GatewayIP!)] 
                    - [ ] Turn on VPN. web works on wifi, not cellular
                    - [ ] VPN on, on Wifi, try sending MMS: SUCCESS
                        - [ ] Works in Seattle, twice
                        - [ ] Works in San Jose
                        - [ ] Works in Salt Lake City
                        - [ ] Works in Madrid
                    - [ ] Put line back to 
                        - [ ] interface.dns = [DNSServer(address: dnsServerIP!), DNSServer(address: ipv6GatewayIP!)]
                    - [ ] VPN on, no Wifi, try sending MMS
                        - [ ] Work in Madrid
                        - [ ] Work in Salt Lake City
                    - [ ] WHAT DOES ANY OF THIS MEAN ANYMORE????
                        - [ ] DO BETTER TESTING OF THIS WITH A CLEAR MIND IN THE MORNING.
                - [ ] How does it decide whether to query DNS for IPv4 or IPv6?
                - [ ] Look more at iOS 15 logs.
                - [ ] Look more in these logs, in detail around DNS stuff
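
The per-question reading of the mDNSResponder logs above can be sketched with `awk`. This is a minimal sketch, assuming a plain-text log export where each line carries a `[Q...]` or `[R...->Q...]` tag as in the excerpts; `dns_question_outcomes` is a hypothetical helper, and it only recognizes the three markers named in the notes.

```shell
# List each DNS question id with the outcome markers seen for it, in log order.
dns_question_outcomes() {
  awk '
    # Pull the question id (e.g. Q42484) out of a "[Q42484]" or "[R...->Q42484]" tag.
    match($0, /Q[0-9]+\]/) {
      id = substr($0, RSTART, RLENGTH - 1)
      if ($0 ~ /ShouldSuppressUnicastQuery/)      outcome[id] = outcome[id] " suppressed"
      else if ($0 ~ /GenerateNegativeResponse/)   outcome[id] = outcome[id] " negative"
      else if ($0 ~ /Handling concluded querier/) outcome[id] = outcome[id] " concluded"
    }
    # Questions with no recognized marker are left out of the summary.
    END { for (id in outcome) print id outcome[id] }
  ' "$1" | sort
}
```

Running it on a failing and a successful capture side by side should show at a glance which questions were suppressed and which concluded.
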
data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

While I didn’t have as much time to think on this today, what I did find was an inconsistent mess. I thought I had some ideas as to when these MMS messages were sent through, but ultimately they didn’t work when it’s on a fresh install of the app. I’m not sure why they were so regularly (if inconsistently) working before.

Hoping I have some strokes of insight soon. Will book some time to talk things through w/ someone next week.

Full notes:

            - [ ] Testing
                - [ ] W/o VPN, on wifi, works fine
                - [ ] Tried about a half dozen US locations on VPN on `main` (on wifi) - it works every time w/ VPN on
                - [ ] 2.12.0 from App Store - it fails on one location
                - [ ] Is it about 2.12, or the different build?
                - [ ] Trying 2.12, built locally
                    - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % git checkout v2.12.0                
                    - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % git submodule update
                    - [ ] mcleinman@mcleinman-37536 mozilla-vpn-client % ./scripts/macos/apple_compile.sh ios
                    - [ ] And this fails twice, in 2 different cities
                - [ ] Looking at diff between main and tag v2.12.0 - nothing leaps out here
                - [ ] Pull main again and submodule update and… now main isn’t working, either. 
                - [ ] tried a clean (which fails, but tried anyway because it partially cleans) and build 
                    - [ ] Fails as well!
                - [ ] WHY COULD THIS BE SO INCONSISTENT? 
                    - [ ] Could something else be going on? Something about how we save the VPN settings? try removing the VPN from settings and app from device between tests
                        - [ ] Fresh install on 2.12 from App Store
                            - [ ] Seattle, failure a couple times
                        - [ ] Main branch (fresh install of VPN and app)
                            - [ ] Seattle, failed
                        - [ ] Main branch w/ only IPv6 DNS
                            - [ ] Seattle, failed
                    - [ ] OK, why was this working in some cases before?????
data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

I have a fix, but it’s a huge hammer: Keeping iOS’s includeAllNetworks on by default is what causes this. This makes intuitive sense (in a way that makes me somewhat embarrassed this took me so long) - it’s trying to hit an internal network for the MMS server. (This also explains why MMS works fine on Mullvad’s VPN app; they do not use this feature.)

Options -

  1. I believe - though am not sure - that we previously had this feature enabled as a “kill switch” option (as Proton VPN does it). We may want to move back to that.
  2. While excludeLocalNetworks doesn’t allow this traffic to pass through, it’s likely we could exclude a range of IP addresses that would allow MMS to go through. I’m not sure we’d want to do this without informing the user, and potentially getting permission.
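
Option 1 could be sketched roughly as follows. This is a minimal illustration, not the client's actual code: `makeTunnelProtocol` and the `killSwitchEnabled` preference are hypothetical names, and only `NETunnelProviderProtocol`, `serverAddress`, and `includeAllNetworks` are real NetworkExtension API.

```swift
import NetworkExtension

/// Builds the tunnel protocol configuration.
/// `makeTunnelProtocol` and `killSwitchEnabled` are hypothetical names used for illustration.
func makeTunnelProtocol(serverAddress: String, killSwitchEnabled: Bool) -> NETunnelProviderProtocol {
    let proto = NETunnelProviderProtocol()
    proto.serverAddress = serverAddress

    if #available(iOS 14.0, macOS 10.15, *) {
        // Only claim all traffic when the user opts in to the kill switch.
        // With the flag off, the system may route carrier-internal traffic
        // (such as the MMS server lookup) outside the tunnel.
        proto.includeAllNetworks = killSwitchEnabled
    }
    return proto
}
```

The trade-off is that users who leave the kill switch off lose the guarantee that every packet goes through the tunnel.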

Related notes:

Full notes from today:

    - [ ] Try again setting a different DNS on wifi - does it work? (Or is it possible iOS DNS subsystem is sending it elsewhere when we’re not on VPN.) (Was tmobile using its own DNS resolver?)
        - [ ] DNS lookup 
            - [ ] Wifi w/o VPN - IPv4 10.x.x.x
            - [ ] Cell w/o VPN - fails
            - [ ] Wifi w/ VPN - IPv4 10.x.x.x
            - [ ] Cell w/ VPN - fails
        - [ ] OMG I AM ALMOST DEFINITELY GETTING STYMIED BY LOCAL DNS CACHING ISSUES!!!
            - [ ] It worked like 6 times in a row on VPN, but had started this morning’s run off VPN. And then I toggled airplane mode (to clear cache), and it immediately did *not* work on VPN.
            - [ ] Corollary: I cannot trust *ANY* of my prior tests.
            - [ ] Now to try: keep flushing, and see if it’s more deterministic than I realized. Yes, it is. Facepalm facepalm
                - [ ] Wifi w/o VPN - 2 work
                - [ ] Cell w/o VPN - 1 work
                - [ ] Cell w/ VPN - 1 not work
                - [ ] Wifi w/ VPN - 2 not work
            - [ ] Trying to find TTL for this record, to confirm this hunch. If it’s like 20 seconds I’d be off, but if it’s something normal like 5 min this is the confirmation we needed.
                - [ ] However, can’t find canonical name server for this domain w/ dig: ` dig NS mms.msg.eng.t-mobile.com` doesn’t work
                - [ ] Oh this is screwy: https://www.sprint.net/faq/dns
                    - [ ] Cache servers return an entry, but the other ones do not
        - [ ] At this point: I feel confident this is a DNS issue.
        - [ ] From chat w/ Owen
            - [ ] USE SUBDOMAINS FOR TMOBILE DNS TO GET TTL
                - [ ] msg.eng.t-mobile.com has a TTL of 500 seconds, or 8.3333 minutes. Which makes sense
            - [ ] IS MULLVAD USING THEIR OWN DNS OR DEVICE DNS?
                - [ ] I believe they’re doing same thing we are, but it definitely works whether I’m using custom DNS or their DNS.
            - [ ] CAN WE HIT OTHER IPv6 ONLY SERVERS ON OUR VPN?
                - [ ] yes
            - [ ] Does it work on tmobile’s DNS cache server
        - [ ] Afternoon work
            - [ ] Look at logs of it going out w/o VPN on IPv6 cellular and iOS 15 and everything else
                - [ ] Got 3 fresh logs:
                    - [ ] Cell w/ no VPN
                    - [ ] Cell w/ Mozilla VPN
                    - [ ] Cell w/ Mullvad VPN
            - [ ] DNS services (6 of them) are identical for Mullvad and Mozilla VPN logs, which is good
            - [ ] In working and mullvad, we get this line in a dns lookup: `getaddrinfo result -- event: add, ifindex: 0, name: BBSbsUEY, type: A, rdata: <none>, reason: query-suppressed`. It doesn’t exist in ours. This is the fallback to IPv4. Why???
                - [ ] And shortly before that, we get “Generating negative response for question“
            - [ ] For all three logs, lock on `dnssd_server`. In the 2 successful ones, this is the DNS retry lookup (on IPv4) that works. On our failure one, we get the first line but we don’t get any results. What happens after these fire?
                - [ ] All of these are tagged with a log number in the format `[R00000]`. Each has 2 R numbers in the logs. These in turn spin up a log of the format `[Q00000]`.
                - [ ] Second R number has identical logs for all 3 logs.
                - [ ] First R number is where the DNS failure happens.
                - [ ] Let’s dig into the Q numbers that this R number spins up. 
                    - [ ] So these Q numbers are where we get the “reason: query-suppressed” thing for the 2 successful ones. And that is spun up by a specific Q number that is also part of a “GeneratingNegativeResponse” log line that includes the R number, and that is in all 3 logs. So we lock on that Q number.
                        - [ ] Oh, the mask hash is identical in all 3 of these logs, so that’s good.
                        - [ ] The 2 successful ones show 4 lines with this log, the penultimate one being “generating negative response” and the final one being `getaddrinfo result -- event: add, ifindex: 0, name: BBSbsUEY, type: A, rdata: <none>, reason: query-suppressed`. However, our failure Mozilla VPN one has 5 lines - after the Generating negative response, we get these 2. The first one is basically a duplicate of one that all 3 have (but this is the only one to have two). What is CLAT46 though?
                            - [ ]   [Q30881] ShouldSuppressUnicastQuery: Query suppressed for <mask.hash: 'EZ6FYdK0KV1ITYxlPPlNkw=='> Addr (A records are unusable) mDNSResponder   0x1445b2    139 mDNSResponder
                            - [ ]   [Q30881] ShouldSuppressUnicastQuery: Query suppressed for <mask.hash: 'EZ6FYdK0KV1ITYxlPPlNkw=='> Addr (CLAT46 A records are unusable)
                        - [ ] Looked at CLAT46 for a few.
                        - [ ] Locking in on `resolver   com.apple.mdns`
                            - [ ] It looks substantially identical for Mullvad and no-VPN: 2 servers confirmed usable on IPv6, and then 4 queries sent that all receive acceptable responses.
                            - [ ] However, on Mozilla VPN: Both servers are confirmed usable on IPv6, but 4 packets fail because `[65: No route to host]`
                        - [ ] OK, thanks to a developer forum post (https://developer.apple.com/documentation/networkextension/nevpnprotocol/3131931-includeallnetworks), I know why it works on mullvad: they don’t include `includeAllNetworks`. And proton only includes that if the user has activated “kill switch” mode. That’s the fix, but it’s a decent sized hammer. Going to spend a few more minutes in these logs.
                            - [ ] Yup, that’s fully it. You can see network changing on Mullvad VPN (lock on `network changed` in logs), which is not present in Mozilla VPN or no-VPN for obvious reasons.
                        - [ ] I dug in a bit more. We immediately leap to `resolver com.apple.mdns  sending to <IPv6:BBRSEeHt> failed: [65: No route to host]`.
                            - [ ] It sure seems like something changed under the hood between iOS 15 and iOS 16 here. (But I can no longer test as free test of cell service is over, and need Jordan’s iCloud PW to unlock her old phone that I’m using to test here.) I searched in 2022 WWDC videos again (nothing), and searched more broadly for changes.
                        - [ ] excludeLocalNetworks does not help
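
The log comparison in the notes above can be captured as a small triage helper. This is only a sketch built on the two marker strings quoted in the notes; `LookupOutcome` and `classify` are hypothetical names, and the heuristic is an assumption drawn from these three logs, not a documented mDNSResponder contract.

```swift
import Foundation

/// Outcome of the MMS hostname lookup, inferred from mDNSResponder log lines.
enum LookupOutcome {
    case fellBackToIPv4   // a getaddrinfo result with "reason: query-suppressed" was logged
    case noRoute          // the resolver logged "[65: No route to host]"
    case unknown          // neither marker was found
}

/// Classifies a captured log by the markers seen in the working and failing runs.
func classify(logLines: [String]) -> LookupOutcome {
    // The failing Mozilla VPN log showed the resolver unable to reach the DNS server.
    if logLines.contains(where: { $0.contains("No route to host") }) {
        return .noRoute
    }
    // The no-VPN and Mullvad logs showed the suppressed query that precedes the IPv4 retry.
    if logLines.contains(where: {
        $0.contains("getaddrinfo result") && $0.contains("reason: query-suppressed")
    }) {
        return .fellBackToIPv4
    }
    return .unknown
}

let working = ["getaddrinfo result -- event: add, ifindex: 0, name: BBSbsUEY, type: A, rdata: <none>, reason: query-suppressed"]
let failing = ["resolver com.apple.mdns  sending to <IPv6:BBRSEeHt> failed: [65: No route to host]"]
print(classify(logLines: working))
print(classify(logLines: failing))
```

Remember to flush the DNS cache (airplane mode toggle) before capturing each log, otherwise a cached answer can mask the failure.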
data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

On agenda to discuss in next Tuesday’s engineering meeting.

elenzil commented 1 year ago

Hello, everyday user here with two minor notes about symptoms which differ from what I see above.

  1. For me, receiving group SMS messages works fine, it is only sending group SMS messages which fails.
  2. Group SMS messages continue to fail to send even after turning Mozilla VPN off. I need to reboot the device to get back to a usable state. I am not sure if I need to first have a failed send to get into this unrecoverable state, or if merely turning the VPN on and then off again is sufficient.

This is very reproducible for me. Consistent under all DNS options.

Mozilla VPN 2.12.0, iOS 16.2, iPhone 14 Pro.

mcleinman commented 1 year ago

@elenzil Thanks for the additional info! We've isolated the issue, and are currently determining the best path forward.

A couple notes: 1) The DNS lookup is cached by the phone, and so after you turn off the VPN you'll see the same behavior until the prior query expires (in a few minutes). One way to force a flush of the cache (and thus get the send to work) is to reboot the phone (as you've discovered); another way is to toggle airplane mode on and back off. (I definitely wish there was an even simpler way, but that is the best way I know.) 2) I'd also expect that sending picture messages would have the same issue.

elenzil commented 1 year ago

got it. thanks for the extra details!

data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

From Apple Developer Forums, a reply from someone w/ Apple Developer Technical Support:

The fact that MMS fails when you set includeAllNetworks doesn’t surprise me. You’ve specifically told the system that you want all network traffic, which is at odds with standard cellco practice of requiring that MMS be delivered over the cellco’s network.

I’m going to submit a feature request to Apple for a new flag. We already have an optional excludeLocalNetworks flag that only matters when includeAllNetworks is active. An additional excludeMMS would also be very helpful. Of course, even if this was to go somewhere, it would likely be quite a long time, and so we still need a workaround in the meantime.

data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

Another reply from the same Apple employee was a “no” when asked if they “know of a way to allow MMS to go outside the VPN while otherwise keeping similar functionality to includeAllNetworks”.

This is as official of a reply as we’ll likely get - we’re likely not missing anything here, and need to make do with the knowledge that we have.

Separately, the feature request to allow a “let MMS bypass VPN” flag has been submitted to Apple: FB11933077 (VPN: Allow MMS to work when includeAllNetworks is active) ( https://feedbackassistant.apple.com/feedback/11933077 ). I believe that link should be viewable to other members of our Apple development team. I’d expect a response to come in the distant future.

data-sync-user commented 1 year ago

➤ Santiago Andrigo commented:

Closing this as “Won’t Fix”. This topic will resurface when I create a Story about supporting the user’s ability to disable the kill switch.

data-sync-user commented 1 year ago

➤ Santiago Andrigo commented:

We will solve this through a new feature to be spec'd out in future sprints.

data-sync-user commented 1 year ago

➤ Bianca Hidecuti commented:

Hello, I verified this while using the 2.11.1 / 2.14 Mozilla VPN versions, on iOS 16.2, but I was not able to reproduce it.

I am able to successfully send / receive MMS messages while the VPN is ON.

→ tried while being connected to the Wi-Fi and also via mobile data.

→ cell carrier: Orange (Romania).

data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

This is now fixed, as of iOS 16.4: https://developer.apple.com/documentation/networkextension/nevpnprotocol/4140517-excludecellularservices

Any users reporting this issue should be encouraged to update to iOS 16.4 or later.

data-sync-user commented 1 year ago

➤ Santiago Andrigo commented:

Oh! That’s fantastic. So there is nothing for us to do here, Matt Cleinman?

data-sync-user commented 1 year ago

➤ Matt Cleinman commented:

Correct, the new flag defaults to true, and so MMS will be excluded from the VPN automatically.

Important to note, for high privacy users: This also excludes WiFi calling and Visual Voicemail from the VPN (we can’t get more granular and have MMS be separate from these), and so some other traffic will also go through the normal network. While this seems fine on cellular - your cell provider already knows your IP address, by definition - this will have some minor leakage on Wifi.
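
For anyone who wants to override that default, a rough sketch follows. `applyCellularServicesPolicy` and the `routeCellularServicesThroughVPN` preference are hypothetical names for illustration; `excludeCellularServices` is the real `NEVPNProtocol` property linked above, and it only has an effect while `includeAllNetworks` is enabled.

```swift
import NetworkExtension

/// Applies a (hypothetical) high-privacy preference to the tunnel protocol.
func applyCellularServicesPolicy(to proto: NEVPNProtocol, routeCellularServicesThroughVPN: Bool) {
    if #available(iOS 16.4, macOS 13.3, *) {
        // System default is true: MMS, Wi-Fi calling, and Visual Voicemail bypass the tunnel.
        // Setting it to false keeps that traffic inside the tunnel, which would
        // presumably bring back the MMS failures described in this issue.
        proto.excludeCellularServices = !routeCellularServicesThroughVPN
    }
}
```

On iOS versions before 16.4 the property does not exist, so the availability check leaves the configuration untouched and the original behavior applies.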