Closed data-sync-user closed 1 year ago
➤ Santiago Andrigo commented:
This is a fairly bad experience as it seems to brick the device until reboot, so upping this to High, at least until we figure out the root cause.
Magdalena Schwaighofer Can you confirm the user is able to use the internet as normal if they don’t attempt to connect via the Mozilla VPN?
➤ Magdalena Schwaighofer commented:
No problem, I will reach out to the user! It seems we have had 2 other customers report the same issue by now.
Juan Zapata and Gustavo Aguilera do you have further info on your cases?
➤ Owen Kirby commented:
Digging through the logs, it seems that the Windows kernel reports that the tunnel interface doesn’t exist when we try to populate the routing table, and returns ERROR_NOT_FOUND (1168). This seems to suggest that we couldn’t figure out the LUID of the tunnel device during service bringup.
As a secondary issue, this means that our error handling in this case is insufficient and we leave the device in an unworkable state.
➤ Betty Fleming commented:
Not considered a blocker for 2.14
➤ Santiago Andrigo commented:
We are not considering this a blocker because a) support volume is low, and b) as per Owen, we don’t think this is a regression in 2.13/2.14, so the support volume suggests the problem is not very prevalent. But we do want to keep it at High as the severity is pretty bad.
➤ Lesley Norton commented:
Reminds me of https://mozilla-hub.atlassian.net/browse/VPN-1389
➤ Magdalena Schwaighofer commented:
We have received logs from a user experiencing this problem on version 2.14
[^mozillavpn-2023-4-21_1.txt]
➤ Magdalena Schwaighofer commented:
Another set of logs, from a user who first reported this on February 6th (on version 2.13). They updated to version 2.14 and the same problem still occurs.
[^mozillavpn-2023-4-25.txt]
➤ Magdalena Schwaighofer commented:
Recent report of a customer experiencing this on 2.14. See the log file below.
[^mozillavpn-2023-5-16.txt]
➤ Santiago Andrigo commented:
Basti do you have any thoughts on this one?
➤ Basti commented:
That’s an “easy” one -
[15.05.2023 18:50:12.042] (WireguardUtilsWindows) Debug: Configuring peer XXXXXXXX via 68.235.44.2
[15.05.2023 18:50:12.043] (WireguardUtilsWindows) Debug: DATA: errno=0
[15.05.2023 18:50:12.043] (DnsUtilsWindows) Debug: Configuring DNS for MozillaVPN
[15.05.2023 18:50:12.044] (WireguardUtilsWindows) Error: Failed to create route to XXXXXXXX result: 1168
From the logs you can see that we enable the killswitch, but we fail to edit the route table, so VPN traffic never actually flows. My guess is that when this error happens, we fail to disable the killswitch, which then blocks all traffic until our handle is lost, which happens at a reboot.
The general question of “why the connection fails for this person” - no clue 😄
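For triaging further reports, the signature above can be searched for mechanically. The following is only a rough sketch in Python, assuming the log format is exactly what the excerpt shows (`[DD.MM.YYYY HH:MM:SS.mmm] (Module) Level: message`); the names `LogEntry`, `parse_entries` and `route_failure_codes` are made up for illustration and are not part of the client.

```python
import re
from typing import Iterator, List, NamedTuple

# Windows error code reported in the excerpt above (ERROR_NOT_FOUND).
ERROR_NOT_FOUND = 1168


class LogEntry(NamedTuple):
    """One parsed log entry (hypothetical helper type)."""
    timestamp: str
    module: str
    level: str
    message: str


# Assumed entry layout: [DD.MM.YYYY HH:MM:SS.mmm] (Module) Level: message
# The message runs until the next timestamp or the end of the text, so
# entries may be separated by spaces or by newlines.
ENTRY_RE = re.compile(
    r"\[(?P<ts>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}\.\d{3})\] "
    r"\((?P<module>\w+)\) (?P<level>\w+): "
    r"(?P<message>.*?)(?=\s\[\d{2}\.\d{2}\.\d{4} |\s*$)",
    re.DOTALL,
)

ROUTE_FAILURE_RE = re.compile(r"Failed to create route to \S+ result: (\d+)")


def parse_entries(text: str) -> Iterator[LogEntry]:
    """Yield every entry found in the log text."""
    for match in ENTRY_RE.finditer(text):
        yield LogEntry(match["ts"], match["module"],
                       match["level"], match["message"])


def route_failure_codes(text: str) -> List[int]:
    """Return the result codes of all 'Failed to create route' errors."""
    codes = []
    for entry in parse_entries(text):
        if entry.level != "Error":
            continue
        match = ROUTE_FAILURE_RE.search(entry.message)
        if match:
            codes.append(int(match.group(1)))
    return codes


# The four entries quoted in the comment above.
SAMPLE = (
    "[15.05.2023 18:50:12.042] (WireguardUtilsWindows) Debug: "
    "Configuring peer XXXXXXXX via 68.235.44.2 "
    "[15.05.2023 18:50:12.043] (WireguardUtilsWindows) Debug: DATA: errno=0 "
    "[15.05.2023 18:50:12.043] (DnsUtilsWindows) Debug: "
    "Configuring DNS for MozillaVPN "
    "[15.05.2023 18:50:12.044] (WireguardUtilsWindows) Error: "
    "Failed to create route to XXXXXXXX result: 1168"
)

if __name__ == "__main__":
    if ERROR_NOT_FOUND in route_failure_codes(SAMPLE):
        print("Log matches the route-creation failure discussed here")
```

A log that contains `ERROR_NOT_FOUND` in the returned list would be a candidate for this ticket; anything else probably needs a separate look.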
➤ Santiago Andrigo commented:
Can we make this more fault tolerant and catch the error and undo the killswitch / tunnel?
➤ Juan Zapata commented:
[^mozillavpn-2023-5-18-1.txt]
Attaching logs from another affected user
➤ Valentina Virlics commented:
As QA was not able to reproduce this on previous versions, we cannot check the fix for this ticket.
➤ Santiago Andrigo commented:
Basti Can you describe your fix here? Just curious. How would this be handled?
Marking this as Done and adding qa-not-actionable as a label. Hopefully if this creates regressions, we’ll notice in regression testing.
➤ Basti commented:
Santiago Andrigo sure. The Windows daemon has multiple “components” we need to activate in order to bring up a connection on Windows:
-> Wintun (create a network device)
-> WireGuard (configure that network device)
-> Firewall (make sure programs cannot self-route)
-> Routing (tell Windows to route everything to that adapter)
The problem here is that we activate all of those components on activation and deactivate them on deactivation. Now if one of those components detects an error, we abort the activation, but that is not a “deactivation”, so all the components we activated before are still active.
In this specific case, the firewall rules are already enabled when we abort, which results in a complete loss of internet unless the VPN is on (aka the killswitch).
The solution is easy: cross-propagate an error signal between the components, so that if we abort due to one error, we ask all the other actors to tear down whatever they have.
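To make the idea concrete, here is a minimal sketch in Python rather than the daemon's actual code. It swaps the cross-component error signal for a simpler sequential rollback: when any stage fails to activate, every stage is asked to tear down, in reverse order. `Component` and `activate_all` are hypothetical names, and `fail_on_activate` only exists to simulate the failing route setup.

```python
from typing import List


class Component:
    """Hypothetical stand-in for one stage of the Windows daemon."""

    def __init__(self, name: str, fail_on_activate: bool = False) -> None:
        self.name = name
        self.active = False
        self._fail_on_activate = fail_on_activate

    def activate(self) -> bool:
        """Return True on success, False if the stage could not start."""
        if self._fail_on_activate:
            return False
        self.active = True
        return True

    def deactivate(self) -> None:
        """Tear down whatever this stage holds; safe to call repeatedly."""
        self.active = False


def activate_all(components: List[Component]) -> bool:
    """Activate the stages in order and undo everything if one fails."""
    for component in components:
        if not component.activate():
            # An aborted activation is treated like a deactivation:
            # every stage, including the failing one, is torn down in
            # reverse order so that no firewall rule is left behind.
            for other in reversed(components):
                other.deactivate()
            return False
    return True


if __name__ == "__main__":
    stack = [
        Component("wintun"),
        Component("wireguard"),
        Component("firewall"),
        Component("routing", fail_on_activate=True),  # simulated failure
    ]
    activate_all(stack)
    for component in stack:
        print(component.name, "active" if component.active else "inactive")
```

With the simulated routing failure, the firewall stage ends up inactive again, which is the behaviour the fix is after: an abort no longer leaves the killswitch in place.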
➤ Santiago Andrigo commented:
Fantastic, thanks Basti!
Hi there,
This specific user is having an issue with the VPN on Windows 10, VPN version 2.13 which we are unable to resolve with basic troubleshooting:
They activate the VPN by moving the toggle to the right; the toggle changes color and a small timer icon floats above it for about 1 second. Then the toggle moves back to the off position. There is no error message or any other indication from the VPN menu. Everything stops. Also, their PC is locked out of the internet connection. They cannot access the internet with Firefox or any other browser. The network connection icon at the bottom right of the screen shows the internet connection is off. The only way to restore the internet is to reboot the PC.
The app remains in off state:
!image-20230227-090010.png|width=179,height=326!
We have attempted the following steps with the customer:
Reinstall the app (signed out first)
Full reset of app by deleting the appdata folder (app was reinstalled with administrator rights)
Check for any system updates on Windows 10 - all up to date
Run the app as administrator
Check Windows Firewall and all third party systems (antivirus, security software) to add an exclusion for the VPN, those same programs were also temporarily disabled completely
Check for other VPNs on device - not installed
They also temporarily uninstalled Malwarebytes
We tried another reset using the developer option of the app
We also checked whether the Mozilla VPN (broker) service is running in the Task Manager (user feedback: The Mozilla VPN (broker) service was on & I restarted it; It was set at automatic; The Mozilla VPN (tunnel) was not on & I started it; It was set at manual; All in all, I get the same result.)
Logs attached
[^mozillavpn-2023-2-14 (2).txt]
Thank you for your help!
┆Issue is synchronized with this Jira Bug ┆Reporter: Magdalena Schwaighofer