mozilla-mobile / mozilla-vpn-client

A fast, secure and easy to use VPN. Built by the makers of Firefox.
https://vpn.mozilla.org

Update from Mozilla VPN 2.10 to 2.11 not working on Production server #4937

Closed data-sync-user closed 1 year ago

data-sync-user commented 2 years ago

Affected versions:

Tested Platforms:

Prerequisites:

Steps to reproduce:

  1. Go to Settings - About us and click the “Check for updates” button;
  2. Observe the behavior;

Expected result:

Actual result:

Notes:

┆Issue is synchronized with this Jira Bug ┆Reporter: Bianca Hidecuti

data-sync-user commented 2 years ago

➤ Bianca Hidecuti commented:

Update: I am also able to reproduce it on macOS while using the 2.10 build (Prod server). On macOS I was able to see the “Update available” modal after turning ON the VPN.

!Screen Recording 2022-11-15 at 13.50.27.mov|width=414,height=686!

data-sync-user commented 2 years ago

➤ Sarah Bird commented:

There’s a discussion with the releng team here: https://mozilla.slack.com/archives/C01DCUKG95E/p1668538988105999

It is expected that Balrog may return a cached value (so no update) for up to 24 hours after the change.

Please check again on the morning of Nov 16 Romanian time. If you are still consistently getting this issue then there’s a problem.

data-sync-user commented 2 years ago

➤ Bianca Hidecuti commented:

Sarah Bird, the issue is still reproducible while using the 2.10 version → “You are up to date” modal is displayed.

The update from 2.9 to 2.11 works as expected.

!bandicam 2022-11-16 10-30-33-928.mp4|width=464,height=828!

[^mozillavpn-2022-11-16.txt]

data-sync-user commented 2 years ago

➤ Jon Buckley commented:

https://aus5.mozilla.org/json/1/FirefoxVPN/2.10.0/WINNT_x86_64/release/update.json is the URL that VPN v2.10.0 is hitting. From the VPN logs, the problem seems to be signature verification:

[16.11.2022 10:30:08.263] Debug: (networking - Updater) Updater created
[16.11.2022 10:30:08.265] Debug: (networking - Balrog) Balrog created
[16.11.2022 10:30:08.265] Debug: (networking - Balrog) URL: https://aus5.mozilla.org/json/1/FirefoxVPN/2.10.0/WINNT_x86_64/release/update.json
[16.11.2022 10:30:08.265] Debug: (networking - NetworkRequest) Network request created by TaskRelease
[16.11.2022 10:30:08.579] Debug: (networking - NetworkRequest) Network header received
[16.11.2022 10:30:08.579] Debug: (networking - NetworkRequest) Network reply received - status: 200 - expected: 200
[16.11.2022 10:30:08.579] Debug: (networking - Balrog) Request completed
[16.11.2022 10:30:08.579] Debug: (networking - Balrog) Fetching x5u URL: https://content-signature-2.cdn.mozilla.net/chains/aus.content-signature.mozilla.org-2022-12-30-09-21-28.chain
[16.11.2022 10:30:08.579] Debug: (networking - NetworkRequest) Network request created by TaskRelease
[16.11.2022 10:30:08.829] Debug: (networking - NetworkRequest) Network header received
[16.11.2022 10:30:08.830] Debug: (networking - NetworkRequest) Network reply received - status: 200 - expected: 200
[16.11.2022 10:30:08.830] Debug: (networking - Balrog) Request completed
[16.11.2022 10:30:08.830] Debug: (networking - Balrog) Checking the signature
[16.11.2022 10:30:09.000] Debug: (networking - Balrog) BalrogGo: Verification failed with error: Error parsing cert chain: failed to parse root certificate from chain: x509: invalid authority key identifier
[16.11.2022 10:30:09.000] Error: (networking - Balrog) Verification failed
[16.11.2022 10:30:09.000] Error: (networking - Balrog) Invalid signature
[16.11.2022 10:30:09.001] Debug: (networking - Balrog) Balrog released
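For anyone triaging a similar log, the entries above can be split and filtered with a short script. This is only a sketch: the entry layout (`[timestamp] Level: (module - component) message`) is inferred from the excerpt in this comment, and `LogEntry`, `parse_log` and `SAMPLE_LOG` are hypothetical names used for illustration, not part of the VPN client.

```python
import re
from typing import List, NamedTuple


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    module: str
    component: str
    message: str


# One entry looks like:
#   [16.11.2022 10:30:09.000] Error: (networking - Balrog) Invalid signature
# The lookahead ends a message at the next timestamp or at the end of the
# text, so entries are found whether they are on one line or on many.
ENTRY_RE = re.compile(
    r"\[(?P<timestamp>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}\.\d{3})\]\s+"
    r"(?P<level>\w+):\s+"
    r"\((?P<module>[^)]*?) - (?P<component>[^)]*)\)\s+"
    r"(?P<message>.*?)(?=\s*\[\d{2}\.\d{2}\.\d{4} |\s*$)",
    re.DOTALL,
)


def parse_log(text: str) -> List[LogEntry]:
    """Split raw log text into structured entries."""
    return [LogEntry(**m.groupdict()) for m in ENTRY_RE.finditer(text)]


# The last four entries of the excerpt above, joined on a single line.
SAMPLE_LOG = (
    "[16.11.2022 10:30:08.830] Debug: (networking - Balrog) Checking the signature "
    "[16.11.2022 10:30:09.000] Debug: (networking - Balrog) BalrogGo: Verification "
    "failed with error: Error parsing cert chain: failed to parse root certificate "
    "from chain: x509: invalid authority key identifier "
    "[16.11.2022 10:30:09.000] Error: (networking - Balrog) Verification failed "
    "[16.11.2022 10:30:09.000] Error: (networking - Balrog) Invalid signature"
)

if __name__ == "__main__":
    for entry in parse_log(SAMPLE_LOG):
        if entry.level == "Error" or "failed" in entry.message:
            print(entry.timestamp, entry.component, entry.message)
```

Filtering on the `Error` level alone would miss the root cause here, because the `x509: invalid authority key identifier` detail is logged at `Debug` level just before the two `Error` lines.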

data-sync-user commented 2 years ago

➤ Sarah Bird commented:

Bianca Hidecuti Valentina Virlics I have changed the “release found in” field to 2.10 because this is a problem upgrading from 2.10 to 2.11. It is not a 2.11 issue.

data-sync-user commented 2 years ago

➤ Sarah Bird commented:

Some notes from slack conversation:

Owen Kirby So, there's two possibilities:

  1. There is something wrong in the cert. It looks like it was updated sometime in October which might be causing the issue. If this is the case then the issue should affect both Mac and Windows, regardless of client version.
  2. There is a bug in the validation library client-side. This went through a bit of churn in the Windows builds as we switched to CMake, but ultimately it should have been reverted back to build as a DLL, just like in the old days. If this is the case then it should only affect Windows version 2.10 and later, I think.
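The two possibilities make different predictions about which clients fail, so they can be checked against reports in this thread. The sketch below is illustrative only and is not how the team actually triaged the bug. The `Observation` type and function names are hypothetical, and the platform of the 2.9 report is an assumption (the thread does not state it).

```python
from typing import Callable, List, NamedTuple, Tuple


class Observation(NamedTuple):
    platform: str             # "windows" or "macos"
    version: Tuple[int, int]  # client version, e.g. (2, 10)
    update_check_failed: bool


def cert_problem_predicts_failure(obs: Observation) -> bool:
    # Possibility 1: a bad cert affects every platform and client version.
    return True


def client_bug_predicts_failure(obs: Observation) -> bool:
    # Possibility 2: only Windows builds from 2.10 onward are affected.
    return obs.platform == "windows" and obs.version >= (2, 10)


def consistent(predict: Callable[[Observation], bool],
               observations: List[Observation]) -> bool:
    """True if the hypothesis predicts every observed outcome."""
    return all(predict(o) == o.update_check_failed for o in observations)


# Reports taken from this thread. The macOS entry reflects the state after
# the Balrog cache window passed; the 2.9 platform is assumed to be Windows.
OBSERVATIONS = [
    Observation("windows", (2, 10), True),
    Observation("windows", (2, 9), False),
    Observation("macos", (2, 10), False),
]

if __name__ == "__main__":
    print("cert problem:", consistent(cert_problem_predicts_failure, OBSERVATIONS))
    print("client bug:", consistent(client_bug_predicts_failure, OBSERVATIONS))
```

Under these assumptions only the second possibility matches every report, which agrees with the later narrowing of this issue to Windows.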

Jon Buckley and Andrew Halberstadt helping to eliminate option 1 as the likely candidate:

data-sync-user commented 2 years ago

➤ Sarah Bird commented:

Bianca Hidecuti the current assumption is that the initial Mac problems were due to the Balrog 24-hour cache and are now gone, and that what we have left is a Windows issue. I have removed the Mac labels from the issue. Please let us know if this isn’t correct.

data-sync-user commented 1 year ago

➤ Valentina Virlics commented:

Sarah Bird the “Release found in field” is meant to expose the release in which an issue was found.

As a side note, we cannot check the update from 2.10 to 2.11, in prod, only after 2.11 release.

data-sync-user commented 1 year ago

➤ Sarah Bird commented:

{quote}Sarah Bird the “Release found in field” is meant to expose the release in which an issue was found.{quote}

Valentina Virlics we should discuss this more and get clarity; what you’re saying doesn’t quite make sense to me.

{quote}As a side note, we cannot check the update from 2.10 to 2.11, in prod, only after 2.11 release.{quote}

Bringing in Adrienne Davenport here too. I can’t tell from this short note if we have a problem or not. We should review what upgrade tests are done, what they cover, and what holes they leave.

data-sync-user commented 1 year ago

➤ Josh Schrader commented:

Summary of my findings from today:

[17.11.2022 15:43:23.961] Debug: (networking - Balrog) Request completed
[17.11.2022 15:43:23.961] Debug: (networking - Balrog) Checking the signature

Slack thread with full discussion: https://mozilla.slack.com/archives/C01DCUKG95E/p1668705592760209

data-sync-user commented 1 year ago

➤ Bianca Hidecuti commented:

Sarah Bird, I can confirm that I am no longer able to reproduce the issue on macOS - the “Updates available” modal is displayed.

!Screen Shot 2022-11-18 at 08.32.23.png|width=358,height=656!

{quote}We should review what upgrade tests are done, what they cover, and what holes they leave.{quote}

Also, regarding this: we test the update flow from the previous version to the latest one (in this case from 2.10 to 2.11) before release, but on the Stage server. To check the update on production as well, we need Balrog PROD to be updated, and this happens after the release.

As mentioned in the notes, this does not reproduce on the Stage server while using the 2.10 version.

data-sync-user commented 1 year ago

➤ Valentina Virlics commented:

Sarah Bird Santiago Andrigo Adrienne Davenport Rebecca Billings

Hello everyone,

After reading the long Slack thread about this issue and the notes below, QA wants to clarify 2 points:

If you have any other questions or anything is unclear regarding the QA process, please let us know.

Thank you!

data-sync-user commented 1 year ago

➤ Sebastian Streich commented:

Somehow I can’t reproduce that bug with a build of 2.10 I produce locally, but the archive build is broken for me as well. I’ll check on Monday if I can create “broken” builds when I clone the TC runner env 😕

data-sync-user commented 1 year ago

➤ Andrew Halberstadt commented:

Yeah, 2.10 was the first release we made from the Taskcluster builds. So if I had to guess, I’d say it’s caused by something different in the TC build scripts (vs local build or GH actions)

data-sync-user commented 1 year ago

➤ Adrienne Davenport commented:

Owen Kirby has a PR ( https://github.com/mozilla-mobile/mozilla-vpn-client/pull/4968 ) in for this.

data-sync-user commented 1 year ago

➤ Bianca Hidecuti commented:

Verified this as fixed on Mozilla VPN 2.11.1 (2.202211232032) - STAGE server, using Windows 10/11, while following the steps described by Owen Kirby in the following ticket: https://mozilla-hub.atlassian.net/browse/VPN-3357

After pressing the “Check for updates” button on the “About us” screen → “Updates available” modal is displayed. The client is updated from 2.11.1 to version 2.12 after clicking the “Update now” button in the modal.

Attaching post-fix video.

!bandicam 2022-11-24 13-59-11-520.mp4|width=1328,height=1044!