mozilla / testpilot

Test Pilot is a platform for performing controlled tests of new product concepts in Firefox
https://testpilot.firefox.com/

Some users are unable to install the latest experiments #1474

Closed BrotherStein closed 7 years ago

BrotherStein commented 7 years ago

/edit by ckprice - updated title a bit to be more general than just Page Shot.

ckprice commented 7 years ago

Thanks for the report! Can you please restart your browser and try again?

digitarald commented 7 years ago

Same for Tracking Protection. Browser Console log: https://pastebin.mozilla.org/8913899

BrotherStein commented 7 years ago

Restarting my browser fixed it. Thanks!

digitarald commented 7 years ago

Same here, switched it off and on again.

marco-c commented 7 years ago

https://pastebin.mozilla.org/8913946

/edit by ckprice - more context in chat logs.

ckprice commented 7 years ago

I'm altering the title a bit to be more general, and duping out some other reports to this issue as it has a couple good log dumps in it.

ckprice commented 7 years ago

@SoftVision-PaulOiegas - could you guys come up with some reliable STR for this?

SoftVision-PaulOiegas commented 7 years ago

We have managed to reproduce this too, but it only happens if you have a Test Pilot version from before the new experiments launched. However, the new experiments are correctly installed under the hood; at first glance, the "Enabling..." state seems to be only a UI issue on the website. That's why refreshing the page or restarting the browser confirms afterwards that the experiments are installed.

I also tried with a second profile that had an even older add-on version installed. I went to the "about:addons" page and manually checked for updates. The Test Pilot add-on updated, and after that I was able to enable the new experiments almost instantly. So starting the old profile didn't trigger the automatic check for add-on updates.

I checked with a third profile with the old add-on installed to see if the check for updates is eventually triggered. It took more than 5 minutes until the browser updated the Test Pilot add-on by itself.

Maybe we should implement a self-check and update trigger for the add-on somehow? Just an idea; you guys know better whether something like this could be done.
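
To make the self-check idea a bit more concrete, here is a minimal sketch of the comparison such a check would need. It is written in Python purely for illustration (the add-on itself is not Python), and the function names and the minimum version are hypothetical. It assumes version strings start with a dotted numeric part, optionally followed by a suffix such as "-tag-...".

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Return the leading dotted-numeric part of a version string as a tuple.

    Anything after the numeric prefix (for example a "-tag-..." suffix)
    is ignored. This is an assumption about the version format.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version prefix in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def needs_update(installed: str, minimum: str) -> bool:
    """True if the installed add-on is older than the required minimum."""
    return parse_version(installed) < parse_version(minimum)
```

With a check like this, the site could prompt for an add-on update before offering the "Enable" button, instead of waiting for the browser's periodic update check.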

Here are the steps to reproduce this:

[Prerequisites]:

[Steps to reproduce]:

  1. Open Firefox with the profile from prerequisites.
  2. Click on Test Pilot icon in the browser toolbar.
  3. Choose one of the 3 new experiments.
  4. Click on "Enable " button and observe the page behavior.

[Expected results]:

[Actual results]:

[Notes]:

[Images: page shot install error; min vid install; old test pilot]

marco-c commented 7 years ago

> We have managed to reproduce this too, but it only happens if you have a Test Pilot version from before the new experiments launched. However, the new experiments are correctly installed under the hood; at first glance, the "Enabling..." state seems to be only a UI issue on the website. That's why refreshing the page or restarting the browser confirms afterwards that the experiments are installed.

It happened with the latest Test Pilot version for me, and the experiments weren't correctly installed (when I refreshed the page or restarted the browser, the experiments were not installed).

SoftVision-PaulOiegas commented 7 years ago

@marco-c Do you get an error at the top of the page when you try to enable them? I'm asking because your issue could also be related to firewall or antivirus configurations. In the past we had a few issues where users got an error when they tried to install the experiments, but after testing outside the firewalled network or adding permissions in the AV client, everything worked well.

Anyway, we are investigating this. It would also help if you could provide some browser console logs, as I did above, if you can still reproduce the issue.

marco-c commented 7 years ago

> @marco-c Do you get an error at the top of the page when you try to enable them?

There was no error at the top of the page.

> I'm asking because your issue could also be related to firewall or antivirus configurations. In the past we had a few issues where users got an error when they tried to install the experiments, but after testing outside the firewalled network or adding permissions in the AV client, everything worked well.

I'm on Linux, so no antivirus 😄

> Anyway, we are investigating this. It would also help if you could provide some browser console logs, as I did above, if you can still reproduce the issue.

I already provided the logs a few comments ago.

I'm able to install them now, after restarting the browser.

Samizdata commented 7 years ago

In my case restarting the browser with cleared cache and history did not fix the problem. I had previously uninstalled and updated Test Pilot, which allowed Page Shot to install, but I am unable to install any further experiments or reinstall the experiments I used previously.

SoftVision-CosminMuntean commented 7 years ago

Managed to reproduce it too on Test Pilot stage server. Here is a screen recording with the issue and the workaround: https://goo.gl/AMr8z4. Here are some console errors observed while trying to enable Min Vid: https://goo.gl/h5YEyV.

chuckharmston commented 7 years ago

[Image: screen shot 2016-09-29 at 9 42 21 am]

Keying in on this bit of @SoftVision-CosminMuntean's screenshot, I worked with @relud to sift through our access and error logs in nginx and our CDN logs. nginx has never served a 504, and there have only been 88 instances of proxy connection failures. Seems like this is a red herring, unfortunately.

ghost commented 7 years ago

Our current best theory is that upgrading the Test Pilot add-on and then attempting to install experiments fails. We don't know why, but we're expecting the problems to drop off rapidly as Firefox automatically upgrades the Test Pilot add-on.

Marking Needs UX to consider options for warning people here (eg. if enabling is taking too long). Also related #1335

bittin commented 7 years ago

I just noticed I have the same problem in Beta 3 of Firefox 50.

log here: https://pastebin.mozilla.org/8915411

Samizdata commented 7 years ago

And now I am back at the exact same problem. I uninstalled, updated, installed Page Shot, and now every other Test Pilot experiment gives me the "unable to install right now" error I had in the first place.

bittin commented 7 years ago

https://pastebin.mozilla.org/8915415

jaredhirsch commented 7 years ago

I was just discussing this issue with @bittin, who encountered it using the 9/12 version of the addon with FF 50 Beta, and it appears that only new experiments are affected by this bug:

Possibly related: uninstalling the 9/12 build of Test Pilot, restarting Firefox, then attempting to reinstall the Test Pilot addon from the Test Pilot website fails. (@bittin, please correct me if I've missed anything)

Samizdata commented 7 years ago

Currently, with Firefox 49.1 and current Test Pilot, that would be incorrect. After uninstalling and reinstalling to install Page Shot, I am currently unable to install ANY Experiments, including No More 404's (other than the Page Shot which was the genesis of the problem for me). Which is really frustrating since that was the most productive experiment for me to date.

jaredhirsch commented 7 years ago

@Samizdata: Thanks for the info! A few followup questions:

What version of Test Pilot do you have installed? (about:addons > Extensions tab at left > click 'More' link next to Test Pilot listing)

What if you uninstall and reinstall any experiment other than Page Shot? Can you then install a second experiment successfully?

Samizdata commented 7 years ago

Test Pilot 0.8.5-tag-2016-09-27.

I didn't try uninstalling/reinstalling anything else, because, frankly, I was afraid of breaking things worse. I need to head to work, but I will test it out when I get home tonight.

chuckharmston commented 7 years ago

It'd be very helpful if we could get an about:support dump from somebody who is experiencing this problem. To do that, go to about:support, then hit "Copy text to clipboard". Either paste that in a comment here, or if you'd prefer, an email to me (chuck@mozilla.com).

@bittin @Samizdata

Thanks!

Samizdata commented 7 years ago

Application Basics

Name: Firefox
Version: 49.0.1
Build ID: 20160922113459
Update Channel: release
User Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0
OS: Windows_NT 10.0
Multiprocess Windows: 0/1 (Disabled by add-ons)
Safe Mode: false

Crash Reports for the Last 3 Days

All Crash Reports

Extensions

Name: BugMeNot Plugin Version: 3.1-signed.1-signed Enabled: true ID: {987311C6-B504-4aa2-90BF-60CC49808D42}

Name: Free Memory Button Version: 1.1.2.1-signed Enabled: true ID: tb-free-memory-single@codefisher.org

Name: Ghostery Version: 7.0.1.4 Enabled: true ID: firefox@ghostery.com

Name: Google Image Search Version: 1.15.1-signed.1-signed Enabled: true ID: {73007fef-a6e0-47d3-b4e7-dfc116ed6f65}

Name: Greasefire2 Version: 2.1.1 Enabled: true ID: greasefire2@b0nk3rz.net

Name: Greasemonkey Version: 3.9 Enabled: true ID: {e4a8a97b-f2ed-450b-b12d-ee082ba24781}

Name: Greasy Scripts Version: 1.0.8 Enabled: true ID: greasyscripts@ede123

Name: LastPass Version: 3.3.1 Enabled: true ID: support@lastpass.com

Name: Multi-process staged rollout Version: 1.3 Enabled: true ID: e10srollout@mozilla.org

Name: OverbiteFF Version: 3.1.1695 Enabled: true ID: overbiteff@floodgap.com

Name: Pocket Version: 1.0.4 Enabled: true ID: firefox@getpocket.com

Name: QR Secret Decoder Ring Version: 0.7.0 Enabled: true ID: jid0-P1mVozMDmRBfUBi7YQaBXqktLCM@jetpack

Name: QuickWikiEditor Version: 0.8.3.1-signed.1-signed Enabled: true ID: jid0-hfviyWEDI6edV0BwIzEOoUMbcKI@jetpack

Name: Random Agent Spoofer Version: 0.9.5.6 Enabled: true ID: jid1-AVgCeF1zoVzMjA@jetpack

Name: Reddit Enhancement Suite Version: 5.0.2 Enabled: true ID: jid1-xUfzOsOFlzSOXg@jetpack

Name: Restart Version: 1.2.8 Enabled: true ID: Restart@schuzak.jp

Name: Restartless Restart Version: 9.1-signed.1-signed Enabled: true ID: restartless.restart@erikvold.com

Name: Resurrect Pages Version: 3 Enabled: true ID: {0c8fbd76-bdeb-4c52-9b24-d587ce7b9dc3}

Name: Search by Image for Google Version: 1.2.0.1-signed.1-signed Enabled: true ID: {ab4b5718-3998-4a2c-91ae-18a7c2db513e}

Name: Skype Version: 8.3.0.9150 Enabled: true ID: {82AF8DCA-6DE9-405D-BD5E-43525BDAD38A}

Name: SomethingAwful Last Read Enhancement Version: 4.1.1 Enabled: true ID: {fc6339b8-9581-4fc7-b824-dffcb091fcb7}

Name: Stylish Version: 2.0.7 Enabled: true ID: {46551EC9-40F0-4e47-8E18-8E5CF550CFB8}

Name: Tab Mix Plus Version: 0.5.0.0 Enabled: true ID: {dc572301-7619-498c-a57d-39143191b318}

Name: Test Pilot Version: 0.8.5-tag-2016-09-27 Enabled: true ID: @testpilot-addon

Name: TrackMeNot Version: 0.9.2 Enabled: true ID: trackmenot@mrl.nyu.edu

Name: uBlock Origin Version: 1.9.12 Enabled: true ID: uBlock0@raymondhill.net

Name: UnMHT Version: 8.2.0 Enabled: true ID: {f759ca51-3a91-4dd1-ae78-9db5eee9ebf0}

Name: Web Compat Version: 1.0 Enabled: true ID: webcompat@mozilla.org

Name: YouTube Center Version: 2.1.0.1-signed.1-signed Enabled: true ID: jid1-cwbvBTE216jjpg@jetpack

Name: Zoho Vault Version: 2.2 Enabled: true ID: jid1-jAUWJuglb1SqUQ@jetpack

Name: Avast Online Security Version: 12.0.88 Enabled: false ID: wrc@avast.com

Name: Avast SafePrice Version: 10.3.5.39 Enabled: false ID: sp@avast.com

Name: MLKSHK Image Picker Version: 1.4 Enabled: false ID: hello@mlkshk.com

Name: VideoGet FireFox extension Version: 2.1 Enabled: false ID: {85E85FF9-E50C-42DE-8A3D-61485FD6C8DB}

Graphics

Features Compositing: Direct3D 11 Asynchronous Pan/Zoom: none WebGL Renderer: Google Inc. -- ANGLE (NVIDIA GeForce GTX 760 Direct3D11 vs_5_0 ps_5_0) Hardware H264 Decoding: No; DXVA2D3D11 crashes detected in the past; DXVA2D3D9 crashes detected in the past Direct2D: true DirectWrite: true (10.0.10586.589) GPU #1 Active: Yes Description: NVIDIA GeForce GTX 760 Vendor ID: 0x10de Device ID: 0x1187 Driver Version: 21.21.13.7270 Driver Date: 8-25-2016 Drivers: nvd3dumx,nvwgf2umx,nvwgf2umx,nvwgf2umx nvd3dum,nvwgf2um,nvwgf2um,nvwgf2um Subsys ID: 27653842 RAM: 2048

Diagnostics AzureCanvasAccelerated: 0 AzureCanvasBackend: direct2d 1.1 AzureContentBackend: direct2d 1.1 AzureFallbackCanvasBackend: cairo failures: [GFX1-]: DXVA2D3D11 video decoding is disabled due to a previous crash.

Crash Guard Disabled Features D3D9 Video Decoder: Reset on Next Restart D3D11 Video Decoder: Reset on Next Restart

Failure Log (#0) Error: DXVA2D3D11 video decoding is disabled due to a previous crash. (#1) Error: DXVA2D3D9 video decoding is disabled due to a previous crash. (#2) Error: DXVA2D3D11 video decoding is disabled due to a previous crash. (#3) Error: DXVA2D3D9 video decoding is disabled due to a previous crash.

Important Modified Preferences

accessibility.lastLoadDate: 1475544011
accessibility.loadedInLastSession: true
accessibility.typeaheadfind.flashBar: 0
browser.cache.disk.capacity: 358400
browser.cache.disk.filesystem_reported: 1
browser.cache.disk.smart_size.first_run: false
browser.cache.disk.smart_size.use_old_max: false
browser.cache.frecency_experiment: 4
browser.download.importedFromSqlite: true
browser.download.manager.alertOnEXEOpen: true
browser.places.smartBookmarksVersion: 8
browser.sessionstore.upgradeBackup.latestBuildID: 20160922113459
browser.startup.homepage_override.buildID: 20160922113459
browser.startup.homepage_override.mstone: 49.0.1
browser.tabs.remote.autostart.2: true
browser.urlbar.daysBeforeHidingSuggestionsPrompt: 3
browser.urlbar.lastSuggestionsPromptDate: 20160819
browser.urlbar.suggest.searches: true
dom.apps.lastUpdate.buildID: 20160922113459
dom.apps.lastUpdate.mstone: 49.0.1
dom.apps.reset-permissions: true
dom.max_script_run_time: 0
dom.push.userAgentID: a82722792f294b5f92f8b4ce8a27739a
extensions.lastAppVersion: 49.0.1
font.internaluseonly.changed: false
gfx.crash-guard.d3d11layers.appVersion: 49.0.1
gfx.crash-guard.d3d11layers.deviceID: 0x1187
gfx.crash-guard.d3d11layers.driverVersion: 21.21.13.7270
gfx.crash-guard.d3d11layers.feature-d2d: true
gfx.crash-guard.d3d11layers.feature-d3d11: true
gfx.crash-guard.status.: 2
gfx.crash-guard.status.d3d11layers: 2
gfx.crash-guard.status.d3d11video: 3
gfx.crash-guard.status.d3d9video: 3
media.benchmark.vp9.fps: 113
media.benchmark.vp9.versioncheck: 1
media.gmp-eme-adobe.abi: x86-msvc-x64
media.gmp-eme-adobe.lastUpdate: 1471657227
media.gmp-eme-adobe.version: 17
media.gmp-gmpopenh264.abi: x86-msvc-x64
media.gmp-gmpopenh264.lastUpdate: 1471657232
media.gmp-gmpopenh264.version: 1.6
media.gmp-manager.buildID: 20160922113459
media.gmp-manager.lastCheck: 1475520723
media.gmp-widevinecdm.abi: x86-msvc-x64
media.gmp-widevinecdm.lastUpdate: 1471657253
media.gmp-widevinecdm.version: 1.4.8.866
media.gmp.storage.version.observed: 1
media.hardware-video-decoding.failed: false
media.peerconnection.ice.default_address_only: true
media.webrtc.debug.log_file: C:\Users\migra\AppData\Local\Temp\WebRTC.log
network.cookie.prefsMigrated: true
network.dns.disablePrefetch: true
network.http.speculative-parallel-limit: 0
network.predictor.cleaned-up: true
network.prefetch-next: false
places.database.lastMaintenance: 1475367371
places.history.expiration.transient_current_max_pages: 104858
plugin.disable_full_page_plugin_for_types: application/pdf
plugin.importedState: true
services.sync.declinedEngines:
services.sync.engine.greasemonkey: true
services.sync.engine.prefs: false
services.sync.engine.prefs.modified: false
services.sync.lastPing: 1475451227
services.sync.lastSync: Mon Oct 03 2016 15:54:02 GMT-0500 (Central Standard Time)
services.sync.numClients: 2
storage.vacuum.last.index: 1
storage.vacuum.last.places.sqlite: 1474277984
ui.osk.debug.keyboardDisplayReason: IKPOS: Touch screen not found.

Important Locked Preferences

Places Database

JavaScript

Incremental GC: true

Accessibility

Activated: true Prevent Accessibility: 0

Library Versions

NSPR Expected minimum version: 4.12 Version in use: 4.12

NSS Expected minimum version: 3.25 Version in use: 3.25

NSSSMIME Expected minimum version: 3.25 Version in use: 3.25

NSSSSL Expected minimum version: 3.25 Version in use: 3.25

NSSUTIL Expected minimum version: 3.25 Version in use: 3.25

Experimental Features

Is that okay? And, yeah, my hypothesis was right. Disabled Page Shot and it will not enable. Same as before.

https://i.imgur.com/YzguGct.png

Samizdata commented 7 years ago

And, apparently now, we are right back where we were. Reinstalled Test Pilot, reinstalled Page Shot, and can NOT install any other experiments.

Update. Can not reinstall No More 404's, but Min Vid DID install. Also, no go on Universal Search.

jaredhirsch commented 7 years ago

@Samizdata Could you try again in a fresh profile (or with non-Test Pilot add-ons disabled)? I wonder if one of your other add-ons is interfering with the download process.

Samizdata commented 7 years ago

Let me back up my profile and see what I can do.

Samizdata commented 7 years ago

Okay, profile moved. Started with clean profile. Installed Test Pilot. No go on No More 404s. So it looks like it is not another addon.

SoftVision-CosminMuntean commented 7 years ago

@Samizdata do you have any anti-virus or firewalls installed? Maybe your issue could be also related to firewall or antivirus configurations. We had a few issues where users could not install the experiments until they disabled the antivirus.

johngruen commented 7 years ago

maybe related: https://github.com/mozilla/testpilot/issues/660

Samizdata commented 7 years ago

@SoftVision-CosminMuntean I will try that tonight when I get home from work, but that's not a very good solution. Test Pilot is the only case where I have had issues installing add-ons. Maybe provide them through the standard add-on infrastructure?

Samizdata commented 7 years ago

@SoftVision-CosminMuntean Okay, disabled all shields in Avast Premiere and the plugins DID install with no error. So, all good.

chuckharmston commented 7 years ago

Thanks for the help, @Samizdata! This is the first time we've gotten confirmation that antivirus software is actually to blame here. Could I get you to share your configuration with me, either here or privately (chuck@mozilla.com)? I'd like to narrow down specifically which aspects of the suite are causing the problems.

chuckharmston commented 7 years ago

I've managed to reproduce this, and believe I have the issue: these internet "security" suites are installing their own root certificate and issuer, so that they can monitor encrypted traffic. Firefox (very intelligently) blocks all add-on installs from sources whose certificate issuer is not built-in. I've observed this happening to us with Avast installed:

[Image: screen shot 2016-10-06 at 8 52 55 am]

I was able to reproduce this with or without the Avast add-on enabled, and with or without any analytics tracking installed, so long as the MITMing behavior is happening at a system level. I was not able to find any test case inconsistent with this.

You can tell if you're subject to this by looking at the certificate issuer on the Test Pilot site.

Expected:

[Image: screen shot 2016-10-06 at 9 20 47 am]

Problematic:

[Image: screen shot 2016-10-06 at 9 20 28 am]

(Or anything else that isn't DigiCert Inc)
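
For anyone who prefers to check this from a script rather than the browser UI, here is a rough Python sketch of the same idea: connect to the site, read the certificate issuer, and compare it to the expected organization. The function names are hypothetical, and there is an important caveat: Python validates against the operating system's or OpenSSL's trust store, not Firefox's built-in roots, so this only approximates what the browser sees. If the intercepting root is not trusted by that store, the connection raises a certificate verification error instead, which is itself a hint.

```python
import socket
import ssl
from typing import Optional

# Issuer organization named in the comment above.
EXPECTED_ISSUER_ORG = "DigiCert Inc"


def issuer_organization(cert: dict) -> Optional[str]:
    """Extract organizationName from the 'issuer' field of getpeercert()."""
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None


def fetch_issuer_organization(host: str, port: int = 443,
                              timeout: float = 10.0) -> Optional[str]:
    """Connect over TLS and return the issuer organization of the peer cert."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return issuer_organization(tls.getpeercert())


def looks_intercepted(issuer_org: Optional[str]) -> bool:
    """True if the issuer is anything other than the expected organization."""
    return issuer_org != EXPECTED_ISSUER_ORG
```

Calling `looks_intercepted(fetch_issuer_organization("testpilot.firefox.com"))` on an affected machine would be expected to return True, subject to the trust-store caveat above.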

You can disable this behavior by creating and setting a boolean preference extensions.install.requireBuiltInCerts to false, but this is unsafe, and we should not make this recommendation for users. It is helpful for testing and reproduction, though.

Samizdata commented 7 years ago

Confirming the MITM behavior independently.

chuckharmston commented 7 years ago

Recommendations:

  1. We attempt to detect the presence of the Firefox extensions installed by problematic suites. If any of them are installed, we show the user a message saying that the site may not work for them. Since this problem happens at a system level, rather than at the browser level, this is not a foolproof solution.
  2. We investigate ways to tell if the certificate issuer is built-in. Our site is likely not able to do this, but we may be able to detect this from the add-on. This would help with experiment installation failures, but not with the Test Pilot installation failing.
  3. We follow through with my recommendation in #1335 to add a timeout to the install process. When doing so, we can list this as a possible reason for the failure.
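
As a rough illustration of recommendation 3, the sketch below wraps an install step in a timeout and returns a message that lists certificate interception as a possible cause. It is Python for illustration only, not the site's or add-on's actual code; `enable_with_timeout`, the message text, and the timeout value passed in are all hypothetical.

```python
import asyncio
from typing import Awaitable

# Hypothetical user-facing message for the timeout case.
TIMEOUT_MESSAGE = (
    "Enabling this experiment is taking longer than expected. "
    "Security software that inspects encrypted traffic can block "
    "add-on installs; try restarting the browser or checking for "
    "add-on updates."
)


async def enable_with_timeout(install: Awaitable[None],
                              timeout_seconds: float) -> str:
    """Await the install step, giving up after timeout_seconds."""
    try:
        await asyncio.wait_for(install, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # The install never reported back; surface a helpful message
        # instead of leaving the UI stuck in the "Enabling..." state.
        return TIMEOUT_MESSAGE
    return "enabled"
```

The point of the design is that the page always leaves the "Enabling..." state, either with success or with a message the user can act on.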

chuckharmston commented 7 years ago

Filed #1537 with a fix for this. I don't think it should preclude us from working on recommendations 1 or 3 above.

rabimba commented 7 years ago

I am still able to replicate this issue, but only for Tracking Protection, and I do not have any security suite installed. Also, this is what I get: [image]

Though I am actually behind an educational internet backbone. So don't know if anything on that network can cause this.

Menimue commented 7 years ago

I have managed to install 'Tracking Protection'. It wouldn't install at all before; I tried many times over a period of days. I have just tried again and it installed with no problem. I haven't changed any settings on my PC or made any adjustments to antivirus permissions. It just worked. So I'm a happy bunny now. Thank you to everyone for the help.

SoftVision-PaulOiegas commented 7 years ago

@rabimba If you have the latest Test Pilot version installed, don't have any antivirus client and you still cannot install the experiments, then the educational network could be the problem.

@Menimue The dev team found out what was causing this and we managed to provide a fix for the issue. I assume that's why you have been able to install it now. Enjoy the experiments and feel free to provide any feedback for the tested ones so we could improve them.

Dainius14 commented 7 years ago

@chuckharmston Your suggestion didn't work, as there's no such option (extensions.install.requireBuiltInCerts) in my Firefox 49.0.1. Turning off SSL/TLS protocol filtering in ESET AV did the trick.

chuckharmston commented 7 years ago

@Dainius14 That preference doesn't exist by default; you need to create it in order for it to exist.

That said, I do not recommend setting that preference to get around this issue; it makes your browser notably less safe. It's only helpful for debugging the problem.

johngruen commented 7 years ago

Let's talk about this in the meeting.

SoftVision-PaulOiegas commented 7 years ago

From a QA perspective, it seems that the add-on may have some backwards-compatibility issues. After the last 2-3 production releases, users (mostly end users) encountered problems enabling experiments. In most cases the problem was fixed by manually checking for add-on updates or reinstalling the add-on. This means that when a new version launches, users are unable to enable experiments from the old version until they update to the latest. Combined with the time Firefox takes before it automatically checks for add-on updates, this has lately been causing this kind of issue.

I can dig up several issues from the other experiments related to this if needed, but it will take some time. What I have within reach are the next two:

johngruen commented 7 years ago

Best we can do is message on a timeout. Most of this bug happens upstream.