mozilla / persona

Persona is a secure, distributed, and easy to use identification system.
https://login.persona.org

IE9: mysterious java error causing tests to fail & hang for 10+ minutes #2768

Closed jaredhirsch closed 11 years ago

jaredhirsch commented 11 years ago

3/5 recent failed runs were caused by this mysterious java crash.

Some recent IE9 jobs that failed due to the nefarious java plugin crash:

11/26 19:20 - #1958

11/26 19:04 - #1956

11/26 15:36 - #1955

jaredhirsch commented 11 years ago

Spoke with @santiycr in #saucelabs on freenode. He's opening a support ticket on their side & will update me by email. I'll keep this ticket up to date.

_6a68 hey sauce. I'm seeing java SE plugin crashes on a lot of IE9 tests lately. anybody heard of such a thing?
10:55 _6a68 take a look at this test: https://saucelabs.com/tests/8769ff0e8fbf420ba62a409ef61dd48b#
10:56 _6a68 if you look at the video, the test runs fine up to about 50 seconds in, then the crash dialog pops up ("java platform SE binary has stopped working"), then the test hangs for another 11 minutes :-\
10:57 _6a68 I've seen this several other times in the last day; obviously it's a bummer to see tests crashing due to infrastructure failure
10:58 santiycr sorry to hear that, _6a68
11:23 santiycr we'll investigate
11:23 santiycr what's your email address so we can follow up?
11:23 _6a68 santiycr: hey!
11:25 _6a68 santiycr: if it helps, I can send you links to a couple other ie9 tests that failed in the same mysterious way
11:27 _6a68 santiycr: I'm 6a68@mozilla.com
11:27 _6a68 santiycr: here's another IE9 java crash https://saucelabs.com/tests/988c37ec53a24b67bf1211825f92fa49#
11:31 _6a68 santiycr: and another: https://saucelabs.com/tests/a9798be908b842b3992406a6c417f7bd#
11:32 santiycr thanks!
11:32 _6a68 santiycr: I'm tracking the issue over here, you're welcome to comment in the ticket or email me (whichever is fine)
11:32 santiycr I'll create a support ticket, it will help track this better
11:32 _6a68 santiycr: https://github.com/mozilla/browserid/issues/2768
11:32 _6a68 awesome! I'd love the URL to the support ticket if that exists
11:33 santiycr fo sho
11:33 santiycr I'll take the rest by email, to make sure we keep track of the investigation and questions
11:34 _6a68 awesome

jaredhirsch commented 11 years ago

Dunno if this is public or not, but I opened a sauce help desk ticket:

http://support.saucelabs.com/requests/3536

santiycr commented 11 years ago

I've been having issues with Zendesk, so I was unable to create the ticket yesterday. While I deal with that myself, I'll send you the response to your ticket right here:


Thanks for creating the ticket here, Jared. This is the right place for communicating about it.

Regarding the issue, is there any particular change in your tests or application that could have caused this? Did this problem start at some particular point in time, or has it been happening since you started testing on IE9?

While we proceed with the investigation, let me suggest two desired capabilities that might help us understand this and hopefully address it:
Set the "selenium-version" capability to "2.26.0".
Set the "avoid-proxy" capability to true.

The first one will provide your tests with the latest Selenium version available (instead of 2.18.0, which we provide by default on IE9). The second one tells our service to avoid proxying our requests; this disables support for invalid HTTPS certificates, but addresses a performance issue caused by the HTTP proxy in Selenium versions over 2.18.0.

For more details:
http://saucelabs.com/docs/ondemand/additional-config#selenium-version
http://saucelabs.com/docs/ondemand/additional-config#avoid-proxy
https://code.google.com/p/selenium/issues/detail?id=3498

Let us know how it goes.

Best, Santi
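
For reference, the two capabilities suggested above could be set roughly as follows. This is a minimal sketch, not the project's actual test configuration: the capability names "selenium-version" and "avoid-proxy" and the value "2.26.0" come from this thread, while the function name `build_ie9_capabilities` and the browser/platform entries are assumptions made for illustration.

```python
import json


def build_ie9_capabilities(selenium_version="2.26.0", avoid_proxy=True):
    """Build a desired-capabilities dict for an IE9 job.

    "selenium-version" and "avoid-proxy" are the capability names given in
    this thread. The browserName/version/platform values are assumptions
    (the thread only says the IE9 jobs run on Windows 2008 VMs).
    """
    return {
        "browserName": "internet explorer",  # assumed browser identifier
        "version": "9",                      # assumed version string for IE9
        "platform": "Windows 2008",          # assumed platform label
        # Pin the Selenium server version instead of taking the IE9 default.
        "selenium-version": selenium_version,
        # Skip the HTTP proxy; this gives up support for invalid HTTPS certs.
        "avoid-proxy": avoid_proxy,
    }


if __name__ == "__main__":
    # Print the capabilities as JSON so they can be inspected or pasted
    # into whatever client actually opens the remote session.
    print(json.dumps(build_ie9_capabilities(), indent=2))
```

Keeping the version as a function argument means the pin can be changed in one place if the IE9 default is upgraded later.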

jaredhirsch commented 11 years ago

Response from help desk:

Alex Glowaski (Sauce Labs Help Desk)

Nov 28 11:54 am (PST)

Hi Jared,

Thanks for reporting this. We're working to resolve this now, and the next step is to verify the version of Java in use on our machines. This may be resolved by updating Java.

If this is the case, we'll need to rebuild the image from which the Windows 2008 VMs are launched, and I'll work on that today. I will let you know as soon as we have an update.

Best, Alex

jaredhirsch commented 11 years ago

@santiycr looks like hard-coding the selenium server version does fix it, but I'm not totally sure: I'm having trouble reproducing the problem on the old selenium server version this evening.

I'll give it another shot in the morning and follow up with Alex to see if the VMs wound up getting an updated version of Java installed on them (maybe that's why I can't repro the errors tonight)

jaredhirsch commented 11 years ago

@santiycr hard-coding the selenium version does indeed fix it. Looks like I forgot to mention that we've always had avoid-proxy set in our tests, so that's definitely not a factor with this bug.

Here are some runs from this afternoon using the default selenium version, where we see the java error and the test hanging for 10 minutes or more:

https://saucelabs.com/tests/f9ec6fa8f6984be680f2455af3fe2226#
https://saucelabs.com/tests/a113e0a0a4b24e51b4fe7d8c6a28ba6b#
https://saucelabs.com/tests/8afc0a0becc643ae9cc48a8d0a12f485#
https://saucelabs.com/tests/a9702b0959f5473fa3a2fa53a79ad3fa#

I haven't been able to generate these same errors with selenium-version set to 2.26.0.

I'm not thrilled about having to hard-code the server version like this. What's the probable timeline for this version to become the IE9 default? At least I can set a reminder to toggle this config setting down the road.

We can troubleshoot further if it seems worthwhile to you; I'm happy that the error is gone. I'm going to run the tests a few more times using the hard-coded server setting and see if I can repro the java error, but we're looking really good right now.

santiycr commented 11 years ago

Happy to hear that! We don't have an ETA on when we'll be able to move IE browsers to the most recent Selenium version, as that is blocked on a Selenium bug, not a Sauce one. I'm planning to start investigating it and hopefully address it, but again I can't make estimates I feel comfortable with yet.

When this happens and we're ready for a server-wide upgrade, we'll let you (and the rest of our user-base) know at least 2 weeks in advance.

Don't hesitate to let us know if this issue comes up again.

Best, Santi

jaredhirsch commented 11 years ago

@santiycr I ran a bunch more tests [1] and didn't see any java errors. Thanks much!

[1] https://gist.github.com/4173166

santiycr commented 11 years ago

\o/

shane-tomlinson commented 11 years ago

Thanks @santiycr!

@6a68 - I feel in this case it is acceptable for us to hard-code the version string. All of our npm dependencies are tied to a particular version so that we can be sure that new bugs are not introduced whenever dependencies are updated.

Last week we saw a problem with vows when its version was set to "*": vows updated, but the update introduced a bug where files whose names started with "c" could not be tested.
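
The pinning policy described above can be checked mechanically. The following is a minimal sketch under stated assumptions: the helper name `find_unpinned` and the sample dependency map are hypothetical (only vows and its "*" spec come from this thread), and "pinned" is simplified to mean a plain `major.minor.patch` string, so pre-release tags would be flagged too.

```python
import re

# An exact version such as "1.2.3"; anything else ("*", ">=1.0.0", "~1.2")
# is treated as unpinned. This is a simplification for illustration.
EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+$")


def find_unpinned(dependencies):
    """Return the sorted names of dependencies whose spec is not exact."""
    return sorted(
        name
        for name, spec in dependencies.items()
        if not EXACT_VERSION.match(spec)
    )


if __name__ == "__main__":
    # Hypothetical dependency map shaped like the "dependencies" section
    # of a package.json; the names other than vows are made up.
    sample = {
        "vows": "*",
        "some-lib": "2.5.0",
        "other-lib": ">=0.4.0",
    }
    print(find_unpinned(sample))
```

Running a check like this in CI would flag a "*" spec before an upstream release can change test behaviour unexpectedly.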