johanneswuerbach / ember-cli-sauce

Cross browser testing for your ember-cli app using SauceLabs
MIT License
40 stars 14 forks source link

Sauceconnect fails to transmit "all-test-results" event on polling transport #102

Open makepanic opened 7 years ago

makepanic commented 7 years ago

Hi, I'm trying to figure out why our tests won't complete in Safari 10. We're getting the usual Browser disconnected error. This error happens if browser_disconnect_timeout is reached.

While looking into the cause, i found out that Testem._isIframeReady never becomes true. I've added a simple check to see if _isIframeReady ever becomes true:

  setInterval(function(){
    var header =  document.querySelector('#qunit-header');
    if (header) {
      if (Testem._isIframeReady) {
        document.querySelector('#qunit-header').innerText = Date.now() + ' > iframe is ready :D';
      } else {
        document.querySelector('#qunit-header').innerText = Date.now() + ' > iframe is not ready';
      }
    }
  }, 5000);

As it turns out, it doesn't change on saucelabs Safari 10.

20170925-160903

Trying the same thing on an ember t --server with a local Safari 10, it'll always connect.

Another weird thing is that loading connection.html usually takes very long (30s+). I'm not done investigating but it looks like Safari has issues loading the socket/testem connection via sauceconnect.

Sometimes the iframe is ready but the socketio polling fallback returns a 500 for some POST requests.

20170925-200936

In general it looks like there's an issue when using the polling socketio transport. It disconnects and connects again. I guess it can't send the events to the server so it never gets notified that all tests were successful.

20170925-210911

It also looks like it sends the correct all-test-results event to the iframe. I guess the testem-client isn't at fault.

20170926-120915

I'll investigate further and report back if i find the cause. If anyone has any hints, I'd appreciate.

makepanic commented 7 years ago

As it turns out, the 500 disappears if i artificially reduce the test results size in the qunit_adapter before emitting the event.

With minimizing the payload to:

    results.tests = results.tests.slice(0, 10);
    results.total = 10;
    results.passed = 10;
    results.skipped = 0;

I get Safari 10 to constantly complete my tests

johanneswuerbach commented 7 years ago

Nice catch, I guess this logic should become smarter and not retransmit things already transmitted here https://github.com/testem/testem/blob/master/public/testem/qunit_adapter.js#L108 and assemble the final state on the server.

makepanic commented 7 years ago

I can also confirm that it now works on IE 11. My theory is that saucelabs connect somehow fails if we send too much data over the POST that gets created for the socketio polling transport fallback.

makepanic commented 7 years ago

@johanneswuerbach does testem need the data in the all-test-results event? I've found a listener for it in the following places:

Both of them ignore the functions argument. Can we maybe remove emitting the results or make it configurable?

johanneswuerbach commented 7 years ago

Sounds correct. I think the those results can be removed.

makepanic commented 7 years ago

I think there's still the issue that a single test-result can get too long.

The completing test above was a subset of our tests. If i run it on the whole test suite, it'll return 500 again sometimes. 🤔 Our longest test-results contains many assertions that get added to the payload as items.

One could transform the test-results to a more compact format like:

{"passed":true,"message":"mime: application/msword"}
-> 
[true, 'mime: application/msword']

But this will still break if one has many test-result items. Another way would be to gzip in the client which has the same issue for many test-result items :(

btw i've also contacted saucelabs and they're looking into the issue.

One possible solution, mentioned by the saucelabs support, is to add -B all when starting the sauceconnect connection. Sadly that didn't solve it for me.

makepanic commented 7 years ago

I had some time to investigate it further. After some exchange with the saucelabs team, I've updated to sauceconnect 4.4.9 (most recent version). Sadly this didn't solve the issue.

I've created a raw XMLHttpRequest loop that does a POST with an increasing request body length.

20170929-100907

It fails at exactly 100k chars.

The code looks like this:

  function request(len, continueFn) {
    var url = 'http://localhost:7000/sacket.io/?EIO=3&transport=polling&t=LxCZ6Jj';
    var req = new XMLHttpRequest();
    var payload = new Array(len).join(',');

    console.error(payload.length + ' POST');
    req.open('POST', url, true);
    req.onreadystatechange = function() {
      if (continueFn(req, payload.length)) {
        request(len + 1, continueFn);
      }
    }

    req.send(payload);
  }

  request(99950, function(req, payloadLength){
    if (req.status === 500) {
      console.error(payloadLength + ' went 500, stopping requests');
    }
    return req.readyState === 4 && req.status !== 500;
  });

Just got around checking Safari 10 and 11 on real hardware that connected to the testem server remotely. Both of them had no issue sending a large amount of payloads.

Weirdly enough it fails on Saucelabs Safari 10 and Edge 15 but not latest Chrome or Firefox.

I've also checked if it's maybe an issue with the testem server so i created a hapi server that reports the incoming request length:

const Hapi = require('hapi');

const server = new Hapi.Server();
server.connection({ port: 9999, host: '0.0.0.0', routes: {cors: true} });

server.route({
    method: 'POST',
    path: '/',
    handler: function (request, reply) {
        console.log('handled request', request.payload.length);
        reply('Hello, world!');
    }
});

server.start((err) => {
    if (err) {
        throw err;
    }
    console.log(`Server running at: ${server.info.uri}`);
});

Running it with the same config i receive the following logs on the server:

handled request 99995
handled request 99996
handled request 99997
handled request 99998
handled request 99999

Nothing appears after the 99999 payload length. This means the server didn't receive the request which points to a possible transport issue that prevents the request to reach the server.

makepanic commented 7 years ago

Just a small update:

After writing back and forth with the sauce support, we haven't found the issue yet. My current best guess is that it's something related to their proxy server.

Some things we've tried:

I've reduced the issue down to a couple curl commands which work fine locally but create a 500 on their side. See the screenshot for a failing request. Note the Server response header:

20171009-101039

johanneswuerbach commented 7 years ago

Thanks @makepanic for the update ❤️

Serabe commented 7 years ago

Hi, @makepanic,

do you have any update on this? Any way we can help you?

Thanks!

makepanic commented 7 years ago

The only update i can give is that they moved the issue from general support to the development team. Sadly the only update we've got after that was "We're working on it"

You could try to reproduce the issue by running https://gist.github.com/makepanic/dc0357e7c3e8220fbccd52eed53f962e over the sauceconnect tunnel.

You have to stop execution and hijack the running browser session (I think the breakpoint button does that). Download the script and run it. If it's reproducable outside of our connection then you'll receive a 500. Maybe we can poke them again if more (paying) customers can reproduce the issue.

makepanic commented 7 years ago

After poking them again, I received the following answer:

No news on this issue. It's in the backlog, and is currently a low priority issue. Is there a specific use case which you need to test which is blocked by this issue? You mentioned that this only affects older browsers.

simonihmig commented 7 years ago

You mentioned that this only affects older browsers.

OMG, really? If we were happy testing only on latest Chrome, what would we need Saucelabs for!? 🙄

makepanic commented 7 years ago

They've gave an update after i've explained the issue again and collected some related issues from github.

I've passed your comments and this information on to the team who will be working on this issue and I'll let you know of any updates and progress from our end.

I guess now it's time to wait until someone on their side picks the issue up :(

James1x0 commented 6 years ago

@makepanic Any update on your ticket almost a year later? We surpassed a certain number of tests and are bitten by this one. Anyone figure out how to fix this?

makepanic commented 6 years ago

I'll ask the person, that had contact with the saucelabs support, if there was any additional feedback from their side.

We've moved away from saucelabs as this issue was a showstopper for us.

If it's really the issue as mentioned in https://github.com/johanneswuerbach/ember-cli-sauce/issues/102#issuecomment-332920626 then the only solution would be to chunk test results to prevent the payload exceeding 100k chars.

James1x0 commented 6 years ago

@makepanic Thanks for the reply. Yeah, it's going to be the same for us. I've got a ticket open, but we may think about migrating away. Chunking tests is a workaround that shouldn't be required for how expensive SL is. 😞

makepanic commented 6 years ago

Just got a response from the person that was in contact with the saucelabs support. He didn't receive anything worthwhile and mostly useless "updates".

James1x0 commented 6 years ago

Update from sauce below. We have started migrating to browserstack. Similar api, no minutes bank 🎉. d55264f7-7825-4c8a-8e8a-56f64caadf6c

makepanic commented 6 years ago

I like how they mention that the issue is with ember-cli-sauce, ignoring that we gave them a reproduction that causes the server error by using a simple curl :man_facepalming:

btw I'm not sure if we can close this issue or not. I will probably not receive any updates as we don't use saucelabs anymore.

johanneswuerbach commented 6 years ago

I‘m happy to leave it open even if its just for posterity and to raise awareness. I also moved away from SauceLabs, but maybe someone is taking this over or this is finally resolved by them.

Did anyone check the new beta backend, is that better?

makepanic commented 5 years ago

Good news, everyone!

We received a mail from saucelabs:

screenshot_20190108_171214

I can't confirm if it's really fixed, but anyone still on saucelabs can check if it's actually working so we can close this issue.

I may be able to check it myself in a couple days/weeks.

James1x0 commented 5 years ago

Well, too late they lost us to browserstack 🤷‍♂️