web-platform-tests / wpt

Test suites for Web platform specs — including WHATWG, W3C, and others
https://web-platform-tests.org/

Some wpt integration tests being skipped because port in use #9546

Open gsnedders opened 6 years ago

gsnedders commented 6 years ago

From https://github.com/w3c/web-platform-tests/pull/9480#issuecomment-365787562.

The master branch, in commit 040654c78f3f021512a1f22b9b3b99925bd52d37 (from #8614), has ./wpt run chrome totally broken (it always throws KeyError). This shouldn't have managed to land, because we have a test, test_run_chrome in tools/wpt/tests/test_wpt.py that should've caught this.

This issue has two sides to it:

vinicius0197 commented 6 years ago

I've tried running test_run_chrome with pytest -k chrome wpt/tests/test_wpt.py, first with the commit 040654c78f3f021512a1f22b9b3b99925bd52d37 checked out. That gave me the following error:

================================================================================= FAILURES =================================================================================
_____________________________________________________________________________ test_run_chrome ______________________________________________________________________________

manifest_dir = '/tmp/tmpC547g1'

    @pytest.mark.slow
    @pytest.mark.xfail(sys.platform == "win32",
                       reason="Tests currently don't work on Windows for path reasons")
    def test_run_chrome(manifest_dir):
        if is_port_8000_in_use():
            pytest.skip("port 8000 already in use")

        with pytest.raises(SystemExit) as excinfo:
            wpt.main(argv=["run", "--yes", "--no-pause", "--binary-arg", "headless",
                           "--metadata", manifest_dir,
>                          "chrome", "/dom/nodes/Element-tagName.html"])

wpt/tests/test_wpt.py:104: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
wpt/wpt.py:132: in main
    rv = script(*args, **kwargs)
wpt/run.py:419: in run
    rv = run_single(venv, **kwargs) > 0
wpt/run.py:426: in run_single
    return wptrunner.start(**kwargs)
wptrunner/wptrunner/wptrunner.py:295: in start
    return not run_tests(**kwargs)
wptrunner/wptrunner/wptrunner.py:185: in run_tests
    env_extras) as test_environment:
wptrunner/wptrunner/environment.py:96: in __enter__
    self.config = self.load_config()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <wptrunner.environment.TestEnvironment object at 0x7f5a19d9fad0>

    def load_config(self):
        default_config_path = os.path.join(serve_path(self.test_paths), "config.default.json")
        local_config_path = os.path.join(here, "config.json")

        with open(default_config_path) as f:
            default_config = json.load(f)

        with open(local_config_path) as f:
            data = f.read()
>           local_config = json.loads(data % self.options)
E           KeyError: 'host'

wptrunner/wptrunner/environment.py:132: KeyError
--------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------
Downloading chromedriver
 0:03.14 INFO Updating test manifest /tmp/tmpC547g1/MANIFEST.json
 0:03.15 INFO STDOUT: INFO:manifest:Skipping manifest download because existing file is recent
 0:07.14 INFO STDOUT: INFO:manifest:Updating manifest
 0:14.27 INFO STDOUT: DEBUG:manifest:Opening manifest at /tmp/tmpC547g1/MANIFEST.json
 0:18.69 INFO Using 1 client processes
 0:18.74 INFO Closing logging queue
 0:18.74 INFO queue closed
============================================================================= warnings summary =============================================================================
None
  pytest-catchlog plugin has been merged into the core, please remove it from your requirements.

-- Docs: http://doc.pytest.org/en/latest/warnings.html
============================================================================ 9 tests deselected ============================================================================
====================================================== 1 failed, 1 passed, 9 deselected, 1 warnings in 32.21 seconds =======================================================
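For reference, the KeyError at the bottom of that traceback comes from the %-style substitution in load_config(): config.json is a template whose placeholders are filled from self.options, and after #8614 that mapping evidently no longer carries a host key. A minimal sketch of the mechanism, with a made-up template string:

    # Hypothetical illustration of the failure mode in load_config() above;
    # only the mechanism matches the traceback, the template text is invented.
    import json

    options = {"ports": '{"http": [8000]}'}  # assume the "host" key is missing
    template = '{"browser_host": "%(host)s", "ports": %(ports)s}'

    try:
        config = json.loads(template % options)  # %-substitution runs before json.loads
    except KeyError as e:
        print("KeyError:", e)  # KeyError: 'host'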

Tried again with another commit 28ceecaaa6f0ff2cae0c586c54a923a1246393c3 checked out and received no error messages.

Running the test with pytest wpt/tests/test_wpt.py I also received no error messages, but test_run_chrome was skipped with the message: port 8000 already in use. So the problem is probably with the way tests get skipped when there's a server already running on that port.

I will try to work on a solution for this bug. @jgraham suggested adding a loop in the check with a sleep between checks.
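For context, the skip decision hinges on the is_port_8000_in_use() helper in test_wpt.py; a rough sketch of that kind of check, plus the retry-with-sleep variant suggested above, might look like the following (the names, timings, and bind-based check are illustrative, not the actual implementation):

    import socket
    import time

    import pytest

    def is_port_8000_in_use():
        # Try to bind the wptserve port; if the bind fails, something is already
        # listening there. (This mirrors the kind of check test_wpt.py does, not
        # necessarily its exact code.)
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.bind(("127.0.0.1", 8000))
        except socket.error:
            return True
        finally:
            s.close()
        return False

    def skip_if_port_busy(retries=5, delay=2):
        # The suggested retry loop: give a previously started server a few
        # seconds to release the port before deciding to skip the test.
        for _ in range(retries):
            if not is_port_8000_in_use():
                return
            time.sleep(delay)
        pytest.skip("port 8000 already in use")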

vinicius0197 commented 6 years ago

Adding a loop with a sleep between checks didn't work; test_run_chrome keeps being skipped. I moved the test into a separate file inside tools/wpt/tests, and that way it did catch the Chrome breakage. Maybe the other tests inside tools/wpt/tests/test_wpt.py are interfering with it?

gsnedders commented 6 years ago

If it's getting skipped because port 8000 is already in use then presumably something has gone wrong elsewhere, given it should be free. Maybe we should also add a check that if the TRAVIS environment variable is set then we should never skip…

vinicius0197 commented 6 years ago

@gsnedders hm, so the test will only be skipped if both the TRAVIS environment variable is not set and port 8000 is in use?

     if os.environ.get('TRAVIS') == "false" and is_port_8000_in_use == "true":
         pytest.skip("TRAVIS environ variable not set")

Something like this, maybe?

I'm not sure yet about what is causing the problem on port 8000, though.

gsnedders commented 6 years ago

@vinicius0197 yeah, something like that. (well, maybe more like os.environ.get("TRAVIS", "false") == "false" so it actually works when it's not set. :))
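Putting those two pieces together, the check being discussed might end up looking roughly like this (a sketch only: is_port_8000_in_use() is the existing helper in test_wpt.py and returns a bool, and the wrapper name here is made up):

    import os

    import pytest

    def skip_unless_port_free():
        # Hypothetical wrapper; is_port_8000_in_use() is the existing helper
        # in test_wpt.py. Skip only when we are NOT on Travis (TRAVIS defaults
        # to "false" when unset) AND port 8000 really is occupied; on Travis a
        # busy port is treated as a real failure rather than a reason to skip.
        if os.environ.get("TRAVIS", "false") == "false" and is_port_8000_in_use():
            pytest.skip("port 8000 already in use")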

vinicius0197 commented 6 years ago

@gsnedders Ok. I've noticed that test_serve is also being skipped with the same message (port 8000 already in use). So we should probably fix that as well with the same check, I guess?

As for why port 8000 is causing problems, should a new issue be opened for that (since it looks like it's affecting more than just test_run_chrome)? If so, I can submit a PR to this one with the changes I've made.

gsnedders commented 6 years ago

@vinicius0197 We should generalise this issue. :)

gsnedders commented 6 years ago

@vinicius0197 did you find anything about the cause of port 8000 being in use?

vinicius0197 commented 6 years ago

@gsnedders sorry for the late reply! I didn't have time last week to look into this, but I'll be doing so tomorrow and will update this thread.

vinicius0197 commented 6 years ago

@gsnedders hm, for some reason the tests aren't being skipped in pytest anymore.

That's odd. You've mentioned that this may be a problem with the test being skipped on Travis (since the test itself, as it's written, fails). I will take a look into that.

vinicius0197 commented 6 years ago

Two tests were skipped in the Travis log for commit 040654c78f3f021512a1f22b9b3b99925bd52d37. I suspect one of those was test_run_chrome.

Is it possible to make Travis output a more precise log of which tests actually ran? I think that would help pin down where the problem really is.

jgraham commented 6 years ago

You can edit the pytest options in the tox.ini file. If you pass in --verbose and -s you should get a lot more helpful output.
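For instance, something like this (a sketch only: the real tools/tox.ini section names and the options already there may differ; the --cov flag is assumed from the pytest invocation tox uses):

    ; Sketch of adding --verbose and -s to the pytest command tox runs; the
    ; section name and the pre-existing --cov flag are assumptions here.
    [testenv]
    commands = pytest --verbose -s --cov {posargs}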

vinicius0197 commented 6 years ago

Running pytest locally with --verbose and -s I get the following output: https://pastebin.mozilla.org/9079374. It's easy to see that test_run_chrome was skipped (line 132). But when I run without --verbose and -s, the test runs normally and I get an error (as expected). This is really weird.

To test Travis, I've created a new branch on my fork of web-platform-tests with commit 040654c78f3f021512a1f22b9b3b99925bd52d37 checked out. But when the tests run I get the following output: The command "./tools/ci/run.sh" exited with 0. The Travis log is here. I don't know what could be causing this issue (I think I've configured everything correctly, at least). Am I doing something wrong here?

I want to set up this Travis build correctly and try to isolate the root of this problem. Right now it looks like something is odd with pytest.

gsnedders commented 6 years ago

@vinicius0197 add RUN_JOB=1 to the env section of the job in .travis.yml; by default only jobs with changes get run.
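Something along these lines, say (a sketch only: the surrounding keys of the real job entry in .travis.yml are assumptions here; the change that matters is RUN_JOB=1 in the job's env):

    # Sketch -- the other keys on this job entry are assumed, not copied from
    # the real .travis.yml; the relevant addition is RUN_JOB=1 in env.
    matrix:
      include:
        - env: TOXENV=py27 RUN_JOB=1
          script: ./tools/ci/run.sh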

vinicius0197 commented 6 years ago

@gsnedders ok. I will try again later today

vinicius0197 commented 6 years ago

@gsnedders hm, still getting the same message: https://travis-ci.org/vinicius0197/web-platform-tests/jobs/352635350

I'm pretty new to Travis, so I'm running into some issues there.

jgraham commented 6 years ago

It looks like something is failing in the tools/ci/before_install.sh script; perhaps the git submodule update because origin/master isn't a valid ref. I think the only thing in that script that might be significant in your case is starting xvfb so you could replace the body of that script with

    export DISPLAY=:99.0
    sh -e /etc/init.d/xvfb start 1>&2

and see if that improves things?

vinicius0197 commented 6 years ago

@jgraham Yes, it worked! At least it builds now.

I'm receiving the following output:

============================= test session starts ==============================
platform linux2 -- Python 2.7.14, pytest-3.4.2, py-1.5.2, pluggy-0.6.0
rootdir: /home/travis/build/vinicius0197/web-platform-tests/tools, inifile: pytest.ini
plugins: cov-2.5.1, mozlog-3.7, hypothesis-3.49.1
collected 12 items                                                             
tests/test_stability.py .                                                [  8%]
tests/test_wpt.py ..
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated

I've added --verbose to the tox.ini file according to this page. I will investigate the cause of this issue tomorrow.

gsnedders commented 6 years ago

@vinicius0197 I feel like this is ever less deserving of the "good first issue" label, but also yay! :)

vinicius0197 commented 6 years ago

@gsnedders it's a bit hard to debug what's going on with Travis. I tried adding a new environment variable for pytest to show debug info, but the same thing as above happens: https://travis-ci.org/vinicius0197/web-platform-tests/jobs/354897859

So I tried following the hint it gives and added a new line to .travis.yml: install: travis_wait mvn install

So it would wait a little longer before Travis times out the build. But for some reason this command makes mvn exit with 1, as in: https://travis-ci.org/vinicius0197/web-platform-tests/jobs/354900881

Any ideas on what might be causing this problem with mvn?

gsnedders commented 6 years ago

That hint is Java-specific; I've seen us randomly time out there before in the previous build, so maybe just try retrying it. We should probably just let test_wpt hit the remote server and deal with the potential flakes (given we're already reliant on GitHub to download the repo in the first place).

vinicius0197 commented 6 years ago

@gsnedders hm, it keeps giving the same error (after retrying).

Could you clarify a little bit on what test_wpt should do (about hitting the remote server and dealing with flakes)? Not sure if I really understood that.

gsnedders commented 6 years ago

@vinicius0197

Could you clarify a little bit on what test_wpt should do (about hitting the remote server and dealing with flakes)? Not sure if I really understood that.

Remove --no-download from test_wpt.py.

vinicius0197 commented 6 years ago

The build is now producing a lot more helpful output. Travis is catching two errors (in test_run_firefox and test_run_chrome), but it's a different error from the original KeyError.

https://travis-ci.org/vinicius0197/web-platform-tests/jobs/357075576

It's getting the following error:

ERROR: InvocationError: '/home/travis/build/vinicius0197/web-platform-tests/tools/wpt/.tox/py27/bin/pytest --cov'

vinicius0197 commented 6 years ago

I've tested running Travis with other commits checked out and it was fine. But it gives problems when I'm using:

[pytest]
addopts = -v -s 

in the pytest.ini file, as in this build (with a recent commit checked out): https://travis-ci.org/vinicius0197/web-platform-tests/jobs/357204501

Without the verbose option, Travis takes too long running test_wpt (without showing output) and times out.

vinicius0197 commented 6 years ago

Ok, seems like Travis is catching the error now: https://travis-ci.org/vinicius0197/web-platform-tests/jobs/357629969#L4550

vinicius0197 commented 6 years ago

@gsnedders Sorry for not updating this these last few days (I've been busy). It seems like, apart from adding a simple export RUN_JOB1=1 and travis_wait, there aren't really any other differences between the original Travis log and my test log.

Do you think the issue could be related to any of those commands?

sahil107 commented 6 years ago

#8614

foolip commented 6 years ago

This reminds me of https://github.com/web-platform-tests/wpt/issues/7928, where port 8000 being in use also resulted in misery. @gsnedders, do these two bottom out in the same code so that it's a dupe, you think?

gsnedders commented 6 years ago

No, because here we have in principle shut the previous server down.

foolip commented 6 years ago

Included in Ecosystem Infra 2018 Q4 OKRs

mdittmer commented 5 years ago

Ping from your friendly neighbourhood ecosystem infra rotation

Any updates on this, @gsnedders?

gsnedders commented 5 years ago

Not been worked on.

foolip commented 5 years ago

@gsnedders any chance of getting this fixed before the end of the quarter?

gsnedders commented 5 years ago

Given I've not even started looking into this again, no.

gsnedders commented 4 years ago

We still get more failures with something else running on port 8000, FWIW.