intel / intel-xpu-backend-for-triton

OpenAI Triton backend for Intel® GPUs
MIT License

[test-triton.sh] Add an option to unskip all pytest.skip() calls #987

Closed vlad-penkin closed 2 months ago

vlad-penkin commented 4 months ago

Implementation approach

  1. Create a temporary copy of the python/test folder.
  2. Replace all pytest.skip() calls with pass by developing a script that uses the Python ast module and applies the changes to the temporary copy.
  3. Run the UT suite on the temporary copy.
  4. An unskipped run may cause slowness or hang-like behavior, so a special watcher needs to be implemented. The recommended approach is to use pytest-timeout.
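Step 2 above could be sketched with the standard-library ast module. This is only a sketch under my own assumptions; the class and function names are illustrative, not from the actual script:

```python
import ast


class SkipRemover(ast.NodeTransformer):
    """Rewrite statement-level pytest.skip(...) calls into `pass`."""

    def visit_Expr(self, node):
        call = node.value
        if (isinstance(call, ast.Call)
                and isinstance(call.func, ast.Attribute)
                and call.func.attr == "skip"
                and isinstance(call.func.value, ast.Name)
                and call.func.value.id == "pytest"):
            # Replace the whole expression statement with a no-op.
            return ast.copy_location(ast.Pass(), node)
        return node


def strip_skips(source: str) -> str:
    """Return `source` with every pytest.skip() statement turned into pass."""
    tree = SkipRemover().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # ast.unparse requires Python 3.9+
```

Running something like this over every file in the temporary copy of python/test would leave the tests runnable while disabling the skips.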
AshburnLee commented 3 months ago

Update:

AshburnLee commented 3 months ago

https://github.com/intel/intel-xpu-backend-for-triton/pull/1196 does not target this issue

vlad-penkin commented 3 months ago

@AshburnLee let's implement the unskips via conftest.py. This approach will not require any source code manipulation.

To unskip the tests, we can add the following code in conftest.py:

import pytest

def pytest_configure(config):

    def unskip(reason=None, allow_module_level=False):
        # No-op replacement for pytest.skip: the test body keeps running.
        pass

    # Save the original and monkey-patch pytest.skip for the whole session.
    skip_f = pytest.skip
    pytest.skip = unskip
    config._skip_f = skip_f

def pytest_unconfigure(config):
    # Restore the original pytest.skip at the end of the session.
    pytest.skip = config._skip_f

To run pytest with a custom conftest.py we can use the --confcutdir=<path_to_folder_with_conftest.py> pytest option.

AshburnLee commented 3 months ago

That seems like a better solution; let me try it. Thanks @vlad-penkin

AshburnLee commented 3 months ago

@vlad-penkin I tried your way; it partially works. --confcutdir is not used the way you described: it lets you specify the directory at which pytest stops searching upward for configuration files, whereas conftest.py is more like a hook. So I think conftest.py and --confcutdir are two different things that are not supposed to be used together the way you described.

conftest.py always takes effect for test scripts in the current directory and all subdirectories. So here is my improved implementation: https://github.com/intel/intel-xpu-backend-for-triton/pull/1345

Please review, thanks.
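For illustration, one way to make the unskip behavior opt-in from the command line is a conftest.py like the following. This is a sketch only; the --unskip option name is my own assumption, and the actual PR may differ:

```python
# conftest.py (sketch; the --unskip option name is hypothetical)
import pytest


def pytest_addoption(parser):
    parser.addoption("--unskip", action="store_true", default=False,
                     help="turn pytest.skip() calls into no-ops")


def pytest_configure(config):
    if config.getoption("--unskip"):
        # Save the original and replace pytest.skip with a no-op.
        config._orig_skip = pytest.skip
        pytest.skip = lambda reason=None, allow_module_level=False: None


def pytest_unconfigure(config):
    # Restore the original pytest.skip, if it was patched.
    orig = getattr(config, "_orig_skip", None)
    if orig is not None:
        pytest.skip = orig
```

With this, a default `pytest` run keeps the skips, while `pytest --unskip` runs everything.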

AshburnLee commented 3 months ago

@pbchekin I also enabled this in CI, please have a look at .github/workflows/build-test.yml. Thanks~

AshburnLee commented 2 months ago

Update:

Progress: Task started on 6/11, review started on 6/14 (lasted 4 workdays). Risk: pytest-timeout does not work fully as expected; spent 1.5 workdays experimenting with it.

AshburnLee commented 2 months ago

Update:

Risk: unskip mode and skip mode (the default) report different numbers of total test cases (passed and xfailed, in the language cases).

unskip vs default

interpreter (unskip):  passed: 6222, failed: 31, skipped: 0, xfailed: 703, total: 6956, fixme: 1, pass rate (w/o xfailed): 99.5%
interpreter (default): passed: 6221, failed: 0, skipped: 32, xfailed: 703, total: 6956, fixme: 1, pass rate (w/o xfailed): 99.49%

unskip vs default

operators (unskip):  passed: 961, failed: 0, skipped: 3, xfailed: 0, total: 964, fixme: 0, pass rate (w/o xfailed): 99.69% # done 964-3=961
operators (default): passed: 958, failed: 0, skipped: 3, xfailed: 0, total: 961, fixme: 0, pass rate (w/o xfailed): 99.69% # 961

unskip vs default

subprocess (unskip):  passed: 22, failed: 12, skipped: 12, xfailed: 0, total: 46, fixme: 0, pass rate (w/o xfailed): 47.83%
subprocess (default): passed: 22, failed: 0, skipped: 12, xfailed: 0, total: 34, fixme: 0, pass rate (w/o xfailed): 64.71%

unskip vs default

language (unskip):  passed: 9705, failed: 68, skipped: 181, xfailed: 423, total: 10377
language (default): passed: 10332, failed: 0, skipped: 349, xfailed: 521, total: 11202
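The "pass rate (w/o xfailed)" in these reports appears to be passed / (total - xfailed); a small helper (my own, for checking the figures) reproduces the percentages:

```python
def pass_rate(passed: int, total: int, xfailed: int) -> float:
    """Pass rate excluding expected failures, as a percentage (2 decimals)."""
    return round(100 * passed / (total - xfailed), 2)


# Checking against the interpreter rows above:
print(pass_rate(6222, 6956, 703))  # unskip
print(pass_rate(6221, 6956, 703))  # default
```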

Todo:

Investigate what is happening in the language suite.

AshburnLee commented 2 months ago

Update:

pbchekin commented 2 months ago

We don't need the list of failed tests in the (private) wiki. Just run the workflow and check the "Upload test reports" checkbox.