pytest-dev / pytest

The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
https://pytest.org
MIT License

Long parametrized test_input on Windows: ValueError: the environment variable is longer than 32767 characters #6881

Open hugovk opened 4 years ago

hugovk commented 4 years ago

Include long test input in a parametrize test:

@pytest.mark.parametrize(
    "test_input",
    [
        "1",
        "2",
        "[" * (1024 * 1024),
        "{" * (1024 * 1024),
    ],
)
def test_long_input(test_input):
    # Do something with test_input
    pass

Expected

Tests pass

Actual

Tests fail with ValueError: the environment variable is longer than 32767 characters:

2020-03-08T21:44:22.8972828Z key = 'PYTEST_CURRENT_TEST'
2020-03-08T21:44:22.8973045Z value = 'tests/test_ujson.py::test_long_input[{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{...{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{] (teardown)'
2020-03-08T21:44:22.8973138Z 
2020-03-08T21:44:22.8973265Z     def __setitem__(self, key, value):
2020-03-08T21:44:22.8973624Z         key = self.encodekey(key)
2020-03-08T21:44:22.8973756Z         value = self.encodevalue(value)
2020-03-08T21:44:22.8973879Z >       self.putenv(key, value)
2020-03-08T21:44:22.8974020Z E       ValueError: the environment variable is longer than 32767 characters
2020-03-08T21:44:22.8974117Z 
2020-03-08T21:44:22.8974254Z c:\hostedtoolcache\windows\python\3.8.2\x64\lib\os.py:681: ValueError
2020-03-08T21:44:22.8974408Z ======================== 138 passed, 4 errors in 2.66s ========================
2020-03-08T21:44:22.9037130Z ##[error]Process completed with exit code 1.

https://github.com/hugovk/ultrajson/runs/493853892?check_suite_focus=true

Passes on Ubuntu 16.04 and 18.04 and on macOS Catalina 10.15, with Python 2.7 and 3.5-3.8. On all operating systems, however, the test name is also really long because it includes the parametrized values, which clutters the logs.
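To make the failure easier to reason about, here is a rough sketch of the length check involved. It only mirrors the shape of the value shown in the traceback (node ID followed by the stage in parentheses) and the 32767 limit from the error message. The helper names are hypothetical, and whether Windows also counts the key towards the limit is not covered here.

```python
# Limit quoted in the ValueError above.
WINDOWS_ENV_VALUE_LIMIT = 32767


def current_test_value(nodeid: str, stage: str) -> str:
    """Build a value shaped like the one in the traceback: '<nodeid> (<stage>)'.

    Hypothetical helper, not pytest's own code.
    """
    return f"{nodeid} ({stage})"


def exceeds_windows_limit(value: str) -> bool:
    """Return True if the value is longer than the limit from the error message."""
    return len(value) > WINDOWS_ENV_VALUE_LIMIT


# The default ID embeds the whole parameter, so the node ID grows with it.
long_param = "{" * (1024 * 1024)
long_value = current_test_value(
    f"tests/test_ujson.py::test_long_input[{long_param}]", "teardown"
)
short_value = current_test_value(
    "tests/test_ujson.py::test_long_input[1]", "teardown"
)

print(exceeds_windows_limit(long_value))
print(exceeds_windows_limit(short_value))
```

Any approach that keeps the ID short (explicit ids, an ids function, or splitting the test) keeps the value under the limit.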

Can be worked around by splitting the long test input into its own method:

@pytest.mark.parametrize(
    "test_input",
    [
        "1",
        "2",
    ],
)
def test_long_input(test_input):
    # Do something with test_input
    pass

@pytest.mark.parametrize(
    "test_input",
    [
        "[",
        "{",
    ],
)
def test_long_repeated_input(test_input):
    # Do something with test_input * (1024 * 1024), instead of test_input
    pass

Zac-HD commented 4 years ago

Re: long name in logs, passing ids=... to parametrize would allow you to customize it to a shorter form.

(which might also work around the env issue? unclear, but we should fix that anyway.)

hugovk commented 4 years ago

Thanks, passing ids=... works around the env issue too:

@pytest.mark.parametrize(
    "test_input",
    [
        "1",
        "2",
        "[" * (1024 * 1024),
        "{" * (1024 * 1024),
    ],
    ids=["a", "b", "c", "d"]
)
def test_long_input(test_input):
    pass

(Although there are 30 params in the real test, so I'd opt for the method splitting workaround in this case.)

The-Compiler commented 4 years ago

(Although there are 30 params in the real test, so I'd opt for the method splitting workaround in this case.)

Note that you can pass a function to ids instead (docs). Thus, you could do something like ids=lambda s: s[:10] (assuming that your IDs are still unique after that).
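A minimal sketch of that idea, assuming the parameters are strings and that a 10-character prefix is enough to tell them apart. `short_id` is a hypothetical helper name; the uniqueness check is only there to make the assumption explicit.

```python
def short_id(value, limit=10):
    """Return at most `limit` characters of the value's string form."""
    return str(value)[:limit]


inputs = ["1", "2", "[" * (1024 * 1024), "{" * (1024 * 1024)]
ids = [short_id(value) for value in inputs]

# Truncation only helps if the shortened IDs are still distinct.
assert len(set(ids)) == len(ids)
print(ids)
```

The function could then be passed as `ids=short_id` to `@pytest.mark.parametrize`, in place of the explicit list of IDs.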

RonnyPfannschmidt commented 4 years ago

I believe this is one of the cases where pytest should not consider the string a valid id. It should instead autogenerate the test id and hint at passing an explicit name.

In other words, "{"*10000 should result in a warning and an autogenerated test id, and the suggestion should be to use pytest.param("{"*10000, id="intent-of-the-input").
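For reference, explicit IDs via `pytest.param` could look roughly like this when applied to the example from the original report. The ID strings are made up for illustration, and the assertion in the test body is just a placeholder.

```python
import pytest


@pytest.mark.parametrize(
    "test_input",
    [
        "1",
        "2",
        # Explicit IDs keep the long values out of the test name.
        pytest.param("[" * (1024 * 1024), id="many-open-brackets"),
        pytest.param("{" * (1024 * 1024), id="many-open-braces"),
    ],
)
def test_long_input(test_input):
    # Placeholder check; the real test would exercise the code under test.
    assert len(test_input) >= 1
```

Only the long cases need wrapping; short values such as "1" and "2" can keep their default IDs.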

symonk commented 4 years ago

@RonnyPfannschmidt / @nicoddemus any recommendations on where the fix should live? but where do we draw the line here? a few queries from my initial investigation:

RonnyPfannschmidt commented 4 years ago

I would draw the line around 100, maybe less

symonk commented 4 years ago

OK, I will mock something up and discuss via PR. I think there are a few areas where maybe we should consider a --windows based flag that does a couple of things; 2-3 issues I've seen are similar in nature to this but in different parts of the system, relating to file lengths, env var lengths, etc. :)

ItsDrike commented 1 year ago

Any progress on this issue? We were still able to replicate it with latest pytest (3 years later). Was this just forgotten about, or is it a wontfix?

Even worse, it seems that doing:

import platform

import pytest

@pytest.mark.skipif(platform.system() == "Windows", reason="environment variable limit on Windows")
@pytest.mark.parametrize("string", ["a" * 32768])
def test_write_utf_limit(string):
    ...

causes the test to be skipped, but also to fail somehow. I suspect that it's because the error occurred during parametrization, so even though the test function didn't actually run, it still failed. See: commit that caused this, along with the corresponding failure in its CI run

Although the suggested fix by setting ids does work, it seems like something that should be addressed, if possible.

nicoddemus commented 1 year ago

Nobody took the time to fix it, but a pull request would certainly be welcome.

RonnyPfannschmidt commented 1 year ago

The open pull request went stale it seems

obestwalter commented 5 months ago

@symonk - the discussion here points to following a different approach, so I think it's ok to close this one then, right?

kurtmckee commented 5 months ago

@obestwalter I opened this issue, and @symonk opened a PR linked to this issue.

Is "close this one" referring to the PR linked to this issue, or is it referring to this issue?

obestwalter commented 5 months ago

@kurtmckee yes it's about the PR that is not the preferred solution anymore. I actually should have written it there. Thanks for clearing that up.

obestwalter commented 5 months ago

Thanks @kurtmckee for picking this up. Let's move the discussion back here, as the first PR to address this is likely to be closed soon.

You wrote:

  1. Detect long IDs (cross-platform for consistency)
  2. Hash the IDs to avoid platform-specific length restrictions
  3. Issue a warning suggesting ways to choose IDs independently

My head is spinning from going through tons of Issues and PRs over the last few days after not really having followed the developments for a long time. But from my shallow understanding doing it like this is fixing the problem and letting the user know that something potentially surprising has been done to address it, giving them the info needed to handle it differently if they so wish. So sounds like a good plan to me.
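As a rough illustration of the three quoted steps, and not pytest's actual implementation, something along these lines is what I picture. The function name, the threshold of 100 (taken from the "around 100" suggestion earlier in this thread), the choice of SHA-256, and the 16-character digest are all assumptions.

```python
import hashlib
import warnings

# Hypothetical threshold, applied on every platform for consistency.
MAX_ID_LENGTH = 100


def safe_id(raw_id: str, max_length: int = MAX_ID_LENGTH) -> str:
    """Return raw_id unchanged if short enough, otherwise a short hash of it."""
    # Step 1: detect long IDs.
    if len(raw_id) <= max_length:
        return raw_id
    # Step 2: hash the ID so its length no longer depends on the input.
    digest = hashlib.sha256(raw_id.encode("utf-8")).hexdigest()[:16]
    # Step 3: tell the user what happened and how to pick an ID themselves.
    warnings.warn(
        f"parametrize ID of length {len(raw_id)} was replaced by {digest!r}; "
        "pass ids=... or pytest.param(..., id=...) to choose an ID explicitly",
        UserWarning,
        stacklevel=2,
    )
    return digest


print(safe_id("1"))
print(len(safe_id("{" * (1024 * 1024))))
```

Hashing keeps distinct long inputs distinct in practice, while the warning points users at the explicit-ID options already discussed above.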