hugovk commented 4 years ago

[x] a detailed description of the bug or suggestion

[x] output of pip list from the virtual environment you are using

Package        Version    
-------------- -----------
atomicwrites   1.3.0      
attrs          19.3.0     
colorama       0.4.3      
more-itertools 8.2.0      
packaging      20.3       
pip            20.0.2     
pluggy         0.13.1     
py             1.8.1      
pyparsing      2.4.6      
pytest         5.3.5      
setuptools     41.2.0     
six            1.14.0     
ujson          1.36.dev102
wcwidth        0.1.8

[x] pytest and operating system versions: pytest 5.3.5, Windows Server 2019 on GitHub Actions, Python 3.5-3.8
[x] minimal example if possible

Include long test input in a parametrize test:

@pytest.mark.parametrize(
    "test_input",
    [
        "1",
        "2",
        "[" * (1024 * 1024),
        "{" * (1024 * 1024),
    ],
)
def test_long_input(test_input):
    # Do something with test_input
    pass

Expected

Tests pass

Actual

Tests fail with ValueError: the environment variable is longer than 32767 characters:

2020-03-08T21:44:22.8972828Z key = 'PYTEST_CURRENT_TEST'
2020-03-08T21:44:22.8973045Z value = 'tests/test_ujson.py::test_long_input[{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{...{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{] (teardown)'
2020-03-08T21:44:22.8973138Z 
2020-03-08T21:44:22.8973265Z     def __setitem__(self, key, value):
2020-03-08T21:44:22.8973624Z         key = self.encodekey(key)
2020-03-08T21:44:22.8973756Z         value = self.encodevalue(value)
2020-03-08T21:44:22.8973879Z >       self.putenv(key, value)
2020-03-08T21:44:22.8974020Z E       ValueError: the environment variable is longer than 32767 characters
2020-03-08T21:44:22.8974117Z 
2020-03-08T21:44:22.8974254Z c:\hostedtoolcache\windows\python\3.8.2\x64\lib\os.py:681: ValueError
2020-03-08T21:44:22.8974408Z ======================== 138 passed, 4 errors in 2.66s ========================
2020-03-08T21:44:22.9037130Z ##[error]Process completed with exit code 1.

https://github.com/hugovk/ultrajson/runs/493853892?check_suite_focus=true
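To make the failure concrete, here is a minimal sketch of the arithmetic involved. It assumes, based on the traceback above, that the value written to PYTEST_CURRENT_TEST has the shape "<nodeid> (<stage>)" and that the limit is the 32767 characters quoted in the error. The helper name `current_test_value` is hypothetical and is not part of pytest.

```python
# Limit quoted in the ValueError above.
WINDOWS_ENV_LIMIT = 32767


def current_test_value(nodeid, stage):
    """Build a value shaped like the one in the traceback: '<nodeid> (<stage>)'."""
    return "{} ({})".format(nodeid, stage)


# The parametrized value becomes part of the node ID, so a 1 MiB string
# ends up inside the environment variable value.
long_param = "{" * (1024 * 1024)
nodeid = "tests/test_ujson.py::test_long_input[{}]".format(long_param)
value = current_test_value(nodeid, "teardown")

# Prints the length of the value and whether it exceeds the limit.
print(len(value), len(value) > WINDOWS_ENV_LIMIT)
```

The same check with a short ID such as "c" stays far below the limit, which is consistent with the ids=... workaround discussed below.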

Passes on Ubuntu 16.04, 18.04 and macOS Catalina 10.15 with Python 2.7 and 3.5-3.8. On all operating systems, though, the test name is really long because it includes the parametrized values, which clutters the logs.

Can be worked around by splitting the long test input into its own method:

@pytest.mark.parametrize(
    "test_input",
    [
        "1",
        "2",
    ],
)
def test_long_input(test_input):
    # Do something with test_input
    pass

@pytest.mark.parametrize(
    "test_input",
    [
        "[",
        "{",
    ],
)
def test_very_long_input(test_input):  # distinct name so it does not shadow the test above
    # Do something with test_input * (1024 * 1024), instead of test_input
    pass

https://github.com/hugovk/ultrajson/commit/f9a4f42efe3d719ec3aa15f63c07852e632f0b40

Zac-HD commented 4 years ago

Re: long name in logs, passing ids=... to parametrize would allow you to customize it to a shorter form.

(which might also work around the env issue? unclear, but we should fix that anyway.)

hugovk commented 4 years ago

Thanks, passing ids=... works around the env issue too:

@pytest.mark.parametrize(
    "test_input",
    [
        "1",
        "2",
        "[" * (1024 * 1024),
        "{" * (1024 * 1024),
    ],
    ids=["a", "b", "c", "d"]
)
def test_long_input(test_input):
    pass

(Although there are 30 params in the real test, so I'd opt for the method splitting workaround in this case.)

The-Compiler commented 4 years ago

(Although there are 30 params in the real test, so I'd opt for the method splitting workaround in this case.)

Note that you can pass a function to ids instead (docs). Thus, you could do something like ids=lambda s: s[:10] (assuming that your IDs are still unique after that).
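As a rough illustration of the callable approach, the sketch below defines a hypothetical helper, `short_id`, that abbreviates long string parameters and appends their length so the IDs are more likely to stay unique. The name, the default of 10 characters and the output format are all assumptions made for this example.

```python
def short_id(value, max_length=10):
    """Hypothetical ids= helper: keep short values, abbreviate long strings."""
    text = str(value)
    if len(text) <= max_length:
        return text
    # Keep a prefix plus the total length so different inputs stay distinguishable.
    return "{}...len{}".format(text[:max_length], len(text))


params = ["1", "2", "[" * (1024 * 1024), "{" * (1024 * 1024)]
ids = [short_id(p) for p in params]

# The shortened IDs must still be unique for this to be a safe replacement.
assert len(set(ids)) == len(ids)

# With pytest installed, the helper would be passed as:
#     @pytest.mark.parametrize("test_input", params, ids=short_id)
```

Unlike a hand-written ids=[...] list, this scales to the 30-parameter case without naming each entry.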

RonnyPfannschmidt commented 4 years ago

I believe this is one of the cases where pytest should not consider the string a valid ID; it should autogenerate the test name instead and hint at passing an explicit name.

In other words, "{" * 10000 should result in a warning and an autogenerated test ID, and the suggestion should be to use pytest.param("{" * 10000, id="intent-of-the-input").

symonk commented 4 years ago

@RonnyPfannschmidt / @nicoddemus any recommendations on where the fix should live, and where do we draw the line here? A few queries from my initial investigation:

Should the fix go in _pytest/python.py, adding some checks around validating IDs there? Could you recommend where this issue should typically be solved?
Do we autogenerate a random name after a certain length? Checking the length of the nodeid plus the parametrized data (plus various bolt-ons like '[] teardown', to account for the worst-case scenario when updating PYTEST_CURRENT_TEST in environ) is problematic.
Do we say that if your value is over 'X' in length, we will autogenerate a shorter one for you (only on Windows) and alert you to the fact? I'm not sure how potentially breaking that is to current Windows tests, though, as some may be using slightly longer values. My initial thought was that if the data is > 1024 we rewrite it, but from what I gather the entire environment path on Windows cannot exceed the 32.7K limit, so part of me feels this is a futile fix: if you just have enough shorter ones, the same problem will persist.

RonnyPfannschmidt commented 4 years ago

I would draw the line around 100, maybe less
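The threshold idea could be sketched roughly as follows. This is not pytest's implementation: `choose_id` and `MAX_ID_LENGTH` are hypothetical names, and the fallback of argument name plus index is assumed here because it resembles the IDs pytest generates for objects it cannot render as text.

```python
# Threshold suggested in the comment above; the exact value is an assumption.
MAX_ID_LENGTH = 100


def choose_id(argname, index, value, max_length=MAX_ID_LENGTH):
    """Use the value's text as the ID unless it is too long."""
    text = str(value)
    if len(text) <= max_length:
        return text
    # Too long to be a useful ID: fall back to "<argname><index>".
    return "{}{}".format(argname, index)


values = ["1", "2", "[" * (1024 * 1024), "{" * (1024 * 1024)]
chosen = [choose_id("test_input", i, v) for i, v in enumerate(values)]
print(chosen)
```

Because the fallback depends only on the argument name and position, the resulting IDs stay short regardless of how large the parameter is.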

symonk commented 4 years ago

OK, I will mock something up and discuss via PR. I think we have a few areas where maybe we should consider a --windows based flag that does a couple of things; 2-3 issues I've seen are similar in nature to this but in different parts of the system, relating to file lengths, env var lengths, etc. :)

ItsDrike commented 1 year ago

Any progress on this issue? We were still able to replicate it with latest pytest (3 years later). Was this just forgotten about, or is it a wontfix?

Even worse, it seems that doing:

@pytest.mark.skipif(platform.system() == "Windows", reason="environment variable limit on Windows")
@pytest.mark.parametrize(("string"), ["a" * (32768)])
def test_write_utf_limit(string):
    ...

causes the test to be skipped, but also to fail somehow. I suspect that it's because the error occurred during parametrization, so even though the test function didn't actually run, it still failed. See: commit that caused this, along with the corresponding failure in its CI run

Although the suggested fix by setting ids does work, it seems like something that should be addressed, if possible.

nicoddemus commented 1 year ago

Nobody took the time to fix it, but a pull request would certainly be welcome.

RonnyPfannschmidt commented 1 year ago

The open pull request went stale it seems

obestwalter commented 5 months ago

@symonk - the discussion here points to following a different approach, so I think it's ok to close this one then, right?

kurtmckee commented 5 months ago

@obestwalter I opened this issue, and @symonk opened a PR linked to this issue.

Is "close this one" referring to the PR linked to this issue, or is it referring to this issue?

obestwalter commented 5 months ago

@kurtmckee yes it's about the PR that is not the preferred solution anymore. I actually should have written it there. Thanks for clearing that up.

obestwalter commented 5 months ago

Thanks @kurtmckee for picking this up. Let's move the discussion back here, as the first PR to address this is likely to be closed soon.

You wrote:

Detect long IDs (cross-platform for consistency)

Hash the IDs to avoid platform-specific length restrictions

Issue a warning suggesting ways to choose IDs independently
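The three steps above could look roughly like this. It is a sketch under stated assumptions, not the code from any pull request: `hashed_id`, the 100-character threshold, the choice of SHA-256 and the 16-character digest prefix are all illustrative.

```python
import hashlib
import warnings

# Assumed threshold; applied on every platform for consistency.
MAX_ID_LENGTH = 100


def hashed_id(value, max_length=MAX_ID_LENGTH):
    """Return the value's text as the ID, or a short hash if it is too long."""
    text = str(value)
    if len(text) <= max_length:
        return text
    # Hash the full text so distinct long inputs get distinct short IDs.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    # Tell the user what happened and how to pick an ID themselves.
    warnings.warn(
        "parameter ID of {} characters replaced by hash {!r}; "
        "pass ids=... or pytest.param(..., id=...) to choose your own".format(
            len(text), digest
        ),
        UserWarning,
        stacklevel=2,
    )
    return digest
```

A hash keeps the ID stable across runs and platforms, at the cost of being unreadable, which is why the warning points users towards explicit IDs.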

My head is spinning from going through tons of issues and PRs over the last few days after not really having followed the developments for a long time. But from my shallow understanding, doing it like this fixes the problem and lets the user know that something potentially surprising has been done to address it, giving them the info needed to handle it differently if they so wish. So it sounds like a good plan to me.

pytest-dev / pytest

Long parametrized test_input on Windows: ValueError: the environment variable is longer than 32767 characters #6881
