ESSS / pytest-regressions

Pytest plugin for regression testing: https://pytest-regressions.readthedocs.io
MIT License

parameterize: 11x more temp files created than normal #172

Closed chapmanjacobd closed 2 weeks ago

chapmanjacobd commented 3 weeks ago

I really like this pytest plugin, but after a single run, 55 tests create 3,027 folders/files (11 MiB!)

$ trash-put tests/text/test_timestamps/ /tmp/pytest-of-xk/
$ pytest tests/text/test_timestamps.py --regen-all
$ ls tests/text/test_timestamps/* | count
55
$ ncdu /tmp/pytest-of-xk/

--- /tmp/pytest-of-xk/pytest-0 -----------------------------------------------------------------------------------------------------------------------------------------------------------------
                                          /..
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s23
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s22
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s21
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s20
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s19
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s18
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s17
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s16
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s15
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s14
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s13
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s12
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s11
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s10
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s9
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s8
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s7
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s6
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s5
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s4
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s3
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s2
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s1
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f__s0
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ23
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ22
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ21
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ20
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ19
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ18
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ17
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ16
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ15
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ14
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ13
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ12
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ11
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ10
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ9
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ8
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ7
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ6
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ5
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ4
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ3
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ2
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ1
  216.0 KiB [###########################] /test_lb_timestamps_tz_utc_f_TZ0
  216.0 KiB [###########################] /test_lb_timestamps_tz_unix_f_U5
  216.0 KiB [###########################] /test_lb_timestamps_tz_unix_f_U4
*Total disk usage:  11.4 MiB   Apparent size: 107.4 KiB   Items: 3027

rmlint says most of these are duplicates

==> Note: Please use the saved script below for removal, not the above output.
==> In total 2919 files, whereof 2899 are duplicates in 17 groups.
==> This equals 43.59 KB of duplicates which could be removed.
==> Scanning took in total 0.125s.

Also, it is a bit weird that the apparent size and the actual disk usage differ so much. I guess that is because 4 KiB is the minimum allocation size on my filesystem (3,000 × 4 KiB ≈ 12 MiB, so that part actually makes sense).
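The block-size arithmetic above can be checked with a few lines of Python. This is only a rough sketch: it assumes every item reported by ncdu occupies exactly one 4 KiB block, which is an assumption rather than something measured in this thread (symlinks, for example, may take no block at all). The function name is made up for illustration.

```python
# Rough estimate of disk usage if each item occupies one filesystem block.
# Assumption: 4 KiB per item, as guessed in the comment above.
ITEMS = 3027            # "Items" reported by ncdu
BLOCK_SIZE = 4 * 1024   # assumed allocation per item, in bytes


def estimated_disk_usage_mib(items: int, block_size: int = BLOCK_SIZE) -> float:
    """Return items * block_size expressed in MiB."""
    return items * block_size / (1024 * 1024)


estimate = estimated_disk_usage_mib(ITEMS)
print(f"{estimate:.1f} MiB")  # prints "11.8 MiB"
```

The estimate lands close to the 11.4 MiB that ncdu reports, which supports the idea that the gap between apparent size and disk usage comes from per-item allocation overhead rather than from file contents.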

12 MiB seems fine, but there seems to be an exponential or quadratic bug, because usage quickly grew to 3 GiB even though I had run pytest only a dozen or so times.

I guess this is a bug/interaction with pytest.mark.parametrize?

@pytest.mark.parametrize("p", [["1970-01-01 00:00:01"], ["--from-unix", '1']])
@pytest.mark.parametrize("fz", [["-fz", 'America/New_York'], ["-fz", 'America/Chicago']])
@pytest.mark.parametrize("tz", [['-tz', 'America/New_York'], ['-tz', 'America/Chicago']])
@pytest.mark.parametrize("s", [[], ['-d'], ['-t']])
@pytest.mark.parametrize("f", [[], ['-TZ']])
def test_lb_timestamps_tz_utc(data_regression, p, fz, tz, s, f, capsys):
    lb(["timestamps"] + p + fz + tz + s + f)
    captured = capsys.readouterr().out.strip()
    data_regression.check(captured)

edit: I've verified that this only happens when using pytest.mark.parametrize and in this case pytest-regressions creates 11x more files than it does with normal tests. This matches the number of parameters:

>>> len([["1970-01-01 00:00:01"], ["--from-unix", '1']] + [["-fz", 'America/New_York'], ["-fz", 'America/Chicago']] + [['-tz', 'America/New_York'], ['-tz', 'America/Chicago']] + [[], ['-d'], ['-t']] + [[], ['-TZ']])
11
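
Note that 11 is the sum of the parameter-list lengths, while stacked pytest.mark.parametrize decorators generate one test case per combination, so the number of test cases is the product of those lengths. A small sketch, using the parameter lists from the test above (the `PARAMS` dictionary is only an illustrative container, not part of the original test):

```python
from itertools import product
from math import prod

# Parameter lists copied from the stacked parametrize decorators above.
PARAMS = {
    "p": [["1970-01-01 00:00:01"], ["--from-unix", "1"]],
    "fz": [["-fz", "America/New_York"], ["-fz", "America/Chicago"]],
    "tz": [["-tz", "America/New_York"], ["-tz", "America/Chicago"]],
    "s": [[], ["-d"], ["-t"]],
    "f": [[], ["-TZ"]],
}

# Sum of list lengths: the figure computed in the comment above.
total_values = sum(len(values) for values in PARAMS.values())

# Product of list lengths: one test case per combination.
total_cases = prod(len(values) for values in PARAMS.values())

# Cross-check by enumerating every combination explicitly.
combinations = list(product(*PARAMS.values()))
assert len(combinations) == total_cases

print(total_values)  # prints 11
print(total_cases)   # prints 48
```

So this one test function expands to 48 test cases, which accounts for most of the 55 tests in the module.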
nicoddemus commented 2 weeks ago

Hi @chapmanjacobd

pytest-regressions uses pytest-datadir behind the scenes, and indeed it does create a separate temporary directory for each test run, so it is working as intended.

I'm closing this for now but feel free to follow up with more questions.

chapmanjacobd commented 2 weeks ago

I'm fine with it creating one temp dir per test run (or even per test), but this is a bug specific to pytest.mark.parametrize

nicoddemus commented 2 weeks ago

What do you mean? It should create a new temporary directory per test run (not per test, mind you).

If you think this is not what is happening, perhaps it is worth posting an issue in pytest-datadir (but with a MWE using pytest-datadir only).

nicoddemus commented 2 weeks ago

> edit: I've verified that this only happens when using pytest.mark.parametrize and in this case pytest-regressions creates 11x more files than it does with normal tests. This matches the number of parameters:

To be clear, this is expected and working as intended: each parametrize parameter will create a separate "test run", and pytest-datadir will create a separate directory for each.

chapmanjacobd commented 2 weeks ago

okay, you are right, it is the same:

# Five separate, non-parametrized tests; the data_regression fixture needs no imports.

def test_without_parameters1(data_regression):
    data = {"key": "value_1"}
    data_regression.check(data)

def test_without_parameters2(data_regression):
    data = {"key": "value_2"}
    data_regression.check(data)

def test_without_parameters3(data_regression):
    data = {"key": "value_3"}
    data_regression.check(data)

def test_without_parameters4(data_regression):
    data = {"key": "value_4"}
    data_regression.check(data)

def test_without_parameters5(data_regression):
    data = {"key": "value_5"}
    data_regression.check(data)

'''
tree /tmp/pytest-of-xk/
/tmp/pytest-of-xk/
├── pytest-0
│   ├── test_without_parameters10
│   │   └── params
│   │       └── test_without_parameters3.yml
│   ├── test_without_parameters1current -> /tmp/pytest-of-xk/pytest-0/test_without_parameters10
│   ├── test_without_parameters20
│   │   └── params
│   │       ├── test_without_parameters1.yml
│   │       ├── test_without_parameters3.yml
│   │       └── test_without_parameters4.yml
│   ├── test_without_parameters2current -> /tmp/pytest-of-xk/pytest-0/test_without_parameters20
│   ├── test_without_parameters30
│   │   └── params
│   ├── test_without_parameters3current -> /tmp/pytest-of-xk/pytest-0/test_without_parameters30
│   ├── test_without_parameters40
│   │   └── params
│   │       ├── test_without_parameters1.yml
│   │       └── test_without_parameters3.yml
│   ├── test_without_parameters4current -> /tmp/pytest-of-xk/pytest-0/test_without_parameters40
│   ├── test_without_parameters50
│   │   └── params
│   │       ├── test_without_parameters1.yml
│   │       ├── test_without_parameters2.yml
│   │       ├── test_without_parameters3.yml
│   │       └── test_without_parameters4.yml
│   └── test_without_parameters5current -> /tmp/pytest-of-xk/pytest-0/test_without_parameters50
├── pytest-1
│   ├── test_without_parameters10
│   │   └── params
│   │       ├── test_without_parameters1.obtained.yml
│   │       ├── test_without_parameters1.yml
│   │       ├── test_without_parameters2.yml
│   │       ├── test_without_parameters3.yml
│   │       ├── test_without_parameters4.yml
│   │       └── test_without_parameters5.yml
│   ├── test_without_parameters1current -> /tmp/pytest-of-xk/pytest-1/test_without_parameters10
│   ├── test_without_parameters20
│   │   └── params
│   │       ├── test_without_parameters1.yml
│   │       ├── test_without_parameters2.obtained.yml
│   │       ├── test_without_parameters2.yml
│   │       ├── test_without_parameters3.yml
│   │       ├── test_without_parameters4.yml
│   │       └── test_without_parameters5.yml
│   ├── test_without_parameters2current -> /tmp/pytest-of-xk/pytest-1/test_without_parameters20
│   ├── test_without_parameters30
│   │   └── params
│   │       ├── test_without_parameters1.yml
│   │       ├── test_without_parameters2.yml
│   │       ├── test_without_parameters3.obtained.yml
│   │       ├── test_without_parameters3.yml
│   │       ├── test_without_parameters4.yml
│   │       └── test_without_parameters5.yml
│   ├── test_without_parameters3current -> /tmp/pytest-of-xk/pytest-1/test_without_parameters30
│   ├── test_without_parameters40
│   │   └── params
│   │       ├── test_without_parameters1.yml
│   │       ├── test_without_parameters2.yml
│   │       ├── test_without_parameters3.yml
│   │       ├── test_without_parameters4.obtained.yml
│   │       ├── test_without_parameters4.yml
│   │       └── test_without_parameters5.yml
│   ├── test_without_parameters4current -> /tmp/pytest-of-xk/pytest-1/test_without_parameters40
│   ├── test_without_parameters50
│   │   └── params
│   │       ├── test_without_parameters1.yml
│   │       ├── test_without_parameters2.yml
│   │       ├── test_without_parameters3.yml
│   │       ├── test_without_parameters4.yml
│   │       ├── test_without_parameters5.obtained.yml
│   │       └── test_without_parameters5.yml
│   └── test_without_parameters5current -> /tmp/pytest-of-xk/pytest-1/test_without_parameters50
└── pytest-current -> /tmp/pytest-of-xk/pytest-1

34 directories, 40 files
'''

vs

import pytest

# The same five checks, written as one parametrized test.

@pytest.mark.parametrize("f", range(0, 5))
def test_with_parameters(data_regression, f):
    data = {"key": f"value_{f}"}
    data_regression.check(data)

'''
tree /tmp/pytest-of-xk/
/tmp/pytest-of-xk/
├── pytest-0
│   ├── test_with_parameters_0_0
│   │   └── params
│   ├── test_with_parameters_0_current -> /tmp/pytest-of-xk/pytest-0/test_with_parameters_0_0
│   ├── test_with_parameters_1_0
│   │   └── params
│   │       ├── test_with_parameters_0_.yml
│   │       ├── test_with_parameters_2_.yml
│   │       └── test_with_parameters_3_.yml
│   ├── test_with_parameters_1_current -> /tmp/pytest-of-xk/pytest-0/test_with_parameters_1_0
│   ├── test_with_parameters_2_0
│   │   └── params
│   │       ├── test_with_parameters_0_.yml
│   │       └── test_with_parameters_3_.yml
│   ├── test_with_parameters_2_current -> /tmp/pytest-of-xk/pytest-0/test_with_parameters_2_0
│   ├── test_with_parameters_3_0
│   │   └── params
│   │       └── test_with_parameters_0_.yml
│   ├── test_with_parameters_3_current -> /tmp/pytest-of-xk/pytest-0/test_with_parameters_3_0
│   ├── test_with_parameters_4_0
│   │   └── params
│   │       ├── test_with_parameters_0_.yml
│   │       ├── test_with_parameters_1_.yml
│   │       ├── test_with_parameters_2_.yml
│   │       └── test_with_parameters_3_.yml
│   └── test_with_parameters_4_current -> /tmp/pytest-of-xk/pytest-0/test_with_parameters_4_0
├── pytest-1
│   ├── test_with_parameters_0_0
│   │   └── params
│   │       ├── test_with_parameters_0_.obtained.yml
│   │       ├── test_with_parameters_0_.yml
│   │       ├── test_with_parameters_1_.yml
│   │       ├── test_with_parameters_2_.yml
│   │       ├── test_with_parameters_3_.yml
│   │       └── test_with_parameters_4_.yml
│   ├── test_with_parameters_0_current -> /tmp/pytest-of-xk/pytest-1/test_with_parameters_0_0
│   ├── test_with_parameters_1_0
│   │   └── params
│   │       ├── test_with_parameters_0_.yml
│   │       ├── test_with_parameters_1_.obtained.yml
│   │       ├── test_with_parameters_1_.yml
│   │       ├── test_with_parameters_2_.yml
│   │       ├── test_with_parameters_3_.yml
│   │       └── test_with_parameters_4_.yml
│   ├── test_with_parameters_1_current -> /tmp/pytest-of-xk/pytest-1/test_with_parameters_1_0
│   ├── test_with_parameters_2_0
│   │   └── params
│   │       ├── test_with_parameters_0_.yml
│   │       ├── test_with_parameters_1_.yml
│   │       ├── test_with_parameters_2_.obtained.yml
│   │       ├── test_with_parameters_2_.yml
│   │       ├── test_with_parameters_3_.yml
│   │       └── test_with_parameters_4_.yml
│   ├── test_with_parameters_2_current -> /tmp/pytest-of-xk/pytest-1/test_with_parameters_2_0
│   ├── test_with_parameters_3_0
│   │   └── params
│   │       ├── test_with_parameters_0_.yml
│   │       ├── test_with_parameters_1_.yml
│   │       ├── test_with_parameters_2_.yml
│   │       ├── test_with_parameters_3_.obtained.yml
│   │       ├── test_with_parameters_3_.yml
│   │       └── test_with_parameters_4_.yml
│   ├── test_with_parameters_3_current -> /tmp/pytest-of-xk/pytest-1/test_with_parameters_3_0
│   ├── test_with_parameters_4_0
│   │   └── params
│   │       ├── test_with_parameters_0_.yml
│   │       ├── test_with_parameters_1_.yml
│   │       ├── test_with_parameters_2_.yml
│   │       ├── test_with_parameters_3_.yml
│   │       ├── test_with_parameters_4_.obtained.yml
│   │       └── test_with_parameters_4_.yml
│   └── test_with_parameters_4_current -> /tmp/pytest-of-xk/pytest-1/test_with_parameters_4_0
└── pytest-current -> /tmp/pytest-of-xk/pytest-1

34 directories, 40 files
'''

Although it seems like there is a quadratic bug somewhere, I will accept it as-is.

thx
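
The quadratic growth can be read off the trees above: in the second run, each test's temporary directory holds a copy of every expected .yml file in the module's data directory, so N tests sharing N expected files end up with roughly N × N files. The sketch below imitates that layout with the standard library only. It does not call pytest-datadir or pytest-regressions, and the function name and directory names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def simulate_run(n_tests: int) -> int:
    """Copy a data directory holding n_tests files once per test.

    Returns the total number of .yml files created under the simulated
    base temporary directory.
    """
    with tempfile.TemporaryDirectory() as root:
        root_path = Path(root)

        # One expected file per test, all in the same data directory.
        source = root_path / "datadir"
        source.mkdir()
        for i in range(n_tests):
            (source / f"test_{i}.yml").write_text(f"key: value_{i}\n")

        # Assumption: every test receives its own full copy of that directory.
        base = root_path / "pytest-0"
        base.mkdir()
        for i in range(n_tests):
            shutil.copytree(source, base / f"test_{i}0" / "params")

        return sum(1 for _ in base.rglob("*.yml"))


print(simulate_run(5))   # prints 25
print(simulate_run(55))  # prints 3025
```

With 55 tests this gives 3,025 files, the same order of magnitude as the 3,027 items reported at the top of the thread. Parametrized and non-parametrized tests behave identically here, which matches the two trees.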