Move most URL transform tests to text-based files

bollwyvl commented 2 weeks ago

references

noted on #2794

motivation

The hardcoded inline fixtures are long, and the tricks required to appease black (split strings, magic comments) don't help readability.

design ideas

Add a conftest.py fixture:

URL_TRANFORMS = Path(__file__) / "url-transforms"
ALL_GIVEN_URLS = [*URL_TRANFORMS.glob("*.given.txt")]

def _read_lines(path: Path):
    return [
        line.strip() for line in path.read_text().splitlines() 
        if line.strip() and not line.strip().startswith("#")
    ]

@pytest.fixture(request=[p.name for p in ALL_GIVEN_URLS])  # this gives better logs than Path-1
def a_url_with_transformed_urls(request: pytest.FixtureRequest) -> tuple[str, set[set]]:
    url = _read_lines(URL_TRANFORMS / request.param))[0]
    expected = {*_read_lines(URL_TRANFORMS / request.param.replace(".given.", ".expect."))}
    return url, expected

Use:

def test_transform_url(a_url_with_transformed_urls: tuple[str, set[str]]) -> None:
    url, expected = a_url_with_mangled_urls
    transformed = {*gen_transformed_urls(url)}
    assert transformed == expected

With:

tests/
  conftest.py
  url-mangling/
    some-descriptive-name.given.txt
    some-descriptive-name.expect.txt

Where:

# some-descriptive-name.given.txt
# maybe another ignored comment
https://foo/bar/baz-boo.tar.gz

# some-descriptive-name.expect.txt
https://foo/bar/baz-boo.tar.gz

# another bunch after some ignored whitespace and comments
https://foo/bar/baz_boo.tar.gz

beckermr commented 2 weeks ago

Yeah I'm pretty indifferent on this one. If you want to make the change, do feel free.

OTOH a few fmt: off/on lines and some noqas to ignore line lengths would probably fix the readability without generally opaque (to me) test fixtures and ways to generate tests in pytest.

beckermr commented 2 weeks ago

In other words, black + ruff are wrong in some cases and maybe we should just tell them that rather than making more complicated code.

bollwyvl commented 2 weeks ago

Sure, an alternative would be:

lines = {*"""
    https://...
""".splitlines().map(lambda x: x.strip())}

i don't think the linters care about big-ol-triple-quote-strings.

bollwyvl commented 2 weeks ago

Something like this appears to appease the linters (haven't run):

https://gist.github.com/bollwyvl/8c1c5792727c72f4d36bc41e85e6650f

beckermr commented 2 weeks ago

Sure. I don't find the current code so bad that it warrants cleanup but do feel free to do so.

regro / cf-scripts