biglocalnews / warn-transformer

Consolidate, enrich and republish the data gathered by warn-scraper
https://warn-transformer.readthedocs.io
Apache License 2.0

'make test' fails on Linux #192

Open stucka opened 10 months ago

stucka commented 10 months ago

Tests fail on Linux, even when run inside pipenv, at what look like pretty basic levels: one test errors while loading its YAML cassette, and another fails decoding a downloaded file as UTF-8.

pipenv run make test
  __________
 |BIG🌲LOCAL|
 |&&& ======|
 |=== ======|  This is a Big Local News automation
 |=== == %%%|
 |[_] ======|         🤖 Running tests 🤖
 |=== ===!##|
 |__________|

=================================================================== test session starts ====================================================================
platform linux -- Python 3.9.17, pytest-7.4.0, pluggy-1.2.0
rootdir: /home/stucka/Desktop/data/warn-transformer
plugins: cov-4.1.0, vcr-1.0.2
collected 3 items                                                                                                                                          

tests/test_consolidate.py .                                                                                                                          [ 33%]
tests/test_download.py E                                                                                                                             [ 66%]
tests/test_integrate.py F                                                                                                                            [100%]

========================================================================== ERRORS ==========================================================================
_____________________________________________________________ ERROR at setup of test_download ______________________________________________________________

request = <SubRequest '_vcr_marker' for <Function test_download>>

    @pytest.fixture(autouse=True)
    def _vcr_marker(request):
        marker = request.node.get_closest_marker('vcr')
        if marker:
>           request.getfixturevalue('vcr_cassette')

../../../.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/pytest_vcr.py:46: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/pytest_vcr.py:87: in vcr_cassette
    with vcr.use_cassette(vcr_cassette_name, **kwargs) as cassette:
../../../.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/vcr/cassette.py:83: in __enter__
    self.__cassette = self.cls.load(**cassette_kwargs)
../../../.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/vcr/cassette.py:171: in load
    new_cassette._load()
../../../.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/vcr/cassette.py:344: in _load
    requests, responses = self._persister.load_cassette(self._path, serializer=self._serializer)
../../../.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/vcr/persisters/filesystem.py:28: in load_cassette
    return deserialize(data, serializer)
../../../.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/vcr/serialize.py:37: in deserialize
    data = serializer.deserialize(cassette_string)
../../../.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/vcr/serializers/yamlserializer.py:12: in deserialize
    return yaml.load(cassette_string, Loader=Loader)
../../../.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/yaml/__init__.py:81: in load
    return loader.get_single_data()
../../../.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/yaml/constructor.py:49: in get_single_data
    node = self.get_single_node()
yaml/_yaml.pyx:673: in yaml._yaml.CParser.get_single_node
    ???
yaml/_yaml.pyx:687: in yaml._yaml.CParser._compose_document
    ???
yaml/_yaml.pyx:731: in yaml._yaml.CParser._compose_node
    ???
yaml/_yaml.pyx:845: in yaml._yaml.CParser._compose_mapping_node
    ???
yaml/_yaml.pyx:729: in yaml._yaml.CParser._compose_node
    ???
yaml/_yaml.pyx:806: in yaml._yaml.CParser._compose_sequence_node
    ???
yaml/_yaml.pyx:731: in yaml._yaml.CParser._compose_node
    ???
yaml/_yaml.pyx:845: in yaml._yaml.CParser._compose_mapping_node
    ???
yaml/_yaml.pyx:731: in yaml._yaml.CParser._compose_node
    ???
yaml/_yaml.pyx:845: in yaml._yaml.CParser._compose_mapping_node
    ???
yaml/_yaml.pyx:731: in yaml._yaml.CParser._compose_node
    ???
yaml/_yaml.pyx:847: in yaml._yaml.CParser._compose_mapping_node
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   yaml.reader.ReaderError: unacceptable character #x0080: control characters are not allowed
E     in "<unicode string>", position 1060035

yaml/_yaml.pyx:860: ReaderError
========================================================================= FAILURES =========================================================================
______________________________________________________________________ test_integrate ______________________________________________________________________

    @pytest.mark.runvcr
    @pytest.mark.vcr()
    def test_integrate():
        """Test integrate."""
        this_dir = Path(__file__).parent
        new_path = this_dir / "data" / "processed" / "consolidated.csv"
>       integrate.run(new_path, init_current_data=True)

tests/test_integrate.py:14: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
warn_transformer/integrate.py:34: in run
    current_data_list = get_current_data(init_current_data)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

init = True

    def get_current_data(init: bool = False) -> typing.List[typing.Dict[str, typing.Any]]:
        """Fetch the most recent published version of our integrated dataset.

        Args:
            init (bool): Set to True when you want to create a new integrated dataset from scratch. Default False.

        Returns a list of dictionaries ready for comparison against the new consolidated data file.
        """
        # Set which file to pull
        base_url = "https://raw.githubusercontent.com/biglocalnews/warn-github-flow/transformer/data/warn-transformer/processed/"
        if init:
            current_url = f"{base_url}consolidated.csv"
            logger.debug(f"Initializing new current file from {current_url}")
        else:
            current_url = f"{base_url}integrated.csv"
            logger.debug(f"Downloading most recent current file from {current_url}")

        # Download the current database
        current_r = requests.get(current_url)
>       current_data_str = current_r.content.decode("utf-8")
E       UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

warn_transformer/integrate.py:295: UnicodeDecodeError
===================================================================== warnings summary =====================================================================
tests/test_consolidate.py::test_consolidate
  /home/stucka/Desktop/data/warn-transformer/warn_transformer/schema.py:18: RemovedInMarshmallow4Warning: Passing field metadata as keyword arguments is deprecated. Use the explicit `metadata=...` argument instead. Additional metadata: {'max_length': 2}
    postal_code = fields.Str(max_length=2, required=True)

tests/test_consolidate.py::test_consolidate
tests/test_consolidate.py::test_consolidate
tests/test_consolidate.py::test_consolidate
  /home/stucka/.local/share/virtualenvs/warn-transformer-1C5zOao7/lib/python3.9/site-packages/marshmallow/fields.py:1176: RemovedInMarshmallow4Warning: The 'default' argument to fields is deprecated. Use 'dump_default' instead.
    super().__init__(**kwargs)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================= short test summary info ==================================================================
FAILED tests/test_integrate.py::test_integrate - UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
ERROR tests/test_download.py::test_download - yaml.reader.ReaderError: unacceptable character #x0080: control characters are not allowed
==================================================== 1 failed, 1 passed, 4 warnings, 1 error in 12.36s ===================================================
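
A note on the `test_integrate` failure above: a gzip stream always begins with the bytes 0x1f 0x8b, and decoding such bytes as UTF-8 fails on 0x8b at position 1 with "invalid start byte", which is exactly the message in the log. So one possible reading, not a confirmed diagnosis, is that `current_r.content` was still gzip-compressed when it was decoded. Below is a minimal sketch of a check for that, using only the standard library. `decode_body` is a hypothetical helper, not part of warn-transformer, and the CSV text is made up for the demonstration.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def decode_body(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode a response body, gunzipping first if it looks gzip-compressed.

    Hypothetical helper for diagnosis only; not part of warn-transformer.
    """
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode(encoding)


if __name__ == "__main__":
    # Made-up sample data standing in for the downloaded CSV.
    plain = "state,company\nIA,Example Co.\n"
    packed = gzip.compress(plain.encode("utf-8"))

    # Decoding the compressed bytes directly reproduces the error from the log.
    try:
        packed.decode("utf-8")
    except UnicodeDecodeError as exc:
        print(f"raw decode failed at position {exc.start}: {exc.reason}")

    # Gunzipping first recovers the original text.
    print(decode_body(packed) == plain)
```

If the same check turns up gzip bytes in the real response, the next thing to look at would be why the body was not decompressed before it reached `get_current_data`, for example how the recorded response was stored and replayed.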

stucka commented 10 months ago

And yet GitHub is actually running the same tests on the same core OS.

Operating System: Ubuntu 22.04.3 LTS
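
For the `test_download` error, the `ReaderError` names character #x0080 at position 1060035 of the cassette text. To see what sits there on each machine, something like the sketch below could help. It assumes the reported position is a character index into the decoded cassette text, and that the cassette has been read into a `str` the same way vcrpy reads it. The log does not show the cassette path, so loading the file is left to the caller. `describe_char` is a hypothetical helper.

```python
import unicodedata


def describe_char(text: str, position: int, width: int = 20) -> dict:
    """Report the code point at `position` plus some surrounding context.

    Hypothetical diagnostic helper; `width` is the number of characters
    of context to keep on each side.
    """
    ch = text[position]
    start = max(0, position - width)
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "category": unicodedata.category(ch),  # "Cc" marks a control character
        "context": text[start : position + width + 1],
    }


if __name__ == "__main__":
    # Made-up text with a U+0080 control character embedded at index 2.
    sample = "ab\x80cd"
    info = describe_char(sample, 2, width=1)
    print(info["codepoint"], info["category"], repr(info["context"]))
```

Running this against the same cassette on the failing machine and on a passing one would show whether the file contents differ or only the parser's handling of them does.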