earthobservations / wetterdienst

Open weather data for humans.
https://wetterdienst.readthedocs.io/
MIT License

CI: Collection of flukes #816

Open amotl opened 1 year ago

amotl commented 1 year ago

Hi there,

within this issue, we are collecting some observations of flaky behavior on CI. It is meant to get the big picture, so that we can improve the robustness of the test suite gradually, by identifying the bad spots. To be able to do that, it is important to diligently record all observations here.

Most of the errors will be about concurrent file access going south, where specific tests are not appropriately marked with cflake, and the parallel testing based on pytest-xdist will hit concurrency issues.

With kind regards, Andreas.
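One common way to make concurrent file access less fragile is to write each download to a temporary file and then rename it into place, so that a parallel worker never reads a half-written archive. The following is only a minimal sketch of that idea; `write_atomically` is a hypothetical helper, not something taken from the wetterdienst code base.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(target: Path, payload: bytes) -> None:
    """Hypothetical helper: write payload so readers never see a partial file."""
    # Create the temporary file next to the target, so the final rename
    # stays on the same file system.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        # os.replace swaps the complete file into place in a single step.
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Whether this helps here depends on where the shared cache files are written, which would need to be confirmed against the failing tests.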

amotl commented 1 year ago

Exception

zipfile.BadZipFile: Truncated file header

Location

FAILED tests/ui/test_cli.py::test_cli_values_excel[dwd-observation---resolution=daily --parameter=kl --period=recent --date=2020-06-30-01048-Dresden-Klotzsche]

References

amotl commented 1 year ago

Exception

        elif self._compress_type == ZIP_DEFLATED:
            n = max(n, self.MIN_READ_SIZE)
>           data = self._decompressor.decompress(data, n)
E           zlib.error: Error -3 while decompressing data: invalid stored block lengths

Location

FAILED tests/provider/dwd/observation/test_api_data.py::test_dwd_observation_data_result_missing_data

References

amotl commented 1 year ago

Exception

dash.testing.errors.DashAppLoadingError: threaded server failed to start

Location

ERROR tests/ui/explorer/test_explorer.py::test_app_layout - dash.testing.errors.DashAppLoadingError: threaded server failed to start
ERROR tests/ui/explorer/test_explorer.py::test_app_data_stations_failed - dash.testing.errors.DashAppLoadingError: threaded server failed to start
ERROR tests/ui/explorer/test_explorer.py::test_options_reset - dash.testing.errors.DashAppLoadingError: threaded server failed to start
ERROR tests/ui/explorer/test_explorer.py::test_app_data_values - dash.testing.errors.DashAppLoadingError: threaded server failed to start
ERROR tests/ui/explorer/test_explorer.py::test_dwd_mosmix_options - dash.testing.errors.DashAppLoadingError: threaded server failed to start

References

amotl commented 1 year ago

Exception

FileNotFoundError: ['https://hubeau.eaufrance.fr/api/v1/hydrometrie/referentiel/stations?format=json&en_service=true']

Location

FAILED tests/test_api.py::test_api[False-eaufrance-hubeau-kwargs6-None] - FileNotFoundError: ['https://hubeau.eaufrance.fr/api/v1/hydrometrie/referentiel/stations?format=json&en_service=true']
FAILED tests/test_api.py::test_api[True-eaufrance-hubeau-kwargs6-None] - FileNotFoundError: ['https://hubeau.eaufrance.fr/api/v1/hydrometrie/referentiel/stations?format=json&en_service=true']

References


Investigation

The API is either undergoing regular maintenance or suffering an unplanned outage.

$ http "https://hubeau.eaufrance.fr/api/v1/hydrometrie/referentiel/stations?format=json&en_service=true"
HTTP/1.1 500 Internal Server Error
Access-Control-Allow-Origin: *
Connection: close
Content-Encoding: gzip
Content-Length: 78
Content-Type: application/json
Date: Sat, 03 Dec 2022 01:59:56 GMT
Server: Apache
Vary: Origin,Access-Control-Request-Method,Access-Control-Request-Headers,Accept-Encoding

{
    "code": "Internal server error",
    "field_errors": null,
    "message": ""
}
amotl commented 1 year ago

__ test_cli_interpolate __

BadZipFile("Bad CRC-32 for file 'produkt_klima_tag_19470101_19730228_05105.txt'")

-- https://github.com/earthobservations/wetterdienst/actions/runs/3719301178/jobs/6308072641#step:7:649

amotl commented 1 year ago

__ test_api[True-dwd-observation-kwargs0-None] __

E           zipfile.BadZipFile: The archive of https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/daily/kl/recent/tageswerte_KL_00011_akt.zip seems to be corrupted.

wetterdienst/provider/dwd/observation/download.py:62: BadZipFile

-- https://github.com/earthobservations/wetterdienst/actions/runs/3777614414/jobs/6421537572#step:7:889
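The `BadZipFile` and `zlib.error` reports above all come from archives that arrived incomplete or damaged. As a rough sketch, a payload could be checked before it is handed to the parser; `archive_is_intact` is a hypothetical name, and the in-memory archive below only stands in for a real download.

```python
import io
import zipfile
import zlib


def archive_is_intact(payload: bytes) -> bool:
    """Hypothetical check: True if every member of the archive reads back cleanly."""
    try:
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            # testzip() returns the name of the first bad member, or None.
            return archive.testzip() is None
    except (zipfile.BadZipFile, zlib.error):
        # Truncated or damaged archives end up here.
        return False


# Build a small archive in memory to stand in for a downloaded file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("produkt.txt", "STATIONS_ID;MESS_DATUM\n" * 100)
good = buffer.getvalue()
truncated = good[: len(good) // 2]

print(archive_is_intact(good))
print(archive_is_intact(truncated))
```

A failed check could then trigger a fresh download instead of a test failure.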

amotl commented 1 year ago

__ test_api[False-wsv-pegel-kwargs3-None] __

ValueError: Unexpected character found when decoding object value

__ test_api[False-wsv-pegel-kwargs3-None] __

ValueError: Unmatched ''"' when when decoding 'string'

__ test_api[False-dwd-mosmix-kwargs1-None] __

ValueError: could not convert string to float: '\x00\x00\x00\x00\x00\x00\x00'

-- https://github.com/earthobservations/wetterdienst/actions/runs/3777758460/jobs/6421748972


__ test_api[True-ea-hydrology-kwargs4-None] __

json.decoder.JSONDecodeError: Invalid control character at: line 1 column 351655 (char 351654)

-- https://github.com/earthobservations/wetterdienst/actions/runs/3777796175/jobs/6421804052#step:7:172

amotl commented 1 year ago

____ test_export_parquet ___

 >       os.unlink(filename)
E       PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\runneradmin\\AppData\\Local\\Temp\\pytest-of-runneradmin\\pytest-0\\popen-gw1\\data1\\observation.parquet'

tests\provider\dwd\observation\test_io.py:389: PermissionError

-- https://github.com/earthobservations/wetterdienst/actions/runs/3777949086/jobs/6422030473#step:7:618

amotl commented 1 year ago

_____ test_dwd_summarize ____

E       KeyError: 'data'

tests/ui/test_restapi.py:325: KeyError

-- https://github.com/earthobservations/wetterdienst/actions/runs/3781348080/jobs/6485175492#step:7:601

amotl commented 1 year ago
FAILED tests/example/test_notebook_examples.py::test_jupyter_example - nbclient.exceptions.CellTimeoutError: A cell timed out while it was being executed, after 60 seconds.
The message was: Cell execution timed out.
Here is a preview of the cell contents:
-------------------
['from pprint import pprint', '', 'from wetterdienst.provider.dwd.observation import (', '    DwdObservationRequest,', '    DwdObservationPeriod,']
...
[')', '', 'import matplotlib as mpl', 'import matplotlib.pyplot as plt', 'from matplotlib import cm']
-------------------
= 1 failed, 272 passed, 11 skipped, 3 xfailed, 56 warnings in 664.62s (0:11:04) =

-- https://github.com/earthobservations/wetterdienst/actions/runs/3961221008/jobs/6786397006#step:7:905

amotl commented 1 year ago
FAILED tests/core/scalar/test_summary.py::test_summary_temperature_air_mean_200_daily - AssertionError: DataFrame.iloc[:, 2] (column name="value") are different

DataFrame.iloc[:, 2] (column name="value") values are different (100.0 %)
[index]: [0, 1, 2]
[left]:  [272.95, 266.15, 268.75]
[right]: [273.65, 267.65, 270.45]
= 1 failed, 272 passed, 11 skipped, 3 xfailed, 57 warnings in 475.06s (0:07:55) =

-- https://github.com/earthobservations/wetterdienst/actions/runs/3961221008/jobs/6786764303#step:7:715

amotl commented 1 year ago
FAILED tests/core/scalar/test_summary.py::test_summary_temperature_air_mean_200_daily - AssertionError: DataFrame.iloc[:, 2] (column name="value") are different

DataFrame.iloc[:, 2] (column name="value") values are different (100.0 %)
[index]: [0, 1, 2]
[left]:  [nan, nan, 270.25]
[right]: [273.65, 267.65, 270.45]
= 1 failed, 272 passed, 11 skipped, 3 xfailed, 58 warnings in 580.97s (0:09:40) =

-- https://github.com/earthobservations/wetterdienst/actions/runs/3972716674/jobs/6810861277#step:7:716

amotl commented 1 year ago
=========================== short test summary info ============================
FAILED tests/ui/test_restapi.py::test_dwd_interpolate - KeyError: 'data'
FAILED tests/ui/test_restapi.py::test_dwd_summarize - KeyError: 'data'
= 2 failed, 269 passed, 12 skipped, 3 xfailed, 49 warnings in 779.51s (0:12:59) =

-- https://github.com/earthobservations/wetterdienst/actions/runs/3973028291/jobs/6811413462#step:7:851

amotl commented 1 year ago
FAILED tests/core/scalar/test_summary.py::test_summary_temperature_air_mean_200_daily - AssertionError: DataFrame.iloc[:, 2] (column name="value") are different

DataFrame.iloc[:, 2] (column name="value") values are different (100.0 %)
[index]: [0, 1, 2]
[left]:  [272.95, 266.15, 268.75]
[right]: [273.65, 267.65, 270.45]
= 1 failed, 271 passed, 11 skipped, 3 xfailed, 50 warnings in 543.33s (0:09:03) =
Error: Process completed with exit code 1.

-- https://github.com/earthobservations/wetterdienst/actions/runs/3976354893/jobs/6816786093#step:7:772

amotl commented 1 year ago
 FAILED tests/ui/test_restapi.py::test_dwd_summarize - KeyError: 'data'

E       KeyError: 'data'

tests/ui/test_restapi.py:325: KeyError

https://github.com/earthobservations/wetterdienst/actions/runs/3977956192/jobs/6819450829#step:7:643

amotl commented 1 year ago
FAILED tests/example/test_notebook_examples.py::test_wetterdienst_notebook - nbclient.exceptions.CellTimeoutError: A cell timed out while it was being executed, after 60 seconds.
E           nbclient.exceptions.CellTimeoutError: A cell timed out while it was being executed, after 60 seconds.
E           The message was: Cell execution timed out.
E           Here is a preview of the cell contents:
E           -------------------
E           request.interpolate((51.05089, 13.73832)).df
E           -------------------

-- https://github.com/earthobservations/wetterdienst/actions/runs/4193378438/jobs/7270189155#step:7:737

amotl commented 1 year ago
>           raise FailedDownload(f"Download failed for {remote_file}") from ex
E           wetterdienst.exceptions.FailedDownload: Download failed for https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/daily/kl/historical/tageswerte_KL_04703_19510101_20211231_hist.zip

-- https://github.com/earthobservations/wetterdienst/actions/runs/4193378438/jobs/7270648670#step:7:855

amotl commented 1 year ago
/home/runner/work/_temp/4c18a5e3-f6d6-4ee7-9e16-56e742cd904c.sh: line 2:  2613 Segmentation fault      (core dumped) poetry run pytest -vvv -m cflake tests

-- https://github.com/earthobservations/wetterdienst/actions/runs/4198732871/jobs/7282743949#step:7:868

amotl commented 1 year ago
FAILED tests/test_api.py::test_api[True-eaufrance-hubeau-kwargs6-None] - FileNotFoundError: ['https://hubeau.eaufrance.fr/api/v1/hydrometrie/observations_tr?code_entite=A021005050&grandeur_hydro=Q&sort=asc&size=2']

-- https://github.com/earthobservations/wetterdienst/actions/runs/4198752843/jobs/7282789791#step:6:793

amotl commented 1 year ago

test_radar_request_site_recent_sweep_pcp_v_hdf5

        # Verify number of results.
>       assert len(results) >= 12
E       AssertionError: assert 3 >= 12
E        +  where 3 = len([RadarResult(data=<_io.BytesIO object at 0x7f52fc3b5d00>, timestamp=datetime.datetime(2023, 4, 21, 18, 15), url='https://opendata.dwd.de/weather/radar/sites/sweep_pcp_v/boo/hdf5/filter_simple/ras07-stqual-pcpng01_sweeph5onem_vradh_00-2023042118153300-boo-10132-hd5', filename=None), RadarResult(data=<_io.BytesIO object at 0x7f52fc3b7d80>, timestamp=datetime.datetime(2023, 4, 21, 18, 20), url='https://opendata.dwd.de/weather/radar/sites/sweep_pcp_v/boo/hdf5/filter_simple/ras07-stqual-pcpng01_sweeph5onem_vradh_00-2023042118203300-boo-10132-hd5', filename=None), RadarResult(data=<_io.BytesIO object at 0x7f52fc3b5580>, timestamp=datetime.datetime(2023, 4, 21, 18, 25), url='https://opendata.dwd.de/weather/radar/sites/sweep_pcp_v/boo/hdf5/filter_simple/ras07-stqual-pcpng01_sweeph5onem_vradh_00-2023042118253300-boo-10132-hd5', filename=None)])

-- https://github.com/earthobservations/wetterdienst/actions/runs/4767730821/jobs/8476286877?pr=921#step:7:729

amotl commented 1 year ago

Two radar test failures at https://github.com/earthobservations/wetterdienst/actions/runs/4872021487/jobs/8689700227?pr=934#step:7:871, coming from GH-934. I am not merging it yet; perhaps you could take it as an opportunity to improve or fix those test cases?

FAILED tests/provider/dwd/radar/test_api_historic.py::test_radar_request_composite_historic_radolan_rw_yesterday - AssertionError: assert {'datasize': 1620000, 'datetime': datetime.datetime(2023, 5, 2, 12, 50), 'formatversion': 3, 'intervalseconds': 3600, ...} == IsDict(datasize=1620000, datetime=IsDatetime(approx=datetime.datetime(2023, 5, 2, 12, 55, 27, 908929), delta=datetime.timedelta(seconds=3900)), formatversion=3, intervalseconds=3600, maxrange='150 km', moduleflag=1, ncol=900, nrow=900, precision=0.1, producttype='RW', radarid='10000', radarlocations=IsList(IsStr(regex='asb|boo|drs|eis|ess|fbg|fld|hnr|isn|mem|mhp|neu|nhb|oft|pro|ros|tur|umd'), length=(10, 18)), radolanversion='2.29.1')
FAILED tests/provider/dwd/radar/test_api_historic.py::test_radar_request_composite_historic_radolan_rw_timerange - AssertionError: assert {'datasize': 1620000, 'datetime': datetime.datetime(2023, 5, 2, 12, 50), 'formatversion': 3, 'intervalseconds': 3600, ...} == IsDict(datasize=1620000, datetime=IsDatetime(approx=datetime.datetime(2023, 5, 2, 12, 55, 28, 386179), delta=datetime.timedelta(seconds=3900)), formatversion=3, intervalseconds=3600, maxrange='150 km', moduleflag=1, ncol=900, nrow=900, precision=0.1, producttype='RW', radarid='10000', radarlocations=IsList(IsStr(regex='asb|boo|drs|eis|ess|fbg|fld|hnr|isn|mem|mhp|neu|nhb|oft|pro|ros|tur|umd'), length=(10, 18)), radolanversion='2.29.1')
amotl commented 1 year ago

Problem

The same as https://github.com/earthobservations/wetterdienst/issues/816#issuecomment-1433851799:

/home/runner/work/_temp/fb69aca9-ccc9-4f10-badf-5bee36f2b16e.sh: line 2:  5695 Segmentation fault      (core dumped) poetry run pytest -vvv -m cflake tests
Error: Process completed with exit code 139.

-- https://github.com/earthobservations/wetterdienst/actions/runs/5372248471/jobs/9764017581?pr=965#step:8:767

References

amotl commented 1 year ago

Two other hiccups observed on Windows.

amotl commented 1 year ago
FAILED tests/test_api.py::test_api[True-geosphere-observation-kwargs12-5882] - FileNotFoundError: ['https://dataset.api.hub.zamg.ac.at/v1/station/historical/klima-v1-1d?parameters=nied&start=1774-12-31T00:12&end=2023-10-03T12:10&station_ids=5882&output_format=geojson']
FAILED tests/provider/geosphere/observation/test_api.py::test_geopshere_observation_api - aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host dataset.api.hub.zamg.ac.at:443 ssl:default [Connect call failed ('138.22.189.28', 443)]

-- https://github.com/earthobservations/wetterdienst/actions/runs/6379997483/job/17313561357?pr=1032#step:8:900

amotl commented 1 year ago
[gw1] [ 91%] FAILED tests/ui/test_restapi.py::test_dwd_interpolate 
tests/ui/test_restapi.py::test_dwd_summarize 
[gw1] [ 91%] FAILED tests/ui/test_restapi.py::test_dwd_summarize 
tests/ui/test_restapi.py::test_api_values_missing_null 
 >       assert response.status_code == 200
E       assert 404 == 200
E        +  where 404 = <Response [404 Not Found]>.status_code

tests/ui/test_restapi.py:329: AssertionError

-- https://github.com/earthobservations/wetterdienst/actions/runs/6385928867/job/17331705242?pr=1037#step:8:771

amotl commented 1 year ago
tests/ui/test_cli.py:429:
FAILED tests/ui/test_cli.py::test_cli_interpolate - ValueError: stderr not separately captured
self = <Result BadZipFile("Bad CRC-32 for file 'produkt_klima_tag_19570201_20221231_04300.txt'")>
self = <Result BadZipFile("Bad CRC-32 for file 'produkt_klima_tag_19570201_20221231_04300.txt'")>

-- https://github.com/earthobservations/wetterdienst/actions/runs/6373530809/job/17297323740?pr=1012#step:8:769 -- https://github.com/earthobservations/wetterdienst/actions/runs/6385928867/job/17331705242?pr=1037#step:8:751

amotl commented 1 year ago

After bringing in GH-1041, re-running failed test cases once after a delay of five seconds, the situation should get better. We may need to adjust the corresponding settings to improve further.
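The behavior described here, a single re-run after a fixed delay, can be illustrated with a small stand-alone sketch. It only illustrates the idea and makes no claim about how GH-1041 configures it; `run_with_one_retry` is a hypothetical name.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_one_retry(func: Callable[[], T], delay: float = 5.0) -> T:
    """Hypothetical helper: call func, and on failure retry once after a delay."""
    try:
        return func()
    except Exception:
        # Give the upstream service a moment to recover, then try once more.
        # A second failure propagates to the caller unchanged.
        time.sleep(delay)
        return func()
```

A single retry hides short outages but still surfaces failures that persist beyond the delay, which is why the delay and retry count are the settings to tune.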

amotl commented 1 year ago

Three failures of test_api_values_missing_null in two different PRs, GH-1038 and GH-1040, and when merging GH-1041 into the main branch, apparently only happening on Windows.

>       assert response.status_code == 200
E       assert 400 == 200
E        +  where 400 = <Response [400 Bad Request]>.status_code

-- https://github.com/earthobservations/wetterdienst/actions/runs/6395577981/job/17359553822#step:8:755 -- https://github.com/earthobservations/wetterdienst/actions/runs/6384459069/job/17359576279?pr=1038#step:8:751 -- https://github.com/earthobservations/wetterdienst/actions/runs/6396564510/job/17362614803?pr=1040#step:8:755

amotl commented 1 year ago
FAILED tests/test_api.py::test_api[False-wsv-pegel-kwargs8-None] - FileNotFoundError: ['https://pegelonline.wsv.de/webservices/rest-api/v2/stations.json?includeTimeseries=true&includeCharacteristicValues=true']
FAILED tests/test_api.py::test_api[True-wsv-pegel-kwargs8-None] - FileNotFoundError: ['https://pegelonline.wsv.de/webservices/rest-api/v2/stations.json?includeTimeseries=true&includeCharacteristicValues=true']

-- https://github.com/earthobservations/wetterdienst/actions/runs/6415265077/job/17416906491

amotl commented 1 year ago

Why does it fail? It looks like the pattern b'\x00\x00\x00\x00\x00...BUFR' should match the input data, but it apparently does not.

___ test_radar_request_site_historic_px250_bufr_yesterday ___
        # Verify data.
        header = b"\x00\x00\x00\x00\x00...BUFR"
>       assert re.match(header, payload), payload[:20]
E       AssertionError: b'\x00\x00\x00\x00\x00
E         \x86(BUFR
E         \x86(\x04\x00\x00\x16\x00'
E       assert None
E        +  where None = <function match at 0x7ff25cbc07c0>(b'\x00\x00\x00\x00\x00...BUFR', b'\x00\x00\x00\x00\x00\n\x86(BUFR\n\x86(\x04\x00\x00\x16

-- https://github.com/earthobservations/wetterdienst/actions/runs/6461108869/job/17540121478?pr=1055#step:8:756

gutzbenj commented 1 year ago
FAILED tests/provider/dwd/radar/test_api_historic.py::test_radar_request_radvor_re_timerange - AssertionError: RE110805100001023BY   1620189VS 5SW P300004HPR E-03INT  60GP 900x 900VV 000MF 00000008QN 016MS 91<deasb,deboo,dedrs,deeis,deess,defbg,defld,dehnr,deisn,demem,deneu,denhb,deoft,depro,detur>
assert None
 +  where None = <function match at 0x7fc9a762cd30>('RE......100001023BY   162....VS 5SW P30000.HPR E-03INT  60GP 900x 900VV 000MF 00000008QN 016MS...<(deasb,)?(deboo,)?(dedrs,)?(deeis,)?(deess,)?(defbg,)?(defld,)?(dehnr,)?(deisn,)?(demem,)?(demhp,)?(deneu,)?(denhb,)?(deoft,)?(depro,)?(deros,)?(detur,)?(deumd)?>', 'RE110805100001023BY   1620189VS 5SW P300004HPR E-03INT  60GP 900x 900VV 000MF 00000008QN 016MS 91<deasb,deboo,dedrs,deeis,deess,defbg,defld,dehnr,deisn,demem,deneu,denhb,deoft,depro,detur>')
 +    where <function match at 0x7fc9a762cd30> = re.match

-- https://github.com/earthobservations/wetterdienst/actions/runs/6492893684/job/17632737807
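One thing that stands out in this output: every station in the pattern except the last carries a trailing comma, so the pattern can only match when `deumd` closes the list. Here the list ends with `detur>`, which `(detur,)?` cannot consume. The sketch below reproduces this with a shortened station list; the `tolerant` pattern is only one possible alternative and is an assumption, not the project's fix.

```python
import re

# Shortened version of the station list from the failing pattern.
strict = r"<(deasb,)?(deboo,)?(detur,)?(deumd)?>"

# Assumed alternative: commas act as separators between known station names.
station = r"(?:deasb|deboo|detur|deumd)"
tolerant = rf"<{station}(?:,{station})*>"

ends_with_umd = "<deasb,deboo,detur,deumd>"
ends_with_tur = "<deasb,deboo,detur>"

print(bool(re.fullmatch(strict, ends_with_umd)))
print(bool(re.fullmatch(strict, ends_with_tur)))
print(bool(re.fullmatch(tolerant, ends_with_tur)))
```

Unlike the original, the tolerant pattern does not enforce station order, so it trades some strictness for robustness when a radar site is missing from the composite.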

gutzbenj commented 1 year ago

Why does it fail? It looks like the pattern b'\x00\x00\x00\x00\x00...BUFR' should match the input data, but it apparently does not.

___ test_radar_request_site_historic_px250_bufr_yesterday ___
        # Verify data.
        header = b"\x00\x00\x00\x00\x00...BUFR"
>       assert re.match(header, payload), payload[:20]
E       AssertionError: b'\x00\x00\x00\x00\x00
E         \x86(BUFR
E         \x86(\x04\x00\x00\x16\x00'
E       assert None
E        +  where None = <function match at 0x7ff25cbc07c0>(b'\x00\x00\x00\x00\x00...BUFR', b'\x00\x00\x00\x00\x00\n\x86(BUFR\n\x86(\x04\x00\x00\x16

-- https://github.com/earthobservations/wetterdienst/actions/runs/6461108869/job/17540121478?pr=1055#step:8:756

Not sure. I did try to run the regex, but also had problems with it.
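A likely explanation: in Python's `re` module, `.` does not match a newline unless `re.DOTALL` is set, and the first of the three bytes between the zero padding and `BUFR` in this payload is `\n`. The sketch below uses the bytes shown in the assertion output above.

```python
import re

# Payload prefix as shown in the failing assertion.
payload = b"\x00\x00\x00\x00\x00\n\x86(BUFR\n\x86(\x04\x00\x00\x16"
header = b"\x00\x00\x00\x00\x00...BUFR"

# Default mode: "." refuses to match the b"\n" byte, so there is no match.
print(re.match(header, payload))

# With DOTALL, "." matches any byte, including b"\n".
print(re.match(header, payload, flags=re.DOTALL) is not None)
```

This would also explain why the test is flaky rather than always failing: it only breaks when one of the three length bytes happens to be `0x0a`.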

amotl commented 11 months ago

It looks like DWD DMO is flaky with ValueError: month must be in 1..12? Can we do something about it so it doesn't fail Dependabot so hard?

It looks like it happened yesterday around that time:

Mon, 01 Jan 2024 19:31:17 GMT
Mon, 01 Jan 2024 19:38:42 GMT
Mon, 01 Jan 2024 19:41:25 GMT
thread '<unnamed>' panicked at py-polars/src/map/series.rs:219:19:
python function failed ValueError: month must be in 1..12
FAILED tests/test_api.py::test_api[False-dwd-dmo-kwargs2-None] - pyo3_runtime.PanicException: python function failed ValueError: month must be in 1..12

-- https://github.com/earthobservations/wetterdienst/actions/runs/7379476075/job/20075779108?pr=1151#step:6:4042

amotl commented 11 months ago

It looks like DWD DMO is flaky with ValueError: month must be in 1..12? Can we do something about it?

You submitted a fix with f42b6d61784e18 already? Thanks!

gutzbenj commented 11 months ago

You submitted a fix with f42b6d6 already? Thanks!

Exactly! I should probably also add a test for that. DWD DMO file names currently contain only month and day, which is pretty ugly to handle, especially around the turn of the year.
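For file names that carry only month and day, the missing year has to be inferred from the current date. The following is a minimal sketch of one way to do that; it is not the code from f42b6d6, `infer_timestamp` is a hypothetical helper, and the one-day tolerance is an assumption.

```python
from datetime import datetime, timedelta


def infer_timestamp(month: int, day: int, hour: int, now: datetime) -> datetime:
    """Hypothetical helper: attach a year to a month/day/hour stamp.

    Note: a 29 February stamp in a non-leap year is not handled here.
    """
    candidate = datetime(now.year, month, day, hour)
    # A stamp that would lie well in the future must belong to the previous year.
    if candidate > now + timedelta(days=1):
        candidate = candidate.replace(year=now.year - 1)
    return candidate


# Around the failures reported above: a file stamped 31 December, read on 1 January.
now = datetime(2024, 1, 1, 19, 31)
print(infer_timestamp(12, 31, 18, now))
print(infer_timestamp(1, 1, 12, now))
```

A test for this could pin `now` to a few dates on either side of 1 January, as in the two calls above.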