JohnPaton / airbase

🌬 An easy downloader for the AirBase air quality data.
https://airbase.readthedocs.io
MIT License

data retrieval fails #57

Closed charlienegri closed 2 weeks ago

charlienegri commented 3 weeks ago

The following command has been failing consistently since the 20th of August with a runtime error; the tasks get either a 301 or a 401:

airbase download --path [my/output/path] --year 2024 -p SO2 -p PM10 -p O3 -p NO2 -p CO -p NO -p PM2.5

ClientResponseError: 401, message='', url=URL('https://fme.discomap.eea.europa.eu/fmedatastreaming/AirQualityDownload/AQData_Extract.fmw?CountryCode=HR&CityName=&Pollutant=38&Year_from=2024&Year_to=2024&Station=&Samplingpoint=&Source=All&Output=TEXT&UpdateDate=')
Task exception was never retrieved
future: <Task finished name='Task-9' coro=<fetcher.<locals>.fetch() done, defined at /modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/airbase/fetch.py:122> exception=ClientResponseError(RequestInfo(url=URL('https://fme.discomap.eea.europa.eu/fmedatastreaming/AirQualityDownload/AQData_Extract.fmw?CountryCode=XK&CityName=&Pollutant=10&Year_from=2024&Year_to=2024&Station=&Samplingpoint=&Source=All&Output=TEXT&UpdateDate='), method='GET', headers=<CIMultiDictProxy('Host': 'fme.discomap.eea.europa.eu', 'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'Python/3.10 aiohttp/3.9.5')>, real_url=URL('https://fme.discomap.eea.europa.eu/fmedatastreaming/AirQualityDownload/AQData_Extract.fmw?CountryCode=XK&CityName=&Pollutant=10&Year_from=2024&Year_to=2024&Station=&Samplingpoint=&Source=All&Output=TEXT&UpdateDate=')), (<ClientResponse(http://fme.discomap.eea.europa.eu/fmedatastreaming/AirQualityDownload/AQData_Extract.fmw?CountryCode=XK&CityName=&Pollutant=10&Year_from=2024&Year_to=2024&Station=&Samplingpoint=&Source=All&Output=TEXT&UpdateDate=) [301 Moved Permanently]>
<CIMultiDictProxy('Server': 'nginx', 'Date': 'Mon, 26 Aug 2024 04:52:48 GMT', 'Content-Type': 'text/html', 'Content-Length': '162', 'Connection': 'keep-alive', 'Location': 'https://fme.discomap.eea.europa.eu/fmedatastreaming/AirQualityDownload/AQData_Extract.fmw?CountryCode=XK&CityName=&Pollutant=10&Year_from=2024&Year_to=2024&Station=&Samplingpoint=&Source=All&Output=TEXT&UpdateDate=')>
,), status=401, headers=<CIMultiDictProxy('Server': 'nginx', 'Date': 'Mon, 26 Aug 2024 04:52:48 GMT', 'Content-Type': 'text/html;charset=utf-8', 'Content-Length': '437', 'Connection': 'keep-alive', 'WWW-Authenticate': 'Basic realm="FME Server Authentication"', 'Content-Language': 'en', 'Strict-Transport-Security': 'max-age=15780000; includeSubDomains')>)>
Traceback (most recent call last):
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/airbase/fetch.py", line 126, in fetch
    r.raise_for_status()
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/aiohttp/client_reqrep.py", line 1070, in raise_for_status
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 401, message='', url=URL('https://fme.discomap.eea.europa.eu/fmedatastreaming/AirQualityDownload/AQData_Extract.fmw?CountryCode=XK&CityName=&Pollutant=10&Year_from=2024&Year_to=2024&Station=&Samplingpoint=&Source=All&Output=TEXT&UpdateDate=')
Task exception was never retrieved
future: <Task finished name='Task-13' coro=<fetcher.<locals>.fetch() done, defined at /modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/airbase/fetch.py:122> exception=RuntimeError('Session is closed')>
Traceback (most recent call last):
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/airbase/fetch.py", line 125, in fetch
    async with session.get(url, ssl=False) as r:
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/aiohttp/client.py", line 1197, in __aenter__
    self._resp = await self._coro
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/aiohttp/client.py", line 428, in _request
    raise RuntimeError("Session is closed")
RuntimeError: Session is closed
Task exception was never retrieved
future: <Task finished name='Task-12' coro=<fetcher.<locals>.fetch() done, defined at /modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/airbase/fetch.py:122> exception=ServerDisconnectedError('Server disconnected')>
Traceback (most recent call last):
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/airbase/fetch.py", line 125, in fetch
    async with session.get(url, ssl=False) as r:
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/aiohttp/client.py", line 1197, in __aenter__
    self._resp = await self._coro
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/aiohttp/client.py", line 608, in _request
    await resp.start(conn)
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/aiohttp/client_reqrep.py", line 976, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/aiohttp/streams.py", line 640, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected
Task exception was never retrieved
future: <Task finished name='Task-14' coro=<fetcher.<locals>.fetch() done, defined at /modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/airbase/fetch.py:122> exception=RuntimeError('Session is closed')>
Traceback (most recent call last):
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/airbase/fetch.py", line 125, in fetch
    async with session.get(url, ssl=False) as r:
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/aiohttp/client.py", line 1197, in __aenter__
    self._resp = await self._coro
  File "/modules/rhel8/user-apps/fou-modules/airbase/0.8.0/venv/lib/python3.10/site-packages/aiohttp/client.py", line 428, in _request
    raise RuntimeError("Session is closed")
RuntimeError: Session is closed
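
The repeated "Task exception was never retrieved" messages in the log above are what asyncio prints when a task fails and nothing ever awaits its result. Below is a minimal sketch of collecting per-request failures instead. It assumes a hypothetical `fetch_one` coroutine and a hypothetical `FakeHTTPError` class standing in for the real request and for `aiohttp.ClientResponseError`; it is not airbase's actual code, and the URLs are placeholders.

```python
import asyncio


class FakeHTTPError(Exception):
    """Hypothetical stand-in for aiohttp.ClientResponseError."""

    def __init__(self, status: int, url: str) -> None:
        super().__init__(f"{status} for {url}")
        self.status = status
        self.url = url


async def fetch_one(url: str) -> str:
    """Hypothetical fetcher: pretends every URL containing 'fail' returns 401."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if "fail" in url:
        raise FakeHTTPError(401, url)
    return f"payload from {url}"


async def fetch_all(urls: list[str]) -> tuple[list[str], list[BaseException]]:
    """Run all fetches and split the outcomes into successes and failures."""
    # return_exceptions=True makes gather return raised exceptions as ordinary
    # results, so every task's outcome is retrieved by the caller.
    outcomes = await asyncio.gather(
        *(fetch_one(u) for u in urls), return_exceptions=True
    )
    ok = [o for o in outcomes if not isinstance(o, BaseException)]
    failed = [o for o in outcomes if isinstance(o, BaseException)]
    return ok, failed


if __name__ == "__main__":
    urls = ["https://example.invalid/ok", "https://example.invalid/fail"]
    ok, failed = asyncio.run(fetch_all(urls))
    print(len(ok), "succeeded;", len(failed), "failed")
```

With this pattern a single 401 no longer tears down the other requests, and the caller can report which URLs failed and with which status.
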
charlienegri commented 3 weeks ago

Thanks to @avaldebe, I realized that this is failing because the endpoint https://discomap.eea.europa.eu/map/fme/AirQualityExport.htm appears to require a login with a password at the moment. I will write to EEA to enquire about it.

[Screenshot from 2024-08-26 11-06-25]
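
The login requirement is also visible in the log above: the 401 response carries a `WWW-Authenticate` header with a Basic challenge. As a rough sketch, a helper like the following could tell an authentication demand apart from other failures. The function name `auth_challenge` is hypothetical and not part of airbase or aiohttp.

```python
from typing import Mapping, Optional


def auth_challenge(status: int, headers: Mapping[str, str]) -> Optional[str]:
    """Return the WWW-Authenticate challenge for a 401 response.

    Returns None when the status is not 401, and an empty string when the
    response is a 401 without a WWW-Authenticate header.
    """
    if status != 401:
        return None
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("www-authenticate", "")


if __name__ == "__main__":
    # Status and header value copied from the traceback in this issue.
    headers = {"WWW-Authenticate": 'Basic realm="FME Server Authentication"'}
    print(auth_challenge(401, headers))
```
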

JohnPaton commented 3 weeks ago

I've noticed this too. Do report back what they say. In the meantime, we are working on updating this client to download from the new download service instead.

charlienegri commented 2 weeks ago

They have updated the info, and apparently they have introduced a token: https://discomap.eea.europa.eu/map/fme/AirQualityExport.htm 🤦‍♂️

[Screenshot from 2024-08-30 13-48-16]

JohnPaton commented 2 weeks ago

Looks like the token is static. Interesting choice, but should hopefully make it easy.

Ideally we won't have to deal with this once we've switched over to the new service. I'll ping back here once that's released.
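
If the token really is static, attaching it to each request URL could look roughly like the sketch below. This thread does not state the parameter name or the token value, so the `token` query parameter and the `PLACEHOLDER` value are assumptions for illustration only, and `with_token` is a hypothetical helper rather than airbase code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_token(url: str, token: str, param: str = "token") -> str:
    """Return `url` with `param=token` appended to its query string.

    The default parameter name is an assumption, not the documented one.
    """
    parts = urlsplit(url)
    # keep_blank_values=True preserves empty parameters such as "CityName=".
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param  # drop any existing token so it is not duplicated
    ]
    query.append((param, token))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    base = (
        "https://fme.discomap.eea.europa.eu/fmedatastreaming/"
        "AirQualityDownload/AQData_Extract.fmw"
        "?CountryCode=HR&CityName=&Pollutant=38"
    )
    print(with_token(base, "PLACEHOLDER"))
```

Keeping the token in one helper means it can be dropped in a single place once the client moves to the new download service.
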

avaldebe commented 2 weeks ago

I'll prepare a short PR so we can get the tool working until the end of the year. That should give us more time to get the Parquet-related PRs (e.g. #54) in good shape.

charlienegri commented 2 weeks ago

I asked EEA support how to get a machine-downloadable metadata file from https://discomap.eea.europa.eu/App/AQViewer/index.html?fqn=Airquality_Dissem.b2g.measurements. I'll keep you posted on what they answer.