DemandRegioTeam / disaggregator

A set of tools for processing of spatial and temporal disaggregations.
GNU General Public License v3.0

Bug: disaggregator not working with new ffe opendata portal due to changed API #20

Open nesnoj opened 3 months ago

nesnoj commented 3 months ago

After the relaunch of the FfE open data portal I tried to get the disaggregator working:

tl;dr: The interface must be adapted; until then, the disaggregator is unusable. As of now, the only way to use the data is the Python approach described in the how-to.

Details: I updated the remote URL and had to restrict ruamel.yaml to <0.18.0 to get it installed and running (cf. forked version).

However, the API seems to be incompatible. E.g. when running

from disaggregator import spatial
spatial.disagg_CTS_industry(sector='industry', source='gas', use_nuts3code=True, year=2022).sum().sum()

I get

2024-05-21 17:56:17 disaggregator.config: INFO     Querying from:
https://api.opendata.ffe.de/demandregio_spatial?id_spatial=eq.71&&year=eq.2022&&value=gt.0.0
2024-05-21 17:56:19 disaggregator.config: ERROR    statusCode                   404
message       Resource not found
dtype: object

Apparently, the new API has a different format (cf. how-to), e.g. the former operators such as eq. are missing. So I removed the prefixes in data.py but without success, still getting

2024-05-21 17:43:02 disaggregator.config: INFO     Querying from:
https://api.opendata.ffe.de/demandregio_spatial?id_spatial=71&&year=2022
2024-05-21 17:43:04 disaggregator.config: ERROR    statusCode                   404
message       Resource not found
dtype: object

jferstl commented 3 months ago

Hello @nesnoj, you're right: for IT security reasons we had to make some drastic changes to our open data platform, which included a complete redesign of the way we store and retrieve the data. We have not been able to rebuild the former database with all its functionality (e.g. filtering with operators like gt / ge etc.).

Not knowing the internals of the disaggregator tool, I cannot really tell how many adaptations you need to implement to make the tool fully functional again, but you have already figured out the most important ones (i.e. different URL, unsupported operators in the query parameters).

Now regarding your specific problem: I just checked and the year 2022 is not available for id_spatial 71 in the dataset. Are you sure this existed at some point or did you just choose these parameters randomly?

Also, the URL is not quite correct as you need to use https://api.opendata.ffe.de/demandregio/demandregio_spatial?id_spatial=71&year=2016 instead of https://api.opendata.ffe.de/demandregio_spatial?id_spatial=71&&year=2022.

You can refer to the new API documentation for further details.
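As a rough illustration of the two changes named above (new URL prefix, no operator prefixes), here is a minimal sketch that rewrites an old-style query string into the new form. The function name `translate_query` and the idea of returning the unsupported comparisons for later client-side filtering are assumptions for illustration, not part of the disaggregator or the FfE API. The operators `lt` and `le` are also assumed; only `eq`, `gt` and `ge` are mentioned in this thread.

```python
from urllib.parse import urlencode

# New base path, taken from the corrected URL in this thread.
BASE_URL = "https://api.opendata.ffe.de/demandregio/"

# Operator prefixes of the old API. "lt" and "le" are an assumption.
OLD_OPERATORS = {"eq", "gt", "ge", "lt", "le"}


def translate_query(old_query):
    """Hypothetical helper: turn 'table?key=eq.1&&key2=gt.0' into a new-style URL.

    Returns (url, deferred), where deferred lists the (key, operator, value)
    comparisons the new API cannot express and that must be applied locally.
    """
    table, _, param_str = old_query.partition("?")
    params = {}
    deferred = []
    for part in param_str.split("&"):
        if not part:
            continue  # "&&" separators leave empty fragments
        key, _, raw = part.partition("=")
        op, sep, value = raw.partition(".")
        if not sep or op not in OLD_OPERATORS:
            params[key] = raw  # already a plain value
        elif op == "eq":
            params[key] = value  # equality becomes a plain parameter
        else:
            deferred.append((key, op, value))  # e.g. value=gt.0.0
    url = BASE_URL + table
    if params:
        url += "?" + urlencode(params)
    return url, deferred


url, deferred = translate_query(
    "demandregio_spatial?id_spatial=eq.71&&year=eq.2016&&value=gt.0.0"
)
print(url)
print(deferred)
```

Whether a given combination of `id_spatial` and `year` exists in the dataset still has to be checked against the API documentation; the sketch only reshapes the query string.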

nesnoj commented 3 months ago

Thx for your quick reply @jferstl! And thanks for the URL, now some queries work again to some extent. E.g.

spatial.disagg_households_power(
    by="households",
    weight_by_income=True,
    year=2022,
    scale_by_pop=True,
)

This builds the query https://api.opendata.ffe.de/demandregio/demandregio_spatial?id_spatial=13 (after chucking away the other query parameters), returns data, but then crashes:

2024-05-22 08:51:56 disaggregator.config: ERROR    id_spatial                                                     13
title                          Stromverbrauch nach Haushaltsgröße
oep_metadata    {'name': 'id_spatial=13', 'title': 'Energy con...
ffe_metadata                              {'id_region_type': [3]}
data            [{'id_spatial': 13, 'id_region_type': 3, 'id_r...
dtype: object
Traceback (most recent call last):
  File "/some/path/disaggregator/disaggregator/config.py", line 127, in database_raw
    df = pd.read_json(BytesIO(requests.get(host + query).content),
...

The query in the first post was an example generated by the disaggregator that worked before the relaunch; it still does not work with the fixed URL, though.

There are multiple bugs (see other issues) that we fixed in a fork, but I don't think we will rework the API.
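The logged response above suggests that the new API wraps the records in an object with keys such as `title`, `oep_metadata`, `ffe_metadata` and `data`, instead of returning a flat list of rows. A minimal sketch of how the rows could be unwrapped and filtered locally is shown below. The helper names `extract_records` and `apply_filters` are hypothetical, and the sample payload is made up for illustration (the `value` field is assumed from the old `value=gt.0.0` filter); it is not real FfE data.

```python
import json
import operator

# Comparisons the new API no longer supports and that are applied client-side.
COMPARATORS = {
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


def extract_records(raw):
    """Hypothetical helper: return the list of rows stored under the 'data' key."""
    payload = json.loads(raw)
    if not isinstance(payload, dict) or "data" not in payload:
        raise ValueError("unexpected response layout: no 'data' key found")
    return payload["data"]


def apply_filters(records, filters):
    """Keep only rows satisfying every (key, operator, value) comparison."""
    for key, op, value in filters:
        compare = COMPARATORS[op]
        threshold = float(value)
        records = [row for row in records if compare(row[key], threshold)]
    return records


# Made-up payload mimicking the layout seen in the log above.
sample = json.dumps({
    "id_spatial": 13,
    "title": "Stromverbrauch nach Haushaltsgröße",
    "data": [
        {"id_spatial": 13, "id_region_type": 3, "value": 0.0},
        {"id_spatial": 13, "id_region_type": 3, "value": 12.5},
    ],
})

rows = apply_filters(extract_records(sample), [("value", "gt", "0.0")])
print(rows)
```

The resulting list of dictionaries can be passed to `pd.DataFrame(rows)` to obtain a table similar to what the disaggregator previously read directly from the response.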

YifeiLu commented 3 months ago

Hello, I also used the DemandRegio disaggregator for my work. I want to generate some new data right now, but as already mentioned in this issue, the disaggregator no longer works because of the changes to the API and the data structure. I have managed to work around some of the problems using local files, but there are still many datasets for which I can't find an exact replacement. For example, in the data.database_get() function:

    if dimension in ['spatial', 'temporal']:
        id_name = 'id_' + dimension
        if dimension == 'spatial':
            if cfg['use_nuts_2016'] and table_id in cfg['nuts3_tables']:
                table = 'v_demandregio_spatial_lk401'
                # table = 'demandregio_spatial'
            else:
                table = 'demandregio_spatial'
        else:
            table = 'demandregio_temporal'

I cannot find the v_demandregio_spatial_lk401 dataset so far. This means I can only access data under https://api.opendata.ffe.de/demandregio/demandregio_spatial, which doesn't give me the same results as those shown in the example notebooks.