OSUKED / ElexonDataPortal

Python wrapper for the Elexon/BMRS API
https://osuked.github.io/ElexonDataPortal
MIT License

API call returns "The returned `data_content` must be one of: `list` or `OrderedDict`" error #18

Closed ayokariks closed 2 years ago

ayokariks commented 2 years ago

When trying to call one of the BMRS API services using this package, I get the following exception:


ValueError                                Traceback (most recent call last)
c:\Users\Admin\OneDrive\Desktop\esda_work\dissertation\bm_forcasting\data_processing.ipynb Cell 7' in
      1 #Aggregated Imbalance Volumes
----> 2 df_B1780 = client.get_B1780(start_date, end_date)
      3 df_B1780.head(3)

File ~\AppData\Roaming\Python\Python38\site-packages\ElexonDataPortal\api.py:880, in Client.get_B1780(self, start_date, end_date)
    867 def get_B1780(
    868     self,
    869     start_date: str='2020-01-01',
    870     end_date: str='2020-01-01 1:30',
    871 ):
    872     """
    873     Aggregated Imbalance Volumes
    874
   (...)
    877         end_date (str)
    878     """
--> 880     df = orchestrator.query_orchestrator(
    881         method='get_B1780',
    882         api_key=self.api_key,
    883         n_attempts=self.n_retry_attempts,
    884         request_type='SP_and_date',
    885         kwargs_map={'date': 'SettlementDate', 'SP': 'Period'},
    886         func_params=['APIKey', 'date', 'SP', 'ServiceType'],
    887         start_date=start_date,
    888         end_date=end_date,
    889     )
    891     return df

File ~\AppData\Roaming\Python\Python38\site-packages\ElexonDataPortal\dev\orchestrator.py:411, in query_orchestrator(method, api_key, request_type, kwargs_map, func_params, start_date, end_date, n_attempts, **kwargs)
    408 assert request_type in request_type_to_func.keys(), f"{request_type} must be one of: {', '.join(request_type_to_func.keys())}"
    409 request_func = request_type_to_func[request_type]
--> 411 df = request_func(
    412     method=method,
    413     api_key=api_key,
    414     n_attempts=n_attempts,
    415     **kwargs
    416 )
    418 df = df.reset_index(drop=True)
    420 return df

File ~\AppData\Roaming\Python\Python38\site-packages\ElexonDataPortal\dev\orchestrator.py:79, in SP_and_date_request(method, kwargs_map, func_params, api_key, start_date, end_date, n_attempts, **kwargs)
     75 assert len(missing_kwargs) == 0, f"The following kwargs are missing: {', '.join(missing_kwargs)}"
     77 r = retry_request(raw, method, kwargs, n_attempts=n_attempts)
---> 79 df_SP = utils.parse_xml_response(r)
     80 df = df.append(df_SP)
     82 df = utils.expand_cols(df)

File ~\AppData\Roaming\Python\Python38\site-packages\ElexonDataPortal\dev\utils.py:85, in parse_xml_response(r)
     83     df = pd.DataFrame(pd.Series(data_content)).T
     84 else:
---> 85     raise ValueError('The returned data_content must be one of: list or OrderedDict')
     87 return df

ValueError: The returned data_content must be one of: list or OrderedDict


This is the API service that caused this issue:

# Aggregated Imbalance Volumes
df_B1780 = client.get_B1780(start_date, end_date)
df_B1780.head(3)

ayokariks commented 2 years ago

@AyrtonB @peterdudfield

peterdudfield commented 2 years ago

Thanks for reporting the bug

Could you try a few other dates, perhaps spanning at least one full day? There's a chance you need to pull at least one day of data, but I'm not certain about this.

AyrtonB commented 2 years ago

Hi @ayokariks,

It should be possible to make the request using less than a day. I've just tried to replicate your code and had no issues. Is there anything different in your inputs?

from ElexonDataPortal import api

client = api.Client()

start_date = '2020-01-01'
end_date = '2020-01-01 1:30'

df_B1780 = client.get_B1780(start_date, end_date)

ayokariks commented 2 years ago

Hi Ayrton,

I have similar code but am still getting the error @AyrtonB

Kind regards,

Ayo

peterdudfield commented 2 years ago

What version of this repo are you running? What version of python are you using? Any other runtime information?
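
A short script along these lines can gather that information in one go. It is only a sketch using the standard library: `installed_version` and `runtime_report` are hypothetical helper names, and the distribution names passed in are assumptions about how the packages are named when installed.

```python
import platform
import sys
from importlib import metadata  # part of the standard library from Python 3.8
from typing import Dict, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def runtime_report(
    dist_names=("ElexonDataPortal", "xmltodict", "pandas"),
) -> Dict[str, Optional[str]]:
    """Collect interpreter, OS and package versions for a bug report."""
    report: Dict[str, Optional[str]] = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    # The names above are assumed distribution names; adjust as needed
    for name in dist_names:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in runtime_report().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue makes it easier to spot a dependency that differs between two machines running the same code.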

ayokariks commented 2 years ago

Hi @peterdudfield, I'm running ElexonDataPortal ~2.0.12 on Python 3.8.2. The only runtime output is the progress bar: B1780: 0%| | 0/3 [00:00<?, ?it/s]. It's weird because I've copied the same code from Ayrton and it still raises an exception for me.

peterdudfield commented 2 years ago

It's not a typo in your api_key, is it? Do the other data read functions work?

AyrtonB commented 2 years ago

Hi both,

I just tested with a dodgy API key and the client flagged it with the expected error (below):

RequestError: 403 - forbidden
An invalid API key has been passed

ayokariks commented 2 years ago

Hi guys,

I'll restart my session and try again. My API key is fine, though: I got it from the BMRS portal for my account, and other services work fine for me. @peterdudfield @AyrtonB

ayokariks commented 2 years ago

Hi guys @AyrtonB @peterdudfield, I've run it again. The API key is fine, and I'm running base Python 3.8.2 with the latest version of the Elexon portal Python wrapper, but I'm still hitting the exception in the code.

Looking at the utilities:

def parse_xml_response(r):
    r_dict = xmltodict.parse(r.text)

    status_check_response = check_status(r)
    if status_check_response is not None:
        return status_check_response

    capping_applied = check_capping(r)

    data_content = r_dict['response']['responseBody']['responseList']['item']

    if isinstance(data_content, list):
        df = expand_cols(pd.DataFrame(data_content))
    elif isinstance(data_content, OrderedDict):
        df = pd.DataFrame(pd.Series(data_content)).T
    else:
        raise ValueError('The returned `data_content` must be one of: `list` or `OrderedDict`')

    return df

I tested with another BMRS API Python wrapper, which retrieved the data as expected (its format isn't the best, haha; this is the best of the wrappers around). So it's definitely not a server-side issue.

inyutin commented 2 years ago

I get the same error while trying to get temperature data:

from ElexonDataPortal import api

client = api.Client('***')
start_date = '2020-01-01'
end_date = '2020-01-01 1:30'

df_B1610 = client.get_TEMP(start_date, end_date)

print(df_B1610.head(3))

Traceback (most recent call last):
  File "/home/dmitry/Code/tesseract/MMS-Ingest/scripts/elexon_client_test.py", line 7, in <module>
    df_B1610 = client.get_TEMP(start_date, end_date)
  File "/home/dmitry/Code/tesseract/MMS-Ingest/env/lib/python3.10/site-packages/ElexonDataPortal/api.py", line 1627, in get_TEMP
    df = orchestrator.query_orchestrator(
  File "/home/dmitry/Code/tesseract/MMS-Ingest/env/lib/python3.10/site-packages/ElexonDataPortal/dev/orchestrator.py", line 411, in query_orchestrator
    df = request_func(
  File "/home/dmitry/Code/tesseract/MMS-Ingest/env/lib/python3.10/site-packages/ElexonDataPortal/dev/orchestrator.py", line 175, in date_range_request
    df = utils.parse_xml_response(r)
  File "/home/dmitry/Code/tesseract/MMS-Ingest/env/lib/python3.10/site-packages/ElexonDataPortal/dev/utils.py", line 85, in parse_xml_response
    raise ValueError('The returned data_content must be one of: list or OrderedDict')
ValueError: The returned data_content must be one of: list or OrderedDict

Investigating the problem now.

inyutin commented 2 years ago

Let's take a closer look at the parsing function:


def parse_xml_response(r):
    r_dict = xmltodict.parse(r.text)

    status_check_response = check_status(r)
    if status_check_response is not None:
        return status_check_response

    capping_applied = check_capping(r)

    data_content = r_dict['response']['responseBody']['responseList']['item']

    if isinstance(data_content, list):
        df = expand_cols(pd.DataFrame(data_content))
    elif isinstance(data_content, OrderedDict):
        df = pd.DataFrame(pd.Series(data_content)).T
    else:
        raise ValueError('The returned `data_content` must be one of: `list` or `OrderedDict`')

    return df

@AyrtonB what's the reason for strictly requiring an OrderedDict? It seems the problem can be fixed by allowing the data to be a dict instead of an OrderedDict. At least, that change fixes my problem with the TEMP data. Moreover, OrderedDict is a subclass of dict, so the existing behaviour should be unaffected.
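
The subclass relationship is easy to confirm with the standard library alone. The sketch below uses two hypothetical functions, `strict_check` and `relaxed_check` (neither is part of ElexonDataPortal), and a made-up field name, to show that testing against `dict` accepts both plain dicts and OrderedDicts, while testing against `OrderedDict` rejects plain dicts:

```python
from collections import OrderedDict


def strict_check(data_content) -> bool:
    """Same types as the current condition: only lists and OrderedDicts pass."""
    return isinstance(data_content, (list, OrderedDict))


def relaxed_check(data_content) -> bool:
    """Proposed condition: lists and any dict (including OrderedDict) pass."""
    return isinstance(data_content, (list, dict))


# Illustrative payload; the key is invented for this example
plain = {"settlementDate": "2020-01-01"}
ordered = OrderedDict(plain)

print(issubclass(OrderedDict, dict))  # True
print(strict_check(plain))            # False
print(relaxed_check(plain))           # True
print(relaxed_check(ordered))         # True
```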

With your permission, I can open a PR to fix the problem.

AyrtonB commented 2 years ago

Previously, a successful response parsed by the xmltodict library contained only lists and OrderedDicts. As of 24 days ago, this has changed.

I've edited parse_xml_response to accept dictionaries as well in the latest commit, d9c6cfa. I've tested it: I managed to replicate the error and then remove it with this change. @ayokariks and @inyutin, you should be able to use the latest version, 2.0.14, with no issues. Let me know once you've confirmed this and I'll close the issue.
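
For reference, the idea behind the change can be sketched as follows. This is not the code from commit d9c6cfa: `normalise_items` is a hypothetical helper that only illustrates accepting any `dict` alongside `list`, and it returns a list of records rather than a DataFrame so that it needs nothing beyond the standard library. The field names in the sample payload are invented.

```python
from collections import OrderedDict
from typing import Any, Dict, List


def normalise_items(data_content: Any) -> List[Dict[str, Any]]:
    """Return the parsed `item` payload as a list of plain-dict records.

    A list means several items were returned; a single item arrives as one
    mapping, which may be a dict or an OrderedDict.
    """
    if isinstance(data_content, list):
        return [dict(item) for item in data_content]
    if isinstance(data_content, dict):  # also true for OrderedDict
        return [dict(data_content)]
    raise ValueError(
        "The returned `data_content` must be one of: `list` or `dict`, "
        f"got `{type(data_content).__name__}`"
    )


# Illustrative payloads; the field names are made up for this example
single = {"settlementPeriod": "1", "imbalanceQuantity": "10.5"}
print(normalise_items(single))
print(normalise_items(OrderedDict(single)))
print(normalise_items([single, single]))
```

Checking against `dict` rather than `OrderedDict` keeps the behaviour the same for older parser output while also accepting plain dictionaries.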

Thanks all for identifying the bug and narrowing down the cause!

ayokariks commented 2 years ago

Hi @AyrtonB @peterdudfield, I appreciate you guys building this wrapper; I'm actually using it for my disso at UCL. I tried the latest version and it works now, thank you.

AyrtonB commented 2 years ago

Great, it's working! Good luck with the dissertation.