bundesAPI / smard-api

https://smard.api.bund.dev

csv download from smard.de via python script fails #9

Closed ThomasRegier closed 1 year ago

ThomasRegier commented 2 years ago

I just posted in issue #7 how the example in /smard-api/python-client/ fails. But why would a special library even be needed when the use case is quite common?

I simply want to download the file that the download button provides on the following smard.de page:

https://www.smard.de/home/downloadcenter/download-marktdaten#!?downloadAttributes=%7B%22selectedCategory%22:1,%22selectedSubCategory%22:1,%22selectedRegion%22:%22DE%22,%22from%22:1658872800000,%22to%22:1659563999999,%22selectedFileType%22:%22CSV%22%7D

By inspecting the site's network traffic, I found that the following URL and payload are used:

    url = 'https://www.smard.de/nip-download-manager/nip/download/market-data'
    payload = {'format': 'CSV',
               'language': 'de',
               'moduleIds': [1001224, 1004066, 1004067, 1004068, 1001223, 1004069,
                             1004071, 1004070, 1001226, 1001228, 1001227, 1001225],
               'region': 'DE',
               'timestamp_from': 1659304800000,
               'timestamp_to': 1659391199999,
               'type': 'discrete'}

Usually I should be able to perform this request:

`x = requests.post(url, json = json.dumps(payload))`

But I receive the following error:

`The requested URL was rejected.`

How may this be solved?
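A side note on the call above, independent of the rejection itself: the `json=` parameter of `requests.post` serializes its argument on its own, so passing `json.dumps(payload)` encodes the payload twice and the server receives a JSON string rather than a JSON object. The sketch below shows the effect with the standard library only; the payload is a shortened copy of the one in the post (the `moduleIds` list is truncated for brevity), and nothing here confirms that this is what triggers the error message.

```python
import json

# Shortened copy of the payload from the post (moduleIds truncated for brevity).
payload = {"format": "CSV", "language": "de",
           "moduleIds": [1001224, 1004066], "region": "DE"}

# What json=json.dumps(payload) effectively sends: the payload serialized twice.
double_encoded = json.dumps(json.dumps(payload))
# What json=payload sends: the payload serialized once.
single_encoded = json.dumps(payload)

# A server decoding the body once gets a str in the first case, a dict in the second.
print(type(json.loads(double_encoded)).__name__)  # str
print(type(json.loads(single_encoded)).__name__)  # dict
```

With requests, the single-encoded form corresponds to `requests.post(url, json=payload)`.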

lukaspanni commented 1 year ago

The error shows that the site is blocking your request. I guess they are using a cookie (or a combination of several cookies) to block direct access to the download endpoint. The download page sets these four cookies: [screenshot: grafik]. They are probably used to identify valid requests. A workaround could be to send a GET request to the aforementioned download page first and then reuse the cookies it sets in the POST request.
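The workaround suggested above could look roughly like the following sketch, which uses only the standard library. The two URLs are taken from this thread; the function names (`make_cookie_opener`, `build_post`, `download`) are hypothetical, and it is an unverified assumption that cookies are what the server checks. Cookies set by JavaScript in the browser would not be captured this way.

```python
import json
import urllib.request
from http.cookiejar import CookieJar

# URLs as quoted in the thread; the cookie theory itself is unverified.
DOWNLOAD_PAGE = "https://www.smard.de/home/downloadcenter/download-marktdaten"
DOWNLOAD_ENDPOINT = "https://www.smard.de/nip-download-manager/nip/download/market-data"


def make_cookie_opener():
    """Return an opener that stores response cookies, plus the jar holding them."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_post(url, payload):
    """Build (but do not send) a POST request with a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def download(payload):
    """GET the download page to collect cookies, then POST with the same opener."""
    opener, _jar = make_cookie_opener()
    opener.open(DOWNLOAD_PAGE).close()  # cookies from this response land in the jar
    with opener.open(build_post(DOWNLOAD_ENDPOINT, payload)) as response:
        return response.read()
```

Calling `download(payload)` performs both requests over the network; the helper functions are kept separate so the request can be inspected before anything is sent.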

ThomasRegier commented 1 year ago

@lukaspanni thank you for your help.

I have already found the error and built my data pipeline and analysis. I had not copied the complete payload. Here is a complete description of the issue and the solution: https://stackoverflow.com/questions/73219857/perform-download-via-download-button-when-request-url-in-browser-inspect-does-no/73743659#73743659

Btw: the data quality does not seem to be that great. In particular, when I build a check that compares electricity production with consumption, storage use and export, I get errors of more than 3 GWh per hour.
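A balance check of the kind described could be sketched as follows. The field names, the sign convention (generation minus consumption, net export and storage charging) and the numbers are all assumptions made for illustration; they are not SMARD column names or SMARD data.

```python
def balance_residual_mwh(generation, consumption, net_export, storage_charging):
    """Residual of the hourly energy balance in MWh; 0 means the figures add up."""
    return generation - consumption - net_export - storage_charging


# Made-up hourly values in MWh, for illustration only (not SMARD data).
hours = [
    {"generation": 60000, "consumption": 55000, "net_export": 4000, "storage_charging": 1000},
    {"generation": 58000, "consumption": 52000, "net_export": 2000, "storage_charging": 500},
]

residuals = [balance_residual_mwh(**h) for h in hours]
# Flag hours whose residual exceeds 3 GWh (3000 MWh) in either direction.
flagged = [i for i, r in enumerate(residuals) if abs(r) > 3000]
print(residuals, flagged)  # [0, 3500] [1]
```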