OpenEnergyPlatform / open-MaStR

A collaborative software to download the energy database Marktstammdatenregister (MaStR)
https://open-mastr.readthedocs.io/en/latest/
GNU Affero General Public License v3.0

"_extended" tables empty for selected bulk download data #547

Closed clwehner closed 1 month ago

clwehner commented 1 month ago

Description of the issue

First, thank you for developing this package. I have been using the latest version of open_mastr (0.14.4) in combination with a self-hosted PostgreSQL database. When I do a bulk download to dump the latest data into my db, everything works fine for the wind_eeg and solar_eeg tables. The wind_extended and solar_extended tables, on the other hand, remain empty.

Steps to Reproduce

  1. init mastr
    db = open_mastr.Mastr(engine=my_psql_engine)
  2. start bulk download
    db.download(
        method=params["mastr_method"], # "bulk"
        data=params["mastr_data"], # ["wind","solar"]
        date=params["mastr_date"], # "today"
    )

Ideas of solution

I looked into the downloaded Gesamtdatenexport_20240715.zip file and found the following .xml files. I shortened the numbered ones to save space. Going through your code, I noticed that EinheitenSolar.xml and EinheitenWind.xml are missing from my extracted Gesamtdatenexport_20240715.zip.

Archive:  Gesamtdatenexport_20240715.zip
  inflating: mastr/AnlagenEegBiomasse.xml  
  inflating: mastr/AnlagenEegGeothermieGrubengasDruckentspannung.xml  
  inflating: mastr/AnlagenEegSolar_1.xml  
...
  inflating: mastr/AnlagenEegSpeicher_1.xml  
...
  inflating: mastr/AnlagenEegWasser.xml  
  inflating: mastr/AnlagenEegWind.xml  
  inflating: mastr/AnlagenGasSpeicher.xml  
  inflating: mastr/AnlagenKwk.xml    
  inflating: mastr/AnlagenStromSpeicher_1.xml  
...
  inflating: mastr/Bilanzierungsgebiete.xml  
  inflating: mastr/EinheitenAenderungNetzbetreiberzuordnungen.xml  
  inflating: mastr/EinheitenGasErzeuger.xml  
  inflating: mastr/EinheitenGasSpeicher.xml  
  inflating: mastr/EinheitenGasverbraucher.xml  
  inflating: mastr/EinheitenGenehmigung.xml  
  inflating: mastr/EinheitenStromVerbraucher.xml  
  inflating: mastr/Einheitentypen.xml  
  inflating: mastr/Ertuechtigungen.xml  
  inflating: mastr/GeloeschteUndDeaktivierteEinheiten_1.xml  
  inflating: mastr/GeloeschteUndDeaktivierteEinheiten_2.xml  
  inflating: mastr/Katalogkategorien.xml  
  inflating: mastr/Katalogwerte.xml  
  inflating: mastr/Lokationen_1.xml  
...
  inflating: mastr/Marktakteure_1.xml  
...
  inflating: mastr/Marktrollen.xml   
  inflating: mastr/Netzanschlusspunkte_1.xml  
...
  inflating: mastr/Netze.xml

Any idea what might have gone wrong?
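The manual inspection above can also be scripted. The following is a minimal sketch, not part of open_mastr: the helper names `missing_stems` and `missing_in_zip` are hypothetical, and the list of required files is only an assumption based on the two file names mentioned in this report.

```python
import re
import zipfile
from pathlib import PurePosixPath

# Assumption: the "_extended" tables need these files, as named in the report above.
REQUIRED_STEMS = ("EinheitenSolar", "EinheitenWind")


def missing_stems(member_names, required=REQUIRED_STEMS):
    """Return the required file stems that have no matching .xml member.

    Numbered parts such as `EinheitenSolar_1.xml` count as `EinheitenSolar`.
    """
    found = set()
    for name in member_names:
        path = PurePosixPath(name)  # zip member names use forward slashes
        if path.suffix.lower() != ".xml":
            continue
        # Strip a trailing "_<number>" so split files map to one stem.
        found.add(re.sub(r"_\d+$", "", path.stem))
    return [stem for stem in required if stem not in found]


def missing_in_zip(zip_path, required=REQUIRED_STEMS):
    """Open a bulk-download archive and report which required stems are absent."""
    with zipfile.ZipFile(zip_path) as archive:
        return missing_stems(archive.namelist(), required)
```

Calling `missing_in_zip("Gesamtdatenexport_20240715.zip")` on an archive like the one listed above would be expected to report both stems as missing, since neither file appears in the listing.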

FlorianK13 commented 1 month ago

Hi @clwehner and thanks for your issue. open-mastr should not change anything in the zipped folder. If .xml files are missing there, this is probably due to an error from BNetzA. The bulk download file changes every day, so maybe the missing files appear again tomorrow. You can always check the raw data source at https://www.marktstammdatenregister.de/MaStR/Datendownload

FlorianK13 commented 1 month ago

Also, when checking https://www.marktstammdatenregister.de/MaStR/Datendownload I saw that the data from yesterday is roughly 1 GB, whereas the data from 01.07.2024 is ~2 GB, so I assume the wind and solar tables are missing today.

clwehner commented 1 month ago

Thanks for taking the time to look into it. I found the same after writing this issue, and again this morning. I am going to try to reach out to the BNetzA and ask them about the difference. Maybe they decided to supply the entire dataset only once every quarter.

If that turns out to be true, another date option for Mastr.download() for the "latest complete dataset" might be nice. For now, I manually downloaded the latest and biggest one and tried to read it with open_mastr by passing Mastr.download(date="20240701"), as described in the docstring. Unfortunately, I found that the bulk download has a != "today" check, which I had to override to make it work. If implementing the "latest complete dataset" option seems reasonable, I can also help. :)
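One possible shape for such a "latest complete dataset" lookup, sketched under assumptions: it only scans a local folder of already downloaded archives, the function names are hypothetical and not part of open_mastr, the file name pattern follows the Gesamtdatenexport_20240715.zip example from this thread, and "complete" here only means that the two files reported missing above are present.

```python
import re
import zipfile
from pathlib import Path, PurePosixPath

# Assumption: an export counts as complete if these files exist (possibly split into parts).
REQUIRED = ("EinheitenSolar", "EinheitenWind")
# Assumption: local archives are named like Gesamtdatenexport_YYYYMMDD.zip.
NAME_PATTERN = re.compile(r"^Gesamtdatenexport_(\d{8})\.zip$")


def is_complete(zip_path, required=REQUIRED):
    """Check that every required stem matches at least one member of the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        basenames = [PurePosixPath(name).name for name in archive.namelist()]
    return all(any(base.startswith(stem) for base in basenames) for stem in required)


def latest_complete_export(folder, required=REQUIRED):
    """Return the newest archive in `folder` that passes `is_complete`, or None."""
    dated = []
    for path in Path(folder).iterdir():
        match = NAME_PATTERN.match(path.name)
        if match:
            dated.append((match.group(1), path))
    # YYYYMMDD strings sort chronologically, so reverse order means newest first.
    for _, path in sorted(dated, reverse=True):
        if is_complete(path, required):
            return path
    return None
```

The function returns a `pathlib.Path`, so the export date can be read back from the file name. Whether that date can then be handed to Mastr.download() depends on the "today" check mentioned above.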

FlorianK13 commented 1 month ago

Could you tell me their response when you get one? However, I'm quite sure that this was not done intentionally and that future datasets will have the extended tables again.

pl52feve commented 1 month ago

Hi, so, I didn't get a reply from BNetzA, but it seems that all tables are back in the download today.

FlorianK13 commented 1 month ago

Based on the comment of @pl52feve this issue seems to be solved, hence I will close it.