NOWUM / open-energy-data-server

OEDS - Open Energy Data Server - Scripts for a reproducible database server for energy informatics data sets
https://monitor.nowum.fh-aachen.de/d/FEg9kde7z/entsoe-monitor?orgId=1

feat: add regelleistung crawler #14

Closed: st3113n closed this 2 months ago

maurerle commented 3 months ago

Also, approximately how long does it take to crawl the whole dataset? Did you make sure that the crawler works fine when it is run again after a while (for example today, to download only the data since the last crawl)?

st3113n commented 3 months ago

> Also, approximately how long does it take to crawl the whole dataset? Did you make sure that the crawler works fine when it is run again after a while (for example today, to download only the data since the last crawl)?

I can start the whole crawl again once the comments are resolved. For now I would estimate a few hours to crawl the whole dataset back to 2020-01-01. For each table, the crawler first checks the latest date already stored and adds all dates from that date up to and including yesterday. It then checks the earliest date stored in the table, and if the configured earliest date to write lies before that, it adds the missing dates in between. So if the crawler stops at any point for any reason, it can simply be started again and will only add the missing entries.
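The per-table gap logic described above can be sketched roughly as follows. This is only an illustration, not the code from the PR: the function names `missing_dates` and `date_range` are hypothetical, and the sketch assumes that the latest stored day is already complete, so the forward fill starts on the day after it.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Return all dates from start to end inclusive (empty if start > end)."""
    days = (end - start).days
    return [start + timedelta(days=i) for i in range(days + 1)]


def missing_dates(earliest_in_table, latest_in_table, earliest_wanted, today):
    """Dates still to crawl for one table (hypothetical helper).

    earliest_in_table / latest_in_table are None for an empty table.
    Assumption: the latest stored day is complete, so the forward fill
    starts one day after it.
    """
    yesterday = today - timedelta(days=1)
    if earliest_in_table is None or latest_in_table is None:
        # Nothing stored yet: crawl the full configured range.
        return date_range(earliest_wanted, yesterday)
    # Forward fill: everything after the latest stored date up to yesterday.
    forward = date_range(latest_in_table + timedelta(days=1), yesterday)
    # Backward fill: only if the configured start lies before the stored data.
    backward = date_range(earliest_wanted, earliest_in_table - timedelta(days=1))
    return forward + backward
```

Because the result only depends on what is already stored, calling it again after an interrupted run yields just the remaining dates, and an up-to-date table yields an empty list.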

maurerle commented 2 months ago

@st3113n this looks quite ready to merge - could you fix the open discussions please? :)

maurerle commented 2 months ago

At some point I got the error: column "note" of relation "afrr_ergebnisse_regelarbeit" does not exist.

Recovery then failed because log.info(new_data["date_from"]) raised KeyError: 'date_from'. Is that because the column is called delivery_date for regelarbeit?
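One way to avoid this kind of KeyError in the logging path is to look up the date column by name instead of hard-coding it. The sketch below is an assumption about how that could look, not the fix used in the PR: `pick_date_column` is a hypothetical helper, and the candidate names `date_from` and `delivery_date` are taken only from the error above.

```python
def pick_date_column(columns, candidates=("date_from", "delivery_date")):
    """Return the first candidate date column present in `columns`.

    `columns` is any iterable of column names (for example the column
    index of a DataFrame). Raises a KeyError naming all candidates if
    none of them is present.
    """
    available = set(columns)
    for name in candidates:
        if name in available:
            return name
    raise KeyError(
        f"none of the date columns {candidates} found in {sorted(available)}"
    )
```

With this, the log call would become something like `log.info(new_data[pick_date_column(new_data.columns)])`, assuming `new_data` exposes its column names via `.columns`.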

st3113n commented 2 months ago

> @st3113n this looks quite ready to merge - could you fix the open discussions please? :)

Yeah, I will do that, but I also changed the way the last crawled dates are stored, and that still needs to be tested. I won't have time for that until Friday, so I will deal with it then, if that is fine? :)

maurerle commented 2 months ago

Sure :)

st3113n commented 2 months ago

I have now implemented all the changes and am going to crawl the tables one after another.