Open maxheld83 opened 4 years ago
also opens up #251 and makes #240 much easier
I think it'd be really great to get the ESAC registry data in a programmatic way, ideally without scraping, since the data surely must exist in some database.
This would open a bunch of interesting applications for us (see the esac label).
@Henrieke72 and @njahn82, can you comment on how strategically important the ESAC registry data is for our project?
I really want to build on the work that @Henrieke72 has already done with it, and the opportunities to mash up the ESAC data with the rest of hoad (#251) seem quite interesting to me, but I might not have enough context.
Considering that the data is already mostly structured (and even tidy), properly cleaning and exposing it shouldn't be too much work, maybe a day or two. Depending on what ESAC wants to do with their data, we can also wrap it up in a small R package that's separate from hoad, so more people can use it.
so this will be scraped in a separate package
@maxheld83 Unfortunately, there is only the HTML version of the data, which is why I had to copy and paste it into an Excel sheet. As the registry data are very dynamic, maybe there is a way to automatically update the Excel file with the new data?
Thanks @Henrieke72! I'll do that; I'll scrape the data off the website and then offer an Excel export.
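For reference, a minimal sketch of what the scrape-and-export step could look like. It is in Python purely for illustration (the actual package would presumably be written in R) and uses only the standard library. It assumes the registry is served as a plain HTML table, which I have not verified against the live site. The names `TableParser` and `table_to_csv` and the sample markup are hypothetical placeholders, and the export is CSV (which Excel opens) rather than a real `.xlsx` file.

```python
from html.parser import HTMLParser
import csv
import io


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being read, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_csv(html):
    """Return all table rows in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up placeholder markup, NOT real registry content.
SAMPLE_HTML = """
<table>
  <tr><th>Publisher</th><th>Country</th></tr>
  <tr><td>Example Press</td><td>DE</td></tr>
</table>
"""

print(table_to_csv(SAMPLE_HTML))
```

In a real run the HTML would be fetched from the registry page instead of coming from `SAMPLE_HTML`, and the column names would need checking against whatever the site actually serves.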
The comprehensive data in ESAC_Transformative_Agreement_Übersicht_der_Verträge.xlsx has so far been entered by hand from the ESAC website. Perhaps there is a way to scrape this off the website programmatically and/or ask ESAC for the data in structured form.
Not sure how central this is to our mission though.