soilwise-he / harvesters

harvest from esdac repository #10

Closed by pvgenuchten 1 month ago

pvgenuchten commented 5 months ago

ESDAC is a Drupal CMS instance; it doesn't (afaik) offer an API endpoint to query the dataset registrations. It does provide a basic listing of title, abstract and year, though.

To harvest the ESDAC datasets, I used a combination of the above xsl and webpage scraping. This was a manual exercise. It would be better to negotiate with JRC to see if they can extend the xsl listing or provide alternative harvesting options (such as direct database access).
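
For illustration only, the listing-plus-scraping combination could be sketched roughly as below. This is not the script that was actually used: the names `DatasetPageParser` and `enrich_record` are hypothetical, and it assumes each dataset page exposes a `<title>` and `<meta name="description">` / `<meta name="keywords">` tags, which has not been verified against ESDAC. Fetching the pages is left out on purpose.

```python
from html.parser import HTMLParser


class DatasetPageParser(HTMLParser):
    """Collects the <title> text and <meta name=...> values from one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            # Assumption: the page carries description/keywords as meta tags.
            self.meta[attrs["name"].lower()] = attrs["content"].strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def enrich_record(listing_row, page_html):
    """Merge one listing row (title, abstract, year) with fields scraped from its page.

    Values from the listing win; the scraped page only fills gaps.
    """
    parser = DatasetPageParser()
    parser.feed(page_html)
    parser.close()

    record = dict(listing_row)  # copy, so the listing row is left untouched
    if not record.get("title"):
        record["title"] = parser.title.strip()
    if not record.get("abstract"):
        record["abstract"] = parser.meta.get("description", "")
    record["keywords"] = [
        k.strip() for k in parser.meta.get("keywords", "").split(",") if k.strip()
    ]
    return record
```

Each row of the listing would be passed in together with the HTML of its dataset page; anything the page does not expose still has to be filled in by hand.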

pvgenuchten commented 3 months ago

A set has been derived based on HTML scraping; it needs improvement (query the Drupal DB?).

pvgenuchten commented 1 month ago

duplicates #5