energy-data / energydata.info

energydata.info - open data and analytics for a sustainable energy future
http://energydata.info
MIT License

the resource ids on the energy platform are changing #291

Closed: hy347mc closed this issue 5 years ago

hy347mc commented 5 years ago

On the topic of resources, DDH noticed another issue: the resource ids on the energy platform appear to be changing. This is a problem because, when updating datasets, DDH's harvester uses the resource ids to distinguish new resources from old ones. If the harvester is updating an outdated DDH dataset, and the outdated dataset’s resource ids aren’t present in the newer EEX dataset version, then the harvester will delete the resources on the outdated DDH dataset and replace them with the resources from the EEX dataset. I wouldn’t bring this up if only a handful of resources were being replaced on updates, but I noticed that in the Pakistan – Biomass Mapping and Pakistan – Wind Measurement Data datasets all the resource ids were different, so every resource was replaced.
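
To make the failure mode concrete, here is a minimal sketch of id-based classification as described above. It is an illustration under assumptions, not DDH's actual harvester code: the function name `classify_resources` and the example ids are hypothetical.

```python
def classify_resources(old_ids, new_ids):
    """Classify resources by id, the way an id-based harvester would.

    old_ids: resource ids on the outdated (DDH) dataset
    new_ids: resource ids on the newer (EEX) dataset version
    Hypothetical helper for illustration only.
    """
    old, new = set(old_ids), set(new_ids)
    return {
        "keep": sorted(old & new),    # ids present on both sides: treated as the same resource
        "delete": sorted(old - new),  # ids missing from the new version: removed
        "add": sorted(new - old),     # ids unseen before: created as new resources
    }


# If the platform reissues every id, nothing is kept: all old resources
# are deleted and all new ones are added, even when the files are unchanged.
result = classify_resources(["r1", "r2"], ["r9", "r8"])
print(result)
```

With stable ids the `keep` bucket holds most resources; when every id changes, the whole dataset is deleted and recreated on each update.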

On further analysis, DDH noticed that the new resource ids come from the url_type, which is a datacatalog url. Would you happen to know why this is?

DDH ran a local script to compare all the resources they have harvested from ENERGYDATA.INFO with the resources currently in the portal. Please find the attached spreadsheet for reference.

eex_ddh_inventory_for_derilinx.xlsx
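
For readers who want to reproduce this kind of check, the comparison could be sketched roughly as follows. This is not the script DDH ran. It assumes both inventories are already loaded as plain dictionaries mapping a dataset name to a list of resource ids, and the name `compare_inventories` and the sample data are hypothetical.

```python
def compare_inventories(harvested, portal):
    """Report harvested resource ids that no longer exist in the portal.

    harvested: {dataset_name: [resource_id, ...]} as previously harvested
    portal:    {dataset_name: [resource_id, ...]} as currently published
    Both structures are assumptions made for this sketch.
    """
    report = {}
    for dataset, harvested_ids in harvested.items():
        wanted = set(harvested_ids)
        current = set(portal.get(dataset, []))
        missing = sorted(wanted - current)
        if missing:
            report[dataset] = {
                "missing": missing,
                # True when not a single harvested id survived
                "all_changed": len(missing) == len(wanted),
            }
    return report


# Made-up sample data, for illustration only.
harvested = {"dataset-a": ["id1", "id2"], "dataset-b": ["id3", "id4"]}
portal = {"dataset-a": ["id7", "id8"], "dataset-b": ["id3", "id4"]}
report = compare_inventories(harvested, portal)
print(report)
```

Datasets flagged with `all_changed` are the ones where an id-based harvester would replace every resource.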

ss-bhat commented 5 years ago

@hy347mc

Yeah, those are from the DDH harvester: 36 datasets and their corresponding resources. We have fixed most of the resources and are validating whether any of them are still broken.

jodiegardiner commented 5 years ago

We have fixed this issue. Previously, we simply ran every dataset that came back on the sources list from DDH through the harvester, but it appears ~36 datasets should not have been on that list. I will send an email to Alp (and cc stakeholders) with this information so he can make the change at the DDH end. At our end, we now validate more effectively which datasets originated from EEX and which from DDH.
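
The thread does not say how this validation is implemented. Purely as an illustration, a check of this kind might partition the incoming sources list by origin before harvesting. The `origin` field, its values, and the function name below are all assumptions made for this sketch.

```python
def split_by_origin(datasets):
    """Separate datasets that should be harvested from those that should not.

    datasets: list of dicts with hypothetical "name" and "origin" keys.
    Datasets that originated on EEX are skipped, so they are not
    re-imported and their resources are not recreated with new ids.
    """
    to_harvest, skipped = [], []
    for ds in datasets:
        if ds.get("origin") == "DDH":
            to_harvest.append(ds["name"])
        else:
            # Unknown or EEX origin: leave the dataset untouched.
            skipped.append(ds["name"])
    return to_harvest, skipped


# Made-up sources list, for illustration only.
sources = [
    {"name": "dataset-a", "origin": "DDH"},
    {"name": "dataset-b", "origin": "EEX"},
    {"name": "dataset-c"},
]
to_harvest, skipped = split_by_origin(sources)
print(to_harvest, skipped)
```

Treating a missing origin as "skip" is the conservative choice here, since harvesting a dataset by mistake is what replaced the resource ids in the first place.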