Open dc-almeida opened 4 days ago
First question: is the performance difference between an xlsx spreadsheet versus the yaml files noticeable?
Second question: if Eurostat again changes the file (hosted on the same url), we will only be notified that the hashes don't match, without any guidance on the actual change. Having the xlsx-to-yaml utility can be a useful way to find the change and/or check whether it is relevant for us.
> First question: is the performance difference between an xlsx spreadsheet versus the yaml files noticeable?
Technically, yaml should be faster to read than xlsx. In practice, however, reading data is orders of magnitude slower, so the difference does not matter.
> Second question: if Eurostat again changes the file (hosted on the same url), we will only be notified that the hashes don't match, without any guidance on the actual change. Having the xlsx-to-yaml utility can be a useful way to find the change and/or check whether it is relevant for us.
Ok, if Eurostat does not provide a changelog then I can see the value of comparing yaml files. We should still be careful though that the Excel file and the yaml files match. Otherwise we might get a wrong warning.
> Ok, if Eurostat does not provide a changelog then I can see the value of comparing yaml files. We should still be careful though that the Excel file and the yaml files match. Otherwise we might get a wrong warning.
YAML files are also comparable in GitHub changelogs, whereas XLSX files are not, so it allows tracking changes to the regions.
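To illustrate why the text format helps, here is a minimal sketch of a line-based diff between two YAML versions, using only the standard library's `difflib`. The function name `yaml_diff`, the file labels, and the `XX01`/`XX02` entries are made up for illustration; they are not taken from this PR or from the real NUTS file.

```python
import difflib


def yaml_diff(old_text: str, new_text: str) -> list[str]:
    """Return a unified diff between two YAML documents given as text.

    Hypothetical helper: the file labels below are placeholders only.
    """
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="nuts_old.yaml",
            tofile="nuts_new.yaml",
            lineterm="",
        )
    )


# Illustrative content only, not real region codes.
old = "XX01: Region A\nXX02: Region B\n"
new = "XX01: Region A\nXX02: Region B renamed\n"

for line in yaml_diff(old, new):
    print(line)
```

A changed region shows up as one removed and one added line, which is the same kind of view GitHub gives for a YAML file in a PR diff. An XLSX file is a zipped binary, so the same comparison on its raw bytes would not be readable.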
Closes #38

- Adds a weekly GH action to compare the hash of a local NUTS file with the EUROSTAT website version
- Adds an XLSX to YAML utility function for use when updating the local file to the latest version
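For readers unfamiliar with the approach, the hash comparison could look roughly like the sketch below. This is not the code from this PR: the choice of SHA-256, the function names, and the `download` helper are assumptions, and the URL is left as a parameter because the actual Eurostat address is not given in this thread.

```python
import hashlib
import urllib.request


def sha256_of_bytes(data: bytes) -> str:
    """Hex digest of the given bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def files_match(local_bytes: bytes, remote_bytes: bytes) -> bool:
    """True if both byte strings have the same digest."""
    return sha256_of_bytes(local_bytes) == sha256_of_bytes(remote_bytes)


def download(url: str) -> bytes:
    """Fetch the remote file; the caller supplies the URL (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def check(local_path: str, url: str) -> bool:
    """Compare a local file with the hosted version (not called here)."""
    with open(local_path, "rb") as handle:
        local_bytes = handle.read()
    return files_match(local_bytes, download(url))
```

A scheduled job would call something like `check(...)` and fail when it returns `False`. As noted in the discussion, a mismatch only says that the file changed, not what changed, which is where the XLSX to YAML conversion comes in.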