iati-data-access / iati-flattener

Library to flatten IATI data
GNU Affero General Public License v3.0

Out-of-date data that has been unpublished may remain accessible for download if it was the only data for a region/country #18

Closed simon-20 closed 2 months ago

simon-20 commented 10 months ago

When this library generates the CSV files for the regions/countries, it generates them for all regions/countries, even when there is no data, so we get blank CSV files. But when the XLSX files are generated, they are only generated for regions/countries for which there is currently published data. And it seems that the XLSX files from the previous day are not deleted before they are all regenerated. This doesn't matter in the vast majority of cases, because the regeneration overwrites the file anyway.

But if all of the data for a region or country is deleted/unpublished (which is realistically possible for those regions and countries where there is only data from a single activity), then the existing XLSX file would be neither deleted nor regenerated, so we would end up with out-of-date data still being accessible.

Task: check for certain that this could happen, and fix.

The code that writes the Excel files to disk is in group_data.py, in group_results().

Relevant to think about: is it best to output an Excel file for every region/country regardless, meaning that we end up with some empty Excel files, or is it best to avoid outputting empty Excel files?
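
If we go with the second option (no empty Excel files), one possible fix is to delete any XLSX file that the current run did not write. The following is only a rough sketch, not the actual code in group_results(): the function name remove_stale_xlsx, the flat output directory, and the assumption that each file is named `<code>.xlsx` after its region/country code are all hypothetical and would need checking against how the library really names and places its output.

```python
from pathlib import Path


def remove_stale_xlsx(output_dir, codes_written):
    """Delete XLSX files in output_dir that the current run did not write.

    Assumption (hypothetical): each file is named "<code>.xlsx", where
    <code> is the region/country code. Returns the removed file names.
    """
    written = set(codes_written)
    removed = []
    # Materialise the listing first so we are not deleting files while
    # the directory is still being iterated.
    for path in list(Path(output_dir).glob("*.xlsx")):
        if path.stem not in written:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

This would be called once after the regeneration has finished, passing in the codes for which an XLSX file was written in that run. Cleaning up afterwards, rather than deleting everything before regenerating, means the existing files stay downloadable while the new ones are being produced.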