IATI / refresher

A Python application that tracks IATI data from around the Web and refreshes the core IATI software's data stores.
GNU Affero General Public License v3.0

log download error for undetectable charset, rather than downloading #72

Closed nosvalds closed 2 years ago

nosvalds commented 2 years ago


Previously, if we couldn't detect a charset for a file, we would still download it into the Unified Platform, which left files such as PDFs in the platform. Now, if charset detection for a file returns 'None', we log a download error in the Unified Platform DB and the file is excluded from further checks (e.g. validation).
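
For reviewers, the new behaviour can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `detect_charset`, `DownloadResult`, `handle_download`, and the `CHARSET_UNDETECTABLE` marker are all hypothetical names, and the stand-in detector only checks for a BOM or valid UTF-8 rather than using whatever detection library the refresher actually relies on.

```python
import codecs
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical marker for the download error recorded in the DB.
CHARSET_UNDETECTABLE = "charset-undetectable"


def detect_charset(payload: bytes) -> Optional[str]:
    """Stand-in detector (assumption): return an encoding name, or None."""
    if payload.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if payload.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        # Binary content such as a PDF usually ends up here.
        return None
    return "utf-8"


@dataclass
class DownloadResult:
    """Hypothetical record of what happened to one registry document."""
    document_id: str
    stored: bool
    charset: Optional[str] = None
    download_error: Optional[str] = None


def handle_download(
    document_id: str,
    payload: bytes,
    detector: Callable[[bytes], Optional[str]] = detect_charset,
) -> DownloadResult:
    """Store the file only when a charset was detected; otherwise log an error."""
    charset = detector(payload)
    if charset is None:
        # New behaviour: record a download error and skip the file, so it is
        # never passed on to later checks such as validation.
        return DownloadResult(
            document_id, stored=False, download_error=CHARSET_UNDETECTABLE
        )
    return DownloadResult(document_id, stored=True, charset=charset)


pdf_result = handle_download("doc-pdf", b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n")
xml_result = handle_download("doc-xml", b'<?xml version="1.0"?><iati-activities/>')
```

In this sketch the PDF-like payload is rejected with a logged error, while the XML payload is stored with its detected charset. The detector is passed in as a parameter so the real detection call could be swapped in without changing the decision logic.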

Note that we are separately pursuing an enhancement to the Registry that would allow only XML files to be included in the Registry.