WikiNewZealand / fundamental-figures

Dataset download links return 404 if dataset has been superseded #18

Closed Spksh closed 5 years ago

Spksh commented 5 years ago

Dataset download links (i.e. the /download endpoint for each dataset) return a 404 if the dataset has been superseded. The dataset link (i.e. the URL without /download at the end) returns the expected 30x redirect.

There's an open issue with Figure.NZ to return a 30x redirect for the /download links, but that doesn't help us right now.

We need to treat a 404 as a potential redirect: strip the /download off the URL and check for a 30x response.
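For illustration, the workaround described above could be sketched roughly like this in Python. This is not the project's actual code: `resolve_download_url`, the `fetch` callable and the set of redirect codes are all assumptions made for the example. `fetch` stands in for whatever HTTP client is used, must not follow redirects on its own, and returns the status code plus the `Location` header (if any).

```python
from typing import Callable, Optional, Tuple

# Hypothetical fetcher type: takes a URL, returns (status code, Location header or None).
# It must NOT follow redirects automatically, or the 30x would never be seen.
Fetch = Callable[[str], Tuple[int, Optional[str]]]

DOWNLOAD_SUFFIX = "/download"
REDIRECT_CODES = {301, 302, 303, 307, 308}  # assumption: what "30x" covers here


def resolve_download_url(download_url: str, fetch: Fetch) -> Optional[str]:
    """Return the /download URL to try, or None if the dataset can't be resolved."""
    status, _ = fetch(download_url)
    if status == 200:
        return download_url  # dataset is current, nothing to do
    if status != 404 or not download_url.endswith(DOWNLOAD_SUFFIX):
        return None  # some other failure; not the superseded-dataset case

    # Treat the 404 as a potential redirect: check the dataset URL itself.
    dataset_url = download_url[: -len(DOWNLOAD_SUFFIX)]
    status, location = fetch(dataset_url)
    if status in REDIRECT_CODES and location:
        # Point at the /download endpoint of the superseding dataset.
        return location.rstrip("/") + DOWNLOAD_SUFFIX
    return None
```

The sketch follows a single redirect only and does not check that the new /download URL actually responds; a chain of superseded datasets would need a loop with a hop limit.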

Spksh commented 5 years ago

We've got 4 or 5 datasets doing this now.

Spksh commented 5 years ago

We now follow redirects to new datasets, but we won't download the new dataset if the file name has not changed.

@natdudley We'll need to run through a few examples to check what should be happening. I may be imagining things, but I think at least a few datasets have the same file name.
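To make the concern concrete, here is a small Python sketch of the two ways "has this dataset changed?" could be decided. It is only an illustration: `needs_download_by_name`, `needs_download_by_id` and `table_id` are hypothetical names, and comparing table IDs is just one possible alternative, not something the project is known to do.

```python
from urllib.parse import urlparse


def table_id(dataset_url: str) -> str:
    """Last path segment of a dataset URL, e.g. 'SMdFo3ocObcD7W6Q'."""
    return urlparse(dataset_url).path.rstrip("/").split("/")[-1]


def needs_download_by_name(old_file_name: str, new_file_name: str) -> bool:
    # The behaviour described above: skip the download when the name is unchanged.
    return old_file_name != new_file_name


def needs_download_by_id(old_dataset_url: str, new_dataset_url: str) -> bool:
    # Hypothetical alternative: a redirect to a different table ID always
    # counts as a new dataset, even if the file name stays the same.
    return table_id(old_dataset_url) != table_id(new_dataset_url)
```

If a superseding dataset kept the same file name, the name check would skip it while the ID check would still pick it up.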

Response for 'https://figure.nz/table/LULoKvlIlvkpKyLa/download' failed with 'NotFound: NOT FOUND', checking dataset for 302 Redirect
Response for 'https://figure.nz/table/LULoKvlIlvkpKyLa' succeeded with 'Redirect: FOUND https://figure.nz/table/SMdFo3ocObcD7W6Q'
Found 'Benefits_People_receiving_benefits_by_main_type_and_territorial_authority_2019_Q2.csv'
Response for 'https://figure.nz/table/a5X7TSwPRjKASUSv/download' failed with 'NotFound: NOT FOUND', checking dataset for 302 Redirect
Response for 'https://figure.nz/table/a5X7TSwPRjKASUSv' succeeded with 'Redirect: FOUND https://figure.nz/table/Plydk0XvoJ502h8w'
Found 'Benefits_People_receiving_benefits_by_characteristic_and_territorial_authority_2019_Q2.csv'
Response for 'https://figure.nz/table/mJfUOl7H7J6IlLUe/download' failed with 'NotFound: NOT FOUND', checking dataset for 302 Redirect
Response for 'https://figure.nz/table/mJfUOl7H7J6IlLUe' succeeded with 'Redirect: FOUND https://figure.nz/table/fK7iZAOJhHHirYPk'
Found 'Dogs_Dog_control_statistics_20012019.csv'

natdudley commented 5 years ago

Pretty sure one number would have changed but most of the title would be the same, so let's close for now and reopen if it causes issues.