Spksh closed this 5 years ago
We've got 4 or 5 datasets doing this now.
We now follow redirects to new datasets, but we won't download the new dataset if the file name has not changed.
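A minimal sketch of the file-name rule described above, to make the edge case easier to discuss. The function name `should_download` and its parameters are hypothetical and not taken from the actual codebase; it only assumes we keep the previously downloaded file name around for comparison.

```python
from typing import Optional


def should_download(previous_file_name: Optional[str], found_file_name: str) -> bool:
    """Decide whether the file behind a redirected dataset should be downloaded.

    Hypothetical helper: mirrors the rule "download only if the file name changed".
    """
    # Nothing downloaded before, so there is nothing to compare against.
    if previous_file_name is None:
        return True
    # A redirect to a new dataset that reuses the old file name is skipped.
    return found_file_name != previous_file_name
```

Under this rule, a superseded dataset that keeps exactly the same file name would never be re-downloaded, which is the case worth checking against real examples.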
@natdudley We'll need to run through a few examples to check what should be happening. I may be imagining things, but I think at least a few datasets have the same file name.
Response for 'https://figure.nz/table/LULoKvlIlvkpKyLa/download' failed with 'NotFound: NOT FOUND', checking dataset for 302 Redirect
Response for 'https://figure.nz/table/LULoKvlIlvkpKyLa' succeeded with 'Redirect: FOUND https://figure.nz/table/SMdFo3ocObcD7W6Q'
Found 'Benefits_People_receiving_benefits_by_main_type_and_territorial_authority_2019_Q2.csv'
Response for 'https://figure.nz/table/a5X7TSwPRjKASUSv/download' failed with 'NotFound: NOT FOUND', checking dataset for 302 Redirect
Response for 'https://figure.nz/table/a5X7TSwPRjKASUSv' succeeded with 'Redirect: FOUND https://figure.nz/table/Plydk0XvoJ502h8w'
Found 'Benefits_People_receiving_benefits_by_characteristic_and_territorial_authority_2019_Q2.csv'
Response for 'https://figure.nz/table/mJfUOl7H7J6IlLUe/download' failed with 'NotFound: NOT FOUND', checking dataset for 302 Redirect
Response for 'https://figure.nz/table/mJfUOl7H7J6IlLUe' succeeded with 'Redirect: FOUND https://figure.nz/table/fK7iZAOJhHHirYPk'
Found 'Dogs_Dog_control_statistics_20012019.csv'
Pretty sure at least one number in the file name would have changed, even if most of the title is the same, so let's close this for now and reopen if it causes issues.
Dataset download links (i.e. the /download endpoint for each dataset) return a 404 if the dataset has been superseded. The dataset link (i.e. the URL without /download at the end) returns the expected 30x redirect.
There's an open issue with Figure.NZ to return a 30x redirect for the /download links, but that doesn't help us right now.
We need to treat a 404 as a potential redirect: strip the /download off the URL and check for a 30x response.
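The workaround could look roughly like the sketch below. It is only an illustration under stated assumptions: `resolve_download_url`, the `Fetch` callable, and `max_hops` are hypothetical names, and `fetch` is assumed to return the status code and `Location` header for a URL without following redirects itself.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical fetch signature: URL -> (status code, Location header or None).
# It must NOT follow redirects on its own.
Fetch = Callable[[str], Tuple[int, Optional[str]]]

DOWNLOAD_SUFFIX = "/download"
REDIRECT_CODES = (301, 302, 303, 307, 308)


def resolve_download_url(download_url: str, fetch: Fetch, max_hops: int = 5) -> Optional[str]:
    """Return a working /download URL, following superseded datasets, or None."""
    url = download_url
    for _ in range(max_hops):
        status, _ = fetch(url)
        if status == 200:
            return url
        # Only a 404 on a /download URL is treated as a potential redirect.
        if status != 404 or not url.endswith(DOWNLOAD_SUFFIX):
            return None
        # Strip /download and check the dataset URL for a 30x response.
        dataset_url = url[: -len(DOWNLOAD_SUFFIX)]
        status, location = fetch(dataset_url)
        if status not in REDIRECT_CODES or not location:
            return None  # a genuine 404, not a superseded dataset
        # Location may be relative; resolve it against the dataset URL.
        new_dataset_url = urljoin(dataset_url, location)
        url = new_dataset_url.rstrip("/") + DOWNLOAD_SUFFIX
    return None  # gave up after too many hops
```

Passing `fetch` in as a parameter keeps the redirect logic separate from the HTTP client, so it can be exercised with canned responses. The `max_hops` cap is an added safeguard against redirect loops, not something the issue asks for.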