Going in, I assumed that Socrata blobs are always zipfiles, provided as a link of the form https://data.cityofnewyork.us/download/ft4n-yqee/application%2Fzip (the distinguishing feature being the last part: application%2Fzip).
This assumption is incorrect: see, for example, this dataset, whose download link ends in %2Fvnd.ms-excel (an XLSX file once downloaded).
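To make the point concrete, here is a minimal sketch of reading the file type off a download link instead of assuming a zipfile. It assumes the last path segment of the link is a URL-encoded MIME type, as in the example above. The function names are hypothetical and not part of socrata_reducer, and the second URL in the usage note is made up for illustration.

```python
from urllib.parse import unquote, urlparse


def blob_mime_type(download_url):
    """Return the MIME type encoded in the last path segment of a download link.

    Assumption: the link has the shape .../download/<dataset-id>/<encoded MIME type>.
    """
    last_segment = urlparse(download_url).path.rsplit("/", 1)[-1]
    # "application%2Fzip" decodes to "application/zip"
    return unquote(last_segment)


def is_zip_blob(download_url):
    """True only when the link actually advertises a zipfile."""
    return blob_mime_type(download_url) == "application/zip"
```

With this, blob_mime_type("https://data.cityofnewyork.us/download/ft4n-yqee/application%2Fzip") gives "application/zip", while a hypothetical link ending in application%2Fvnd.ms-excel gives "application/vnd.ms-excel" and is_zip_blob returns False for it.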
The dataset writer (socrata_reducer.write_dataset_representation) is capable of dealing with these filetypes, but it is given the wrong link by the resource writer (socrata_reducer.write_resource_representation). The latter ought to provide a listing of the format:
Instead it provides it as a listing of the format:
Fixing this requires examining the endpoint page directly to extract the download URL. This is something that needs to be done for links anyway, at which point it can be extended to blobs as well.
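A rough sketch of what examining the endpoint page could look like, using only the standard library. It rests on a loud assumption: that the download link appears in the page's static HTML as an anchor whose href contains /download/. The real page may be laid out differently or rendered by JavaScript, in which case this would find nothing. The class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DownloadLinkParser(HTMLParser):
    """Collect href values of <a> tags that look like download links."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: download links contain "/download/" in their path.
        if href and "/download/" in href:
            self.hrefs.append(href)


def extract_download_urls(page_html, base_url):
    """Return absolute download URLs found in the given page HTML."""
    parser = DownloadLinkParser()
    parser.feed(page_html)
    # Resolve relative hrefs against the page's own URL.
    return [urljoin(base_url, href) for href in parser.hrefs]
```

The page HTML would be fetched separately (for example with urllib.request) and passed in as a string. Since the approach only looks at the link itself, the same function would cover both links and blobs, whatever file type the blob turns out to be.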