Closed Robsteranium closed 1 year ago
ook.etl is choking on the latest data from beta because it includes URIs like:
http://gss-data.org.uk/data/climate-change/beis-2020-uk-greenhouse-gas-emissions-final-figures-dataset-of-emissions-by-source/2020-uk-greenhouse-gas-emissions-final-figures-dataset-of-emissions-by-source.csv#obs/CH4,CH4,3B14,2001,agriculture,wastes,horses-wastes,managed-manure,other-emissions,other-emissions,agricultural-horses@emissions-ar4-gwps
I'm not exactly sure what's causing it, but I suspect the commas (,) and/or at-signs (@) are to blame.
The naive string munging in ook.etl/insert-values-clause yields broken queries like:
VALUES ?observation { <"http://gss-data.org.uk/data/climate-change/beis-2020-uk-greenhouse-gas-emissions-final-figures-dataset-of-emissions-by-source/2020-uk-greenhouse-gas-emissions-final-figures-dataset-of-emissions-by-source.csv#obs/C2F6,PFCs,2B9b3,1990,industrial-processes,not-applicable,halocarbon-production,halocarbons-production-fugitive,other-emissions,other-emissions,non-fuel-combustion@emissions-ar4-gwps"> <"http://gss-data.org.uk/data/climate-change/beis-2020-uk-greenhouse-gas-emissions-final-figures-dataset-of-emissions-by-source/2020-uk-greenhouse-gas-emissions-final-figures-dataset-of-emissions-by-source.csv#obs/C2F6,PFCs,2B9b3,1991,industrial-processes,not-applicable,halocarbon-production,halocarbons-production-fugitive,other-emissions,other-emissions,non-fuel-combustion@emissions-ar4-gwps"> ... }
NB: the URI is wrapped in double-quotes inside the angle brackets, which is not a valid IRI in SPARQL.
We need to revise the approach. It's hopefully trivial to fix but if not we might want to reach for a proper library - I'll bet @andrewmcveigh's sparqler is a bit more robust!
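As a rough illustration of a less naive approach, here is a minimal sketch in Python. It is not ook's actual code, and `iri_term` and `values_clause` are hypothetical names. The SPARQL grammar's IRIREF rule allows commas and at-signs between the angle brackets but forbids double-quotes (along with spaces, control characters and `< > { } | ^ \` and the backtick), so a builder can check each URI before interpolating it and fail loudly rather than emit a broken query.

```python
import re

# Characters the SPARQL IRIREF production forbids between < and >:
# U+0000-U+0020 (controls and space) plus < > " { } | ^ ` \
_FORBIDDEN = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def iri_term(uri):
    """Wrap a URI in angle brackets, rejecting characters IRIREF disallows."""
    match = _FORBIDDEN.search(uri)
    if match:
        raise ValueError(
            "illegal character %r in IRI: %r" % (match.group(), uri)
        )
    return "<" + uri + ">"


def values_clause(var, uris):
    """Build `VALUES ?var { <iri> ... }` from bare URI strings."""
    terms = " ".join(iri_term(u) for u in uris)
    return "VALUES ?" + var + " { " + terms + " }"
```

With this check a URI containing commas and an at-sign passes through unchanged, while one that still carries its surrounding double-quotes raises a `ValueError` instead of reaching the triplestore.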
It wasn't actually string interpolation but parsing SPARQL results. @callum-oakley has resolved this in #124.
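The thread doesn't record how the results were being parsed, so the following is only a guess at the kind of failure involved, not a description of the fix in #124. Assuming the results arrive in the SPARQL CSV results format, any value containing a comma is wrapped in double-quotes, and reading the body line by line keeps those quotes as part of the URI. A minimal Python sketch with a made-up response body:

```python
import csv
import io

# Hypothetical SPARQL CSV response: the URI contains commas, so the
# serializer wraps the field in double-quotes.
body = (
    "observation\r\n"
    '"http://example.org/data.csv#obs/CH4,2001@emissions-ar4-gwps"\r\n'
)

# Naive handling: take the line as-is, so the protective quotes survive.
naive = body.splitlines()[1]

# A real CSV parser removes the quoting and returns the bare URI.
rows = list(csv.reader(io.StringIO(body, newline="")))
parsed = rows[1][0]
```

Here `naive` still begins and ends with a double-quote, which would produce `<"...">` once wrapped in angle brackets, whereas `parsed` is the bare URI.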