Leading and trailing whitespace exists in RDF/XML values as output by cdm-2-rdf.xsl.
Examples include values for the following properties:
dct:description
dpla:providedLabel
One significant reason for this is that, in order to split all LC headings, the XSL transform currently tokenizes on ';'. Previously it tokenized on '; ', but this missed headings that were separated by only a semicolon, by a semicolon followed by a line break, etc. Tokenizing on the bare semicolon leaves whatever whitespace followed it attached to the start of the next value.
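To make the trade-off concrete, here is a small Python sketch of the two tokenizing strategies. It is an illustration only, not the XSL itself; the sample heading string is made up.

```python
# Hypothetical sample: headings separated by "; ", by ";" alone,
# and by ";" followed by a line break.
heading = "Agriculture; Farms;Barns;\nSilos"

# Old behaviour: tokenize on "; " (semicolon plus space).
# Headings joined by a bare ";" or by ";\n" are not split apart.
old_tokens = heading.split("; ")

# Current behaviour: tokenize on ";" alone.
# Every heading is split, but the whitespace that followed the
# semicolon stays attached to the start of the next token.
new_tokens = heading.split(";")

print(old_tokens)
print(new_tokens)
```

The first split leaves three headings fused into one value; the second separates all four but carries a leading space or line break into some of them.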
To fix
It seems that this could be easily fixed by using OpenRefine to trim leading and trailing whitespace from text values.
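As a rough sketch of what that cleanup amounts to, the Python below splits on ';' and then trims each token. The function name `split_headings` and the sample input are hypothetical; in OpenRefine itself the equivalent step would be the built-in trim transform (GREL `value.trim()`) applied to the affected columns.

```python
def split_headings(raw: str) -> list[str]:
    """Split a semicolon-delimited heading string and trim each value.

    Hypothetical helper for illustration; not part of cdm-2-rdf.xsl.
    """
    # Split on the bare semicolon so no separator variant is missed,
    # then strip leading/trailing whitespace (spaces, tabs, line breaks).
    tokens = (token.strip() for token in raw.split(";"))
    # Drop empty tokens, e.g. those produced by a trailing semicolon.
    return [token for token in tokens if token]


cleaned = split_headings("Agriculture; Farms;Barns;\nSilos; ")
print(cleaned)
```

Dropping empty tokens is an extra assumption here: it avoids emitting blank values when a field ends with a semicolon.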