jonphipps / Metadata-Registry

http://metadataregistry.org
GNU Affero General Public License v3.0

Import truncated at the first embedded apostrophe or double-quote #20

Open jonphipps opened 8 years ago

jonphipps commented 8 years ago

It looks as though, at some point, the import of Toolkit definitions was truncated at the first embedded apostrophe (single quote?) or double quote, even though the double quotes were URL-encoded. The French translators found three or four instances, so the problem is rare. The February Toolkit upload will test whether the truncation actually happens during the import stage.

jonphipps commented 8 years ago

Single quotes (' U+0027) should be OK, but should still be converted to the correct Unicode characters for left and right single quotation marks (‘ U+2018, ’ U+2019) unless used as a 'foot' mark. Double quotes (" U+0022) need to be either escaped with a backslash (\") rather than encoded, for use as 'inch' marks, or converted to the correct Unicode characters for left and right double quotation marks (“ U+201C, ” U+201D).
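
A rough sketch of the conversion described above, in Python for illustration only (the function name `normalize_quotes` and the heuristics are assumptions, not the registry's actual import code). It treats a quote directly after a digit as a foot or inch mark, a quote at the start of the text or after whitespace or an opening bracket as an opening quote, and anything else as a closing quote or apostrophe.

```python
def normalize_quotes(text: str) -> str:
    """Convert straight quotes to curly quotes, keeping foot and inch marks.

    Hypothetical helper: the rules below are simple heuristics and will
    misjudge some edge cases (for example, quotes that are already escaped).
    """
    out = []
    for i, ch in enumerate(text):
        if ch not in ("'", '"'):
            out.append(ch)
            continue
        prev = text[i - 1] if i > 0 else ""
        if prev.isdigit():
            # Measurement mark: keep the foot mark, backslash-escape the inch mark.
            out.append("'" if ch == "'" else '\\"')
        elif prev.isspace() or prev in ("", "(", "[", "{"):
            # Opening quote: U+2018 or U+201C.
            out.append("\u2018" if ch == "'" else "\u201c")
        else:
            # Closing quote or apostrophe: U+2019 or U+201D.
            out.append("\u2019" if ch == "'" else "\u201d")
    return "".join(out)
```

With these rules an embedded apostrophe such as the one in `l'enfant` becomes U+2019, while `6' 2"` keeps its foot mark and gets a backslash before the inch mark.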

The OMR shouldn't fail silently, as it so often does, when it encounters a row containing a problem like this, but rather deliver a helpful error message.
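
As an illustration of what a helpful error message could look like, here is a minimal sketch that scans CSV input before import and reports each suspect value by row and column. It is in Python, and everything in it is an assumption for the sake of the example: the function name `find_quote_problems`, the CSV layout with a header row, and the choice to flag straight double quotes and URL-encoded quotes (`%22`, `%27`).

```python
import csv
import io
import re

# Assumed set of suspect sequences: a straight double quote or a URL-encoded quote.
SUSPECT = re.compile(r'"|%22|%27')


def find_quote_problems(csv_text: str) -> list[str]:
    """Return one readable message per field that contains a suspect quote."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        for index, value in enumerate(row):
            match = SUSPECT.search(value)
            if match is None:
                continue
            column = header[index] if index < len(header) else f"#{index + 1}"
            problems.append(
                f"row {row_number}, column {column!r}: "
                f"found {match.group()!r} at offset {match.start()}"
            )
    return problems
```

An importer could refuse the upload and show these messages when the list is not empty, rather than silently truncating the value.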