piratical / Madeline_2.0_PDE

The Madeline 2.0 Pedigree Drawing Engine (PDE) is a pedigree drawing program designed to handle large and complex pedigrees with an emphasis on readability and aesthetics. The program was designed primarily for human pedigrees.
GNU General Public License v2.0

Typographers' quotes in madeline feedback #34

formerPolaris closed this issue 8 years ago

formerPolaris commented 8 years ago

Hey there,

I develop an internal web app for a genetic diagnostics company that uses your engine to draw pedigrees from our patient data. It's worked fantastically well so far.

However, a recent release changed the feedback Madeline displays (which my server parses to find the file Madeline creates) from standard double quotes to typographers' quotes. If there is no better way to locate the file Madeline creates, is it possible to change this back, or at least to know whether this particular line will stay consistent going forward?

Thanks!

piratical commented 8 years ago

Hi, Vardarac,

I have no plans to reverse this change, so to answer your question: yes, going forward we will continue to use typographical quotes for English-language messages, where Unicode code point U+201C (“) is the opening quotation mark and U+201D (”) is the closing quotation mark. English-language messages are the default for Madeline.

Please note that if we get contributions of localized message catalogs for other languages in the future, other Unicode quotation mark characters could conceivably be used for those languages. For example, in French and many other languages the angled guillemets «…» are commonly used, and in Chinese and Japanese the CJK brackets 「…」 or 『…』 are commonly used.

https://en.wikipedia.org/wiki/Quotation_mark provides a nice table of common conventions for many languages.

To find out what is between the quotation marks in a language-agnostic way, you would want to use the Unicode character properties. Quoting from the above-mentioned Wikipedia article: "In Unicode, 30 characters are marked Quotation Mark=Yes by character property. They all have general category "Punctuation", and a subcategory Open, Close, Initial, Final or Other (Ps, Pe, Pi, Pf, Po)." One can use a Unicode-aware regular expression parser to extract the relevant quoted text.

http://www.regular-expressions.info/unicode.html provides information on how to construct regular expressions using Unicode character properties.
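
As a rough illustration of the idea, here is a minimal Python sketch. It rests on several assumptions. Python's standard `re` module does not support `\p{...}` property classes, so the sketch uses an explicit table of the quote pairs mentioned in this thread instead of a property-based pattern. The names `QUOTE_PAIRS` and `extract_quoted` and the sample message are hypothetical and are not taken from Madeline's actual output. The standard `unicodedata` module is used only to show the general category of each opening mark.

```python
import unicodedata

# Opening -> closing marks for the conventions mentioned in this thread,
# plus the ASCII double quote. This table is an assumption, not a complete
# list of every character Unicode marks as Quotation_Mark=Yes.
QUOTE_PAIRS = {
    '"': '"',            # ASCII double quote (category Po)
    '\u201c': '\u201d',  # “ ” English typographic quotes (Pi, Pf)
    '\u00ab': '\u00bb',  # « » guillemets (Pi, Pf)
    '\u300c': '\u300d',  # 「 」 CJK corner brackets (Ps, Pe)
    '\u300e': '\u300f',  # 『 』 CJK white corner brackets (Ps, Pe)
}


def extract_quoted(message):
    """Return the text found between each recognized pair of quotation marks."""
    results = []
    i = 0
    while i < len(message):
        closer = QUOTE_PAIRS.get(message[i])
        if closer is not None:
            end = message.find(closer, i + 1)
            if end != -1:
                results.append(message[i + 1:end])
                i = end + 1  # resume scanning after the closing mark
                continue
        i += 1
    return results


if __name__ == "__main__":
    # Hypothetical feedback line; the real wording may differ.
    sample = 'Pedigree output file is \u201cexample_ped.xml\u201d'
    print(extract_quoted(sample))  # ['example_ped.xml']

    # General category of each opening mark, for reference.
    for opener in QUOTE_PAIRS:
        print(repr(opener), unicodedata.category(opener))
```

With a regex engine that does support Unicode properties, the explicit table could likely be replaced by a property-based character class, but note that the Ps and Pe categories also cover ordinary parentheses and brackets, so the categories alone are not enough to identify quotation marks.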

Hope this helps!