creativecommons / cc-legal-tools-app

Legal tool (licenses, public domain dedication, etc.) management application for Creative Commons
https://creativecommons.org/licenses/
MIT License

[Feature] Use smart apostrophes and smart quotes in legal code #297

Open Jayman2000 opened 2 years ago

Jayman2000 commented 2 years ago

Problem

According to the Unicode Standard 15.0.0, “2019 is preferred for apostrophe”. The standard also says, “preferred characters in English for paired quotation marks are 2018 & 2019”. U+2019 (’) is sometimes called the “smart apostrophe”, and U+201C (“) and U+201D (”) are sometimes called “smart quotes”. Similarly, U+0027 (') is sometimes called the “ASCII apostrophe”, and U+0022 (") is sometimes called the “ASCII quotation mark”.

The English legal code for the current legal tools is inconsistent. In the licenses, ASCII apostrophes and ASCII quotation marks are almost always used. The only exception is the text above the “Creative Commons … International Public License” heading. The HTML versions of that text use smart apostrophes and smart quotes, but the plain text versions of that text use ASCII apostrophes and ASCII quotation marks.

In CC0, ASCII apostrophes and ASCII quotation marks are always used.
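One way to check a given piece of legal code for this kind of inconsistency is sketched below. It is only an illustration: the function names `count_quote_styles` and `is_consistent` are hypothetical and are not part of cc-legal-tools-app, and the sketch looks only at the six code points discussed in this issue.

```python
from collections import Counter

# Code points discussed in this issue.
ASCII_MARKS = {"'", '"'}                               # U+0027, U+0022
SMART_MARKS = {"\u2018", "\u2019", "\u201c", "\u201d"}  # U+2018, U+2019, U+201C, U+201D


def count_quote_styles(text):
    """Count ASCII and smart quote characters in text (hypothetical helper)."""
    counts = Counter()
    for ch in text:
        if ch in ASCII_MARKS:
            counts["ascii"] += 1
        elif ch in SMART_MARKS:
            counts["smart"] += 1
    return counts


def is_consistent(text):
    """Return True if text does not mix ASCII and smart quote characters."""
    counts = count_quote_styles(text)
    return not (counts["ascii"] and counts["smart"])
```

A text that mixes the two styles, such as `'He said "hi" and it\u2019s fine'`, would be reported as inconsistent, while a text that uses only one style (or no quotes at all) would pass.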

Description

Always use smart apostrophes and smart quotes. This would make the legal code consistent with itself and consistent with The Unicode Standard.

Alternatives

Always use ASCII apostrophes and ASCII quotation marks. This would make the legal code consistent with itself but wouldn’t make the legal code consistent with the Unicode standard. Additionally, ASCII quotation marks are very slightly harder to read than smart quotes since opening ones and closing ones look the same.

Implementation

TimidRobot commented 6 months ago
possumbilities commented 6 months ago

I'll add, for any future work done here, that from a purely technical standpoint the argument for U+2019 may be in line with the Unicode spec (the same goes for the double quotes and the closing single quote), but specs often do not adequately capture the complexity of the world and the languages in it. The argument around smart quotes is neither new nor settled. There are reasonable arguments that adopting U+2019 is to spec but still bad practice, because of unintended consequences in a myriad of instances related and adjacent to this one.

The link above also gets into some trouble you'll encounter with non-English language contexts.

Smart quotes are not handled uniformly by all word processors, and they cause all manner of trouble when text is moved around. They also have to be used as a matched pair (opening and closing) to be correct, and not all software handles that correctly. Doing it right adds a lot of complexity. U+0027, by contrast, is the same character at the start and end of an enclosed span (the same goes for double quotes).
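To make the matched-pair problem concrete, here is a deliberately naive sketch of a straight-to-smart converter. The function name `smarten` and its heuristic (a quote is "opening" if it starts the text or follows whitespace or an opening bracket) are assumptions made for illustration; this is not how any particular word processor or this project does it.

```python
import re


def smarten(text):
    """Naively replace ASCII quotes with smart quotes (illustrative only)."""
    # A quote at the start of the text, or after whitespace or an opening
    # bracket, is treated as an opening quote.
    text = re.sub(r'(^|[\s(\[{])"', "\\g<1>\u201c", text)
    # Every remaining double quote is treated as a closing quote.
    text = text.replace('"', "\u201d")
    # Same heuristic for single quotes and apostrophes.
    text = re.sub(r"(^|[\s(\[{])'", "\\g<1>\u2018", text)
    text = text.replace("'", "\u2019")
    return text
```

The heuristic handles `"Licensor" means` and `Licensor's rights` as expected, but it turns the apostrophe in `the '90s` into an opening quote (U+2018) rather than U+2019, which is the kind of edge case that makes doing this correctly more complex than it first appears.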

There have also been arguments that, since quotes are such a core part of the content of the material in question, backwards compatibility should be paramount. All of ASCII is contained within Unicode, so any ASCII character works in an environment that implements modern Unicode. But because Unicode is so vast, there is always a chance that a legacy system supports only ASCII, in which case characters outside that range may fail to display or be processed correctly.
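The compatibility concern can be illustrated with a short sketch that treats Python's `ascii` codec as a stand-in for an ASCII-only legacy system. The helper names `is_ascii_safe` and `to_ascii_quotes` are hypothetical, and real legacy systems may fail in other ways than shown here.

```python
# Map each smart quote to its ASCII counterpart.
SMART_TO_ASCII = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / apostrophe
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def is_ascii_safe(text):
    """Return True if text can be encoded as pure ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def to_ascii_quotes(text):
    """Replace smart quotes with their ASCII counterparts."""
    return text.translate(SMART_TO_ASCII)
```

With these helpers, `is_ascii_safe("Licensor\u2019s")` is false, while the same check on `to_ascii_quotes("Licensor\u2019s")` is true. Encoding the smart-quote version with `errors="replace"` substitutes a `?` for the apostrophe, which is one way such a character can be silently lost.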

This is likely one of those instances where going against the spec produces the most reliable and compatible results.

This is very much an old conflict among technical specs, technical compatibility, human symbols/languages, typographic symbols, grammatical standards, writing "style" standards, accessibility standards, and overall UX standards, and of trying to make all of those coexist when they are often shaped by very different groups and forces.

My opinion: Technical standards are a factor, but not always the deciding factor, especially when considering the larger context of use, reuse, and composition. Whatever we do here, I hope we can strike the right balance, and that whatever we end up doing is contextually relevant and documented somewhere.