rosettatype / hyperglot

Hyperglot: a database and tools for detecting language support in fonts
http://hyperglot.rosettatype.com
GNU General Public License v3.0

Resolve use of apostrophe, single quote, and other similar letter-likes #47

Closed kontur closed 2 years ago

kontur commented 3 years ago

I think one issue throughout the data is that different sources and data inputs have handled orthographies with an "apostrophe"-looking character inconsistently.

The data might have:

...and probably more (even combining marks).

While there might be proper canonical information for some orthographies, it seems to me that the choice of character is in most cases arbitrary, depending on the source and on how the data was entered, and should probably be canonicalized or disambiguated in some way. E.g. we might want to unify how such characters are entered in the data, and we might want to treat them as equivalent so that any of several "possible alternatives" satisfies a language check.
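To make both ideas concrete, here is a minimal sketch in Python. It is not Hyperglot's API: the names `APOSTROPHE_LIKE`, `alternatives`, `supports` and `canonicalize` are hypothetical, and the set of code points is only an illustrative guess at which characters might be grouped together, not an audited list from the data.

```python
# Hypothetical equivalence class of apostrophe-like characters.
# Illustrative only; the real set would have to be decided per orthography.
APOSTROPHE_LIKE = frozenset({
    "\u0027",  # APOSTROPHE
    "\u2019",  # RIGHT SINGLE QUOTATION MARK
    "\u02BC",  # MODIFIER LETTER APOSTROPHE
    "\u02B9",  # MODIFIER LETTER PRIME
    "\u2032",  # PRIME
})


def alternatives(char):
    """Return every character that may stand in for `char`."""
    return APOSTROPHE_LIKE if char in APOSTROPHE_LIKE else frozenset({char})


def supports(required, font_chars):
    """True if each required character, or one of its alternatives, is in the font."""
    font = set(font_chars)
    return all(alternatives(c) & font for c in required)


def canonicalize(text, canonical="\u02BC"):
    """Replace any apostrophe-like character with one canonical choice."""
    return "".join(canonical if c in APOSTROPHE_LIKE else c for c in text)
```

With this sketch, `supports("a\u02BCb", "ab'")` passes because the font's U+0027 is accepted in place of U+02BC, while `canonicalize` covers the "unify the input" option. Whether a single class like this is appropriate for every orthography is an assumption of the sketch.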

It is further questionable whether those should be treated as characters, whether they would make a good case for required punctuation, or what their role actually is.

Ping @MrBrezina

meehkal commented 3 years ago

> It is further questionable whether those should be treated as characters, whether they would make a good case for required punctuation, or what their role actually is.

This depends on the language. In Kildin Saami (single apostrophe) or Nenets (double apostrophe) they should be regarded as letter characters, because they have sound equivalents.

kontur commented 2 years ago

Closing this in favor of #82.