interscript / interscript-ruby

Interoperable script conversion systems (ISCS) with the `interscript` gem

Introduce concept of "output character set" #633

Open ronaldtse opened 4 years ago

ronaldtse commented 4 years ago

The systems in #154, #106, #86, and #93 all define conditional rules of this form:

  1. If the output character set supports X, do not convert X.
  2. If the output character set does not support X, transliterate X to Y.

This means we need to incorporate the concept of an "output character set", which is an attribute of a "spelling system". We would then need to differentiate three cases: "supported characters in a system" (to convert), "unsupported/foreign characters" (possibly not to convert), and "conditional support per conversion rule".
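To make the three cases concrete, here is a rough Ruby sketch of one possible reading. The `classify` helper, its keyword arguments, and the symbols it returns are all hypothetical and not part of the `interscript` gem. It assumes a character is "foreign" when the conversion system has no rule for it, and "conditional" when a rule exists but the output character set already supports the character.

```ruby
require "set"

# Hypothetical helper, not an interscript API.
# source_chars:   characters the conversion system has rules for
# output_charset: characters the target spelling system can represent
def classify(char, source_chars:, output_charset:)
  if !source_chars.include?(char)
    :foreign      # no rule for this character; possibly leave it unconverted
  elsif output_charset.include?(char)
    :conditional  # a rule exists, but the output already supports the character
  else
    :convert      # a rule exists and the output cannot represent the character
  end
end

# Illustrative data only.
german_rule_chars = Set["ß", "ä", "ö", "ü"]
ascii_letters     = Set.new([*("a".."z"), *("A".."Z")])

classify("ß", source_chars: german_rule_chars, output_charset: ascii_letters)
classify("ß", source_chars: german_rule_chars, output_charset: ascii_letters | Set["ß"])
classify("я", source_chars: german_rule_chars, output_charset: ascii_letters)
```

Under these assumptions the three calls fall into the convert, conditional, and foreign cases respectively. Whether "conditional" should be decided per character or per rule is still an open question here.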

This also means we need to support the concept of a "spelling system". For example, the "modern English" system supports the characters [0-9a-zA-Z] (plus punctuation). If we wanted to use the conversion system of #93 (German) with "modern English" as the target, we would need to render "ß" as "ss".
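A minimal sketch of how a spelling system with an output character set could drive the two conditions above. Everything here is an assumption for illustration: `SpellingSystem`, `render`, the punctuation included in `MODERN_ENGLISH`, and the fallback table are hypothetical names and data, not the gem's API or the actual rules of #93.

```ruby
require "set"

# Hypothetical model: a spelling system is a name plus the set of
# characters it is able to output.
SpellingSystem = Struct.new(:name, :charset) do
  def supports?(char)
    charset.include?(char)
  end
end

# Assumed character set: [0-9a-zA-Z], a space, and a little punctuation.
MODERN_ENGLISH = SpellingSystem.new(
  "modern English",
  Set.new([*("0".."9"), *("a".."z"), *("A".."Z"), " ", ".", ",", "!", "?", "-", "'"])
)

# Illustrative fallback rules only.
GERMAN_FALLBACKS = { "ß" => "ss", "ä" => "ae", "ö" => "oe", "ü" => "ue" }.freeze

# Condition 1: if the target supports the character, keep it as-is.
# Condition 2: otherwise apply the fallback rule; characters without a
# rule are passed through unchanged.
def render(text, target, fallbacks)
  text.each_char.map do |ch|
    target.supports?(ch) ? ch : fallbacks.fetch(ch, ch)
  end.join
end

puts render("Straße", MODERN_ENGLISH, GERMAN_FALLBACKS)
```

With a target whose character set includes "ß", the same call would leave the word untouched. That is the point of attaching the character set to the spelling system rather than to the individual rule.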

Thoughts?

Originally posted by @ronaldtse in https://github.com/interscript/interscript/issues/106#issuecomment-728642790