indic-transliteration / sanscript.js

Transliteration package for Indian scripts
MIT License

Change names of Brahmic Scripts as per Unicode #36

Open ManasMadrecha opened 3 years ago

ManasMadrecha commented 3 years ago

Why?

  1. Unicode has already standardized what the scripts should be called.
  2. Maintaining a separate array of sanscript-specific script names duplicates that work.
  3. Using Intl.DisplayNames in JS, we can easily get the standard name of a script from its code. The codes can then be passed programmatically as the second and third arguments of Sanscript.t(text, "here", "here").
  4. The Unicode script codes can also be used directly in HTML's lang attribute, as in hi-Deva or hi-Latn.
  5. Sanscript's own script names such as devanagari, bengali, etc. serve no purpose outside the library. They cannot be used inside HTML's lang attribute. Also, if we want the full names of the scripts, we can always use Intl.DisplayNames, but it does not work with sanscript's script names.
  6. Scripts that don't have Unicode support should keep the names currently in use, since they will never appear inside HTML's lang attribute anyway, e.g. sanskritOCR.
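As a rough illustration of points 3 and 4, here is a minimal sketch using the standard Intl.DisplayNames and Intl.Locale APIs. The helper names scriptDisplayName and scriptOfLangTag are made up for this example and are not part of sanscript.

```javascript
// Formatter that turns a four-letter script code into its English display name.
const scriptNames = new Intl.DisplayNames(["en"], { type: "script" });

// Hypothetical helper: "Deva" -> human-readable script name.
function scriptDisplayName(code) {
  return scriptNames.of(code);
}

// Hypothetical helper: pull the script subtag out of a language tag
// such as "hi-Deva". Returns undefined when the tag has no script subtag.
function scriptOfLangTag(tag) {
  return new Intl.Locale(tag).script;
}
```

With these, scriptOfLangTag("hi-Deva") yields "Deva", which could be handed to the transliterator if it accepted such codes, and scriptDisplayName("Deva") yields the full name for display.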

Approach

Of course, this will be a breaking change, so for the time being, you can simply copy the current .json files in the brahmic folder to new files with Unicode-based names.

For example, for the gujarati, bengali, etc. scripts, you can create new Gujr.json and Beng.json files with the same content as gujarati.json and bengali.json, respectively.
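A different way to get the same effect without duplicating files would be a small alias table in front of the existing names. The sketch below is only an assumption about how that might look: SCRIPT_ALIASES, resolveScheme and transliterate are hypothetical names, and the only sanscript call it relies on is Sanscript.t(text, from, to) as mentioned above.

```javascript
// Hypothetical alias table: Unicode script code -> existing sanscript name.
// Only the three scripts named in this issue are listed here.
const SCRIPT_ALIASES = {
  Deva: "devanagari",
  Beng: "bengali",
  Gujr: "gujarati",
};

// Map a script code to the sanscript name; pass anything else through
// unchanged, so existing names and schemes like sanskritOCR keep working.
function resolveScheme(name) {
  return Object.prototype.hasOwnProperty.call(SCRIPT_ALIASES, name)
    ? SCRIPT_ALIASES[name]
    : name;
}

// Thin wrapper: `sanscript` is whatever object exposes t(text, from, to).
function transliterate(sanscript, text, from, to) {
  return sanscript.t(text, resolveScheme(from), resolveScheme(to));
}
```

Because unknown names pass through untouched, callers using the current names would see no change in behaviour.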

vvasuki commented 3 years ago

Let's make this a non-breaking change by ensuring the following:

If the above is clear, please go ahead and send a pull request.