sanskrit-lexicon / COLOGNE

Development of http://www.sanskrit-lexicon.uni-koeln.de/

CLI tool for transliteration #350

Open drdhaval2785 opened 3 years ago

drdhaval2785 commented 3 years ago

Noting a recently developed command-line tool based on the indic_transliteration Python package.

https://github.com/vipranarayan14/sanscript-cli

Thought it would be useful for converting full-sized texts from one transliteration to another on the command line.

funderburkjim commented 3 years ago

Current transcoding methodology

Dhaval - I want to be sure you are aware of the current methodology, which is based on transcoder.py.

A recent usage example generated an IAST version of MW: see the mw_transcode example described in this readme.

An interesting exercise would be to duplicate mw_transcode.py functionality using one of the tools you mention. Maybe you can give this a try when your time permits.

Pluses for indic_transliteration

More Indian-language transcodings are already worked out there. These could be done with transcoder.py (e.g. SLP1 to Tamil), but would require development of new transcoder mappings (e.g. slp1_tamil.xml).

Also, if indic_transliteration provides conversion from any transcoding X to any other transcoding Y, then this flexibility exceeds the current state of transcoder.py.
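To illustrate the any-to-any point: one common way to get conversion between arbitrary pairs is to route every scheme through a single pivot, so that N schemes need N tables rather than one file per ordered pair. The sketch below is only a toy illustration. The names PIVOT_TABLES and convert are hypothetical, the tables cover just a few single-character letters, and it does not claim to describe how indic_transliteration or transcoder.py work internally.

```python
# Toy tables: each scheme maps its own symbols to a shared pivot (SLP1 here).
# Only a few single-character letters are covered; real schemes also need
# multi-character tokens (e.g. IAST "kh"), which this sketch ignores.
PIVOT_TABLES = {
    "slp1": {"a": "a", "A": "A", "i": "i", "v": "v", "S": "S", "z": "z"},
    "hk":   {"a": "a", "A": "A", "i": "i", "v": "v", "z": "S", "S": "z"},
    "iast": {"a": "a", "ā": "A", "i": "i", "v": "v", "ś": "S", "ṣ": "z"},
}


def convert(text, source, target):
    """Convert text from one scheme to another by way of the pivot."""
    to_pivot = PIVOT_TABLES[source]
    # Invert the target table so it maps pivot symbols back to target symbols.
    from_pivot = {pivot: sym for sym, pivot in PIVOT_TABLES[target].items()}
    out = []
    for ch in text:
        if ch in to_pivot:
            pivot = to_pivot[ch]
            out.append(from_pivot.get(pivot, pivot))
        else:
            # Characters outside the toy tables pass through unchanged.
            out.append(ch)
    return "".join(out)


print(convert("śiva", "iast", "hk"))
print(convert("viza", "slp1", "iast"))
```

Adding a fourth scheme to this sketch means adding one table, after which it can be paired with every existing scheme in both directions.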

A possible minus of indic_transliteration

Does indic_transliteration handle Devanagari accents (udAtta, etc.)? slp1_deva.xml and deva_slp1.xml do handle these.

Creation of language versions of the dictionaries with accents (MW, PWG, a couple of others) could be done by simply ignoring the accents; and such accent-free versions might be useful.
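A minimal sketch of the "ignore the accents" step, applied to SLP1 text before it is handed to any transliterator. It assumes accents are written as '/' (udAtta), '\' (anudAtta) and '^' (svarita); that set should be checked against slp1_deva.xml before relying on it. The name strip_accents is hypothetical. The function is meant for plain headword or text strings, not for whole dictionary lines containing markup, where '/' also occurs in closing tags.

```python
# Assumed SLP1 accent characters: '/' udAtta, '\' anudAtta, '^' svarita.
ACCENT_CHARS = "/\\^"

# Translation table that deletes every assumed accent character.
_STRIP_TABLE = str.maketrans("", "", ACCENT_CHARS)


def strip_accents(slp1_text):
    """Return SLP1 text with the assumed accent characters removed."""
    return slp1_text.translate(_STRIP_TABLE)


print(strip_accents("agni/"))
```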

grantha

There is a special issue with grantha. At Thomas' request, I worked on slp1_grantha transcoding a few years ago (can find it if of interest). At that time, a special hard-to-get font was required, and there was something odd about how the Unicode had to be constructed in order to work with the font.

transcoder.php

One advantage of the current transcoder approach is that there is a PHP version; this version uses the same transcoder files (like slp1_deva.xml) as transcoder.py.
I wonder whether the mappings of indic_transliteration could be converted to transcoder files. If so, we could add other languages to the dictionary web displays.

Peter Scharf has developed some transcodings not present in the Cologne system: see https://sanskritlibrary.org/preferences.html. He uses transcoding files that are similar to the Cologne transcoding files but differ slightly in format.

Javascript transcoding

The list display at Cologne uses a JavaScript version of transcoding: transcoder.js.

This transcoding is also based on the transcoder XML files. I am not sure whether indic_transliteration also has a compatible JavaScript version.

Note: Hope you are well. Our thoughts and prayers are with you especially during the great Covid difficulty now in India.

gasyoun commented 3 years ago

slp1_grantha transcoding a few years ago (can find it if of interest)

Every single line of your work is of interest to me.

funderburkjim commented 3 years ago

Uploaded 'granthawork' to the sanskrit-transcoding repository.

There are numerous html examples. See https://github.com/funderburkjim/sanskrit-transcoding/issues/4

gasyoun commented 3 years ago

uploaded 'granthawork' in sanskrit-transcoding repository.

I wonder how much un-uploaded code still remains...

vipranarayan14 commented 3 years ago

Noting a recently developed command line tool based on indic_transliteration python package.

vipranarayan14/sanscript-cli

Thought it would be useful for us to convert full sized texts from one to another transliteration in CLI.

@drdhaval2785 I am the developer of the CLI. It has been merged into the main indic_transliteration package, so installing the package also installs the CLI tool. Please refer to sanskrit-coders/indic_transliteration#cli for more information. I am going to archive the vipranarayan14/sanscript-cli repository.

drdhaval2785 commented 3 years ago

Thanks for the update. It is great that the CLI is in the same repository as the package.