twardoch closed this 3 years ago
P.S. Of course, stub/replacement modules can be provided, as is already done.
P.S. @tatuylonen maintains https://github.com/tatuylonen/wiktextract/ and https://github.com/tatuylonen/wikitextprocessor/ and I’ve opened an issue there, https://github.com/tatuylonen/wiktextract/issues/68 , asking about updating the Wiktionary transliteration modules.
I see that tatuylonen’s modules also use some "safe" workarounds to run Lua code. Perhaps your Wiktra could somehow integrate with wikitextprocessor/wiktextract so that the same techniques are used?
I see that it’s possible to download a module’s raw Lua source by requesting the page with action=raw, for example https://en.wiktionary.org/w/index.php?title=Module:scripts&action=raw
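The raw-download URL above can be generalized to any module. Here is a minimal Python sketch, not what wiktra-update actually does: the function names `raw_module_url` and `fetch_module_source` and the User-Agent string are made up for illustration, and only the URL pattern comes from the link above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

INDEX_URL = "https://en.wiktionary.org/w/index.php"


def raw_module_url(module_name: str) -> str:
    """Build the action=raw URL for a Wiktionary Lua module (hypothetical helper)."""
    # Keep ":" and "/" unescaped so the URL matches the form used on Wiktionary,
    # including subpages such as "translit-redirect/data".
    query = urlencode({"title": f"Module:{module_name}", "action": "raw"}, safe=":/")
    return f"{INDEX_URL}?{query}"


def fetch_module_source(module_name: str, timeout: float = 30.0) -> str:
    """Download the raw Lua source of one module (performs a network request)."""
    request = Request(
        raw_module_url(module_name),
        # Illustrative User-Agent; replace with one that identifies your tool.
        headers={"User-Agent": "wiktra-module-fetch-example/0.1"},
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_module_source("scripts")` should return the same text as the link above; saving the result to a `.lua` file per module would be the obvious next step for an update script.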
I’ve created a Python tool that downloads Wiktionary modules: https://github.com/twardoch/wiktra-update
I made this into a pull request https://github.com/kbatsuren/wiktra/pull/4 from my own fork https://github.com/twardoch/wiktra/
Thank you so much for the detailed suggestions. I will have a look at it as soon as I come back from my vacation.
OK, I think I can close this now :)
How feasible would it be to update this to cover all Wiktionary transliterations, and possibly expose additional functionality?
Update to cover all transliteration modules: https://en.wiktionary.org/wiki/Category:Transliteration_modules
Change the API so we can specify the input ISO 639 language code and/or ISO 15924 script code, possibly only the ISO script code without specifying the language.
Change the API so we can optionally specify the output language and/or script code, so we can transliterate to other scripts/languages, not just Latin, using modules like https://en.wiktionary.org/wiki/Module:Newa-Deva-translit
Ideally, language and script lookups would be done using the native Lua modules (see dependencies below).
Provide instructions or some script that automatically updates the Lua components from Wiktionary (I don’t know how this is done).
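To make the API requests above more concrete, here is a rough Python sketch of how a language code, a script code and an optional target script might be mapped to module names. Everything in it is an assumption: `candidate_module_names` is a hypothetical helper, not part of Wiktra, and the naming patterns are only inferred from module names such as Newa-Deva-translit. A real implementation would more likely consult the Wiktionary data modules.

```python
from typing import List, Optional


def candidate_module_names(
    lang: Optional[str] = None,       # ISO 639 language code, e.g. "ru"
    script: Optional[str] = None,     # ISO 15924 script code, e.g. "Cyrl"
    to_script: Optional[str] = None,  # target ISO 15924 code; None means Latin
) -> List[str]:
    """Return module names to try, most specific first (assumed naming scheme)."""
    if lang is None and script is None:
        raise ValueError("specify at least a language code or a script code")

    # Script-to-script modules: assumed pattern "<from>-<to>-translit".
    if to_script is not None and to_script != "Latn":
        if script is None:
            raise ValueError("a source script is needed for script-to-script modules")
        return [f"{script}-{to_script}-translit"]

    # Romanization: try the language-specific module, then the script-wide one.
    names = []
    if lang is not None:
        names.append(f"{lang}-translit")
    if script is not None:
        names.append(f"{script}-translit")
    return names
```

With this scheme, `candidate_module_names(script="Newa", to_script="Deva")` yields the module linked above, and passing only a script code still produces a candidate, which covers the case of transliterating without specifying the language.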
Possible dependencies
Integrate languages
Integrate families
Integrate scripts
Integrate translit-redirect/data
Dependencies: