Closed chakrabortydeepro closed 2 years ago
@chakrabortydeepro
I don't think tags would ever have been preserved. For security reasons, I remove most tags before converting the text. I suppose I could add the TEI tags to the exception list.
Also: Have you tried using the file converter and uploading your XML there? It should preserve the tags (for Indic input at least).
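To make the "exception list" idea concrete, here is a minimal sketch of allowlist-based tag stripping. It is not Aksharamukha's actual sanitizer: the function name, the regular expression, and the choice of TEI element names in `ALLOWED_TAGS` are all assumptions made for illustration.

```python
import re

# Hypothetical exception list: TEI element names that should survive stripping.
ALLOWED_TAGS = {"l", "lg", "p", "pb", "hi"}

# Matches an opening, closing, or self-closing tag and captures its element name.
TAG_PATTERN = re.compile(r"</?([A-Za-z][\w.:-]*)[^<>]*>")


def strip_disallowed_tags(text, allowed=ALLOWED_TAGS):
    """Remove every tag whose element name is not in `allowed`.

    The text between tags is left untouched; only the tags themselves are dropped.
    """
    def keep_or_drop(match):
        # Keep the whole tag verbatim if its name is on the exception list.
        return match.group(0) if match.group(1) in allowed else ""

    return TAG_PATTERN.sub(keep_or_drop, text)
```

Note that this keeps the attributes of allowed tags as they are, so a real sanitizer would probably need to filter attributes as well.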
Since you already do something to handle the tags (quietly removing them from the output), you could add an option to leave the tags unchanged when converting from Roman to other scripts.
This gets complicated: I would need to parse the XML and convert only the contents while preserving the tags and the attributes. I'll add it to my long list of TODOs and will revisit it at some point :)
Vinodh
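The approach described above (parse the XML, convert only the contents, keep tags and attributes) could be sketched roughly as follows with Python's standard `xml.etree.ElementTree`. This is an illustration under assumptions, not the fix that eventually shipped: `convert_xml_text` is a hypothetical helper, and `convert` stands in for whatever transliteration function is used.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def convert_xml_text(xml_source: str, convert: Callable[[str], str]) -> str:
    """Apply `convert` to the text content of an XML document only.

    Element names and attributes are never passed to `convert`, so they
    come out unchanged. `convert` is a placeholder for the transliterator.
    """
    root = ET.fromstring(xml_source)
    for element in root.iter():
        # Text directly after the element's opening tag.
        if element.text and element.text.strip():
            element.text = convert(element.text)
        # Text directly after the element's closing tag.
        if element.tail and element.tail.strip():
            element.tail = convert(element.tail)
    return ET.tostring(root, encoding="unicode")
```

With the `aksharamukha` Python package installed, `convert` could presumably wrap its `transliterate.process` function; that wiring is left out here. Two caveats of this sketch: ElementTree's default parser drops comments and processing instructions, and it may rewrite namespace prefixes on output, so a namespaced TEI file would need extra handling.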
Thanks, Vinodh. Yes, the file converter works. I probably converted through file upload previously.
@chakrabortydeepro Could you please send me a sample TEI xml file with roman input?
V
Here is one.
Suśruta project, University of Alberta
Folios 4r and 4v are missing|
Folio no 12r and 12v missing| Additions SS|1|13|21ab SS|1|13|21cd SS|1|13|23ef SS|1|13|23gh
athātaḥ śiṣyopanayanīyam adhyāyaṃ
vrāhmaṇakṣatriyavaisyānām anyatamam anvayaḥ | vayaḥ
śūdram api guṇavantam anupanītam adhyāpayed ity eke |
upanayanīyan tu brāhmaṇaṃ praśa
thaktāni dārvihomikenāgnim
upasamādhāyā
You can extract a piece that transliterates XML from https://github.com/twardoch/udhr-custom/blob/main/tools/aksharamukha_transliterate.py
I'll probably contribute a small snippet.
@chakrabortydeepro I have fixed this. Should be available in two weeks with the next update.
Thanks, Vinodh. I'll send you the examples of the Bengali b soon.
-- Deepro Chakraborty (he/him) PhD candidate Department of History, Classics, and Religion University of Alberta
@virtualvinodh Dear Vinodh, Earlier the tags appeared in the output when I converted texts, but today I suddenly see that the tags disappear. For example,
used to be converted (Devanāgarī to Roman) as
but now it converts just as
Is there any way to retain the tags?
It would also be very useful if Aksharamukha handled the TEI tags in a slightly more sophisticated way. Tag retention was very useful when I was converting Devanāgarī into Roman, but converting from Roman into other scripts with the tags retained verbatim was not possible.
You may consider the following suggestions: