ambuda-org / vidyut

Infrastructure for Sanskrit software. For Python bindings, see `vidyut-py`.

vidyut-lipi doesn't seem to support Vedic accents correctly #86

Closed deepestblue closed 9 months ago

deepestblue commented 9 months ago

I'm not sure what the level of support for Vedic accents is, but even basic support seems to be lacking.

I tried transliterating 𑌸॒𑌹𑌸𑍍𑌰᳴𑌶𑍀𑌰𑍍𑌷𑌾॒ 𑌪𑍁𑌰𑍁᳴𑌷𑌃 । from Grantha to Devanagari.

Expected: स॒हस्र॑शीर्षा॒ पुरु॑षः ।

Actual: स॒हस्र᳴शीर्षा॒ पुरु᳴षः ।

akprasad commented 9 months ago

Thanks for filing this issue! Basic Devanagari anudatta and svarita are supported from ITRANS, but that's about it right now.

I notice that your Grantha text uses U+1CF4 (VEDIC TONE CANDRA ABOVE) where Devanagari uses U+0951 (DEVANAGARI STRESS SIGN UDATTA). I'm happy to create this mapping -- do you have a source (no matter how technical, though English is preferred) that discusses the use of Grantha accents, so that I can link it from the comments?
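For anyone wanting to check which accent code points a string actually contains, here is a minimal Rust sketch using only the standard library. The helper name `accent_marks` is hypothetical and is not part of vidyut-lipi; it only looks at U+0951..U+0952 and the Vedic Extensions block (U+1CD0..U+1CFF), which is an assumption about which marks matter here.

```rust
/// Returns the accent code points found in `text`, formatted as "U+XXXX".
/// Only U+0951..=U+0952 (Devanagari stress signs) and U+1CD0..=U+1CFF
/// (Vedic Extensions block) are reported; everything else is skipped.
fn accent_marks(text: &str) -> Vec<String> {
    text.chars()
        .filter(|c| matches!(*c, '\u{0951}'..='\u{0952}' | '\u{1CD0}'..='\u{1CFF}'))
        .map(|c| format!("U+{:04X}", c as u32))
        .collect()
}

fn main() {
    // Grantha KA + U+1CF4, written with escapes so the mark is unambiguous.
    let grantha = "\u{11315}\u{1CF4}";
    // Devanagari KA + U+0951.
    let devanagari = "\u{0915}\u{0951}";
    println!("{:?}", accent_marks(grantha));
    println!("{:?}", accent_marks(devanagari));
}
```

Running this on the input and output of a transliteration makes it easy to see whether an accent mark was passed through unchanged instead of being remapped.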

akprasad commented 9 months ago

@deepestblue following up here, as I am fixing some other accent behavior (e.g. in #89).

deepestblue commented 9 months ago

Sorry, didn't get notified for the reply until you tagged me specifically.

https://unicode.org/L2/L2009/09372-grantha.pdf, Sec 4.3

akprasad commented 9 months ago

I've locally updated my documentation and fixed this bug as well. I'll close this issue when the fix is merged.

Thanks for filing these bugs, and I would be very grateful if you file more.

akprasad commented 9 months ago

@deepestblue what is the Grantha equivalent of अ᳚ (double svarita)? I could not find it after spending some time with the doc you linked. Aksharamukha has 𑌅॑ but I'm not sure why.

deepestblue commented 9 months ago

It's actually U+0951, and is called dirghasvarita in the doc.

Essentially, Devanagari क॒ round-trip maps to Grantha 𑌕॒, क॑ to 𑌕᳴, and क᳚ to 𑌕॑.
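As a rough illustration of that three-way correspondence, here is a minimal Rust sketch that maps only the accent marks, one character at a time. The function names are hypothetical, and this is not a description of how vidyut-lipi implements it; converting the base letters is out of scope.

```rust
/// Maps a Devanagari accent mark to its Grantha counterpart, using the three
/// pairs listed above. Any other character passes through unchanged.
fn deva_accent_to_grantha(c: char) -> char {
    match c {
        '\u{0951}' => '\u{1CF4}', // svarita: U+0951 -> U+1CF4
        '\u{1CDA}' => '\u{0951}', // dirgha svarita: U+1CDA -> U+0951
        other => other,           // anudatta U+0952 is shared by both scripts
    }
}

/// Inverse of `deva_accent_to_grantha` for the same three marks.
fn grantha_accent_to_deva(c: char) -> char {
    match c {
        '\u{1CF4}' => '\u{0951}',
        '\u{0951}' => '\u{1CDA}',
        other => other,
    }
}

fn main() {
    let deva_marks = ['\u{0952}', '\u{0951}', '\u{1CDA}'];
    for &m in &deva_marks {
        let g = deva_accent_to_grantha(m);
        // Each mark should survive a round trip through Grantha.
        assert_eq!(grantha_accent_to_deva(g), m);
        println!("U+{:04X} -> U+{:04X}", m as u32, g as u32);
    }
}
```

Because U+0951 appears on both sides (as svarita in Devanagari and as dirgha svarita in Grantha), the mapping has to be applied in a single pass per character. Two sequential find-and-replace steps would rewrite the same mark twice.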

deepestblue commented 9 months ago

> Thanks for filing these bugs, and I would be very grateful if you file more.

Happy to, but I don't know Rust. Is there an easy way for me to run vidyut-lipi alone from a Unix commandline?

akprasad commented 9 months ago

Fixed and pushed. My testing was limited to anudatta, svarita, and dirgha svarita. Please re-open this issue if the problem persists or if there are other accent types you think are important to consider here.

> Is there an easy way for me to run vidyut-lipi alone from a Unix commandline?

Yes:

```shell
# Install Rust (via https://www.rust-lang.org/tools/install)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

git clone --depth 1 https://github.com/ambuda-org/vidyut.git
cd vidyut/vidyut-lipi
cargo run --release --bin lipi -- --from slp1 --to grantha "namaste loka"

# After the first build, the binary will be available at `../target/release/lipi`
# (relative to the `vidyut-lipi` directory)
```

I will also keep the online demo reasonably up-to-date, if you find that more convenient.

akprasad commented 9 months ago

Oh, and adding a ping for visibility -- @deepestblue