kmadathil / sanskrit_parser

Parsers for Sanskrit / संस्कृतम्
MIT License

s+s sandhi split not happening #184

Closed: sumanthegde closed this issue 1 year ago

sumanthegde commented 1 year ago

निस्सारः is not split whereas निस्तेजः is.

INFO:sanskrit_parser.api:Input String in SLP1: nissAra
~/Library/Python/3.8/lib/python/site-packages/sanskrit_parser/api.py:149: UserWarning: No splits found. Please check the input to ensure there are no typos.
  warnings.warn("No splits found. Please check the input to ensure there are no typos.")

INFO:sanskrit_parser.api:Input String in SLP1: nistejaH
[([nistejas], -1), ([nis, tejas], -2), ([ni, ste, jas], -3)]

kmadathil commented 1 year ago

Thanks for reporting this, Sumant. I can reproduce it. Avinash, could you take a look? Both sAraH and nis are in the dictionary.

Sumant is building a Chrome extension for Sanskrit learners and wants to use sanskrit_parser. I'm including the Twitter thread of our discussion for reference.
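
To make the expectation concrete: since both nis and sAraH are in the dictionary, even a naive lexicon-driven splitter would find the split. The sketch below is a toy illustration only, not sanskrit_parser's algorithm. The mini-lexicon and the function name `naive_splits` are hypothetical, and it handles only plain concatenation, which is enough here because the s+s junction in this input leaves the surface string unchanged.

```python
from typing import List, Set

# Hypothetical mini-lexicon in SLP1; the real parser uses its own dictionaries.
LEXICON: Set[str] = {"nis", "sAraH", "tejaH"}


def naive_splits(text: str, lexicon: Set[str] = LEXICON) -> List[List[str]]:
    """Return every way to cut `text` into consecutive lexicon entries.

    No sandhi rules are applied: words are matched as plain substrings.
    """
    if not text:
        # An empty remainder means the words chosen so far cover the input.
        return [[]]
    results: List[List[str]] = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            # Keep this word and recurse on what is left of the string.
            for tail in naive_splits(text[end:], lexicon):
                results.append([head] + tail)
    return results


print(naive_splits("nissAraH"))
print(naive_splits("nistejaH"))
```

With this toy lexicon both inputs yield a two-word split, which is the behaviour the report expects for nissAraH as well.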

sumanthegde commented 1 year ago

Great. Btw, given the UX aspect of the Chrome extension, I totally upvote this <100ms target ticket!

I'm also curious about ṣatva and ṇatva support. Consider गणपतिमभिषिञ्चति (gaṇapatimabhiṣiñcati). Splitting it as गणपतिम् अभिषिञ्चति is enough for my app, since its focus is on kṛdantas and tiṅantas. गणपतिम् अभि सिञ्चति would be even better. The same goes for ṇatva cases.
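
As a rough illustration of the "even better" split, here is a toy sketch of undoing ṣatva on SLP1 input, where ṣ is written `z`. It assumes the simplified rule that a verb-initial s may surface as ṣ after a prefix ending in i or u. The prefix list and the function name `undo_satva` are hypothetical, nothing here is part of sanskrit_parser, and the sketch does not check that the restored form is a real verb.

```python
from typing import List, Optional

# Hypothetical prefix list in SLP1, limited to prefixes ending in i or u.
PREFIXES_IU = ["aBi", "pari", "prati", "ni", "vi", "anu", "su"]


def undo_satva(word: str) -> Optional[List[str]]:
    """If `word` looks like prefix + z-initial verb, return [prefix, s-initial verb].

    Returns None when no listed prefix is followed by `z`.
    """
    # Try longer prefixes first so a short prefix cannot shadow a longer one.
    for prefix in sorted(PREFIXES_IU, key=len, reverse=True):
        if word.startswith(prefix):
            rest = word[len(prefix):]
            if rest.startswith("z"):
                # Restore the underlying dental s in place of retroflex z.
                return [prefix, "s" + rest[1:]]
    return None


print(undo_satva("aBiziYcati"))
```

For `aBiziYcati` (abhiṣiñcati) this proposes `aBi` + `siYcati`. A ṇatva counterpart would need to look further back in the word for the triggering r or ṣ, so it is not attempted here.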

kmadathil commented 1 year ago

This should be fixed by PR 186, which has been merged.