drdhaval2785 / SanskritSorting

Codes written by Dr. Dhaval Patel for Sanskrit Natural Language Programming

Special Case: Sort Likhushina's Root List #40

Open gasyoun opened 9 years ago

gasyoun commented 9 years ago

@drdhaval2785 I've been working on a Sanskrit reader since 2011. The InDesign layout and editing were finished today. There is an InDesign JavaScript that extracts all the roots from the book and adds the page numbers. By default the result is in English alphabetical order.

√arṣ — 102
√as — 44
ati-√kram — 83, 134
ati-√muc — 106
ati-√ric — 85
ati-√vart (√vṛt) — 145
vi-√rah — 66, 120
vi-√ram — 125, 152
vi-√ru — 167
vi-√sarp (√sṛp) — 114

Can we sort it in proper Devanagari order, please? Anything inside parentheses, such as (√vṛt), should be ignored when sorting, as should - and .

drdhaval2785 commented 9 years ago

@gasyoun Didn't we agree that the input would be in SLP1? The more transliteration code there is, the more issues there are.

gasyoun commented 9 years ago

Sure, I'll convert the IAST in the book to SLP1; that's not an issue at all.

√arz — 102
√as — 44
ati-√kram — 83, 134
ati-√muc — 106
ati-√ric — 85
ati-√vart (√vft) — 145
vi-√rah — 66, 120
vi-√ram — 125, 152
vi-√ru — 167
vi-√sarp (√sfp) — 114
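
The IAST-to-SLP1 conversion mentioned above can be sketched in a few lines of Python. This is only an illustration, not code from this repository: the names `IAST_TO_SLP1` and `iast_to_slp1` are hypothetical, the table assumes lowercase IAST input, and it treats every "ai" and "au" as a diphthong.

```python
import re
import unicodedata

# IAST -> SLP1 pairs. Letters that are identical in both schemes
# (a, i, u, e, o, k, g, c, j, t, d, n, p, b, m, y, r, l, v, s, h)
# are not listed and pass through unchanged.
_RAW = {
    "ā": "A", "ī": "I", "ū": "U", "ṛ": "f", "ṝ": "F", "ḷ": "x", "ḹ": "X",
    "ai": "E", "au": "O", "ṃ": "M", "ḥ": "H",
    "kh": "K", "gh": "G", "ṅ": "N", "ch": "C", "jh": "J", "ñ": "Y",
    "ṭh": "W", "ṭ": "w", "ḍh": "Q", "ḍ": "q", "ṇ": "R",
    "th": "T", "dh": "D", "ph": "P", "bh": "B", "ś": "S", "ṣ": "z",
}
# Normalise the keys so they match NFC-normalised input text.
IAST_TO_SLP1 = {unicodedata.normalize("NFC", k): v for k, v in _RAW.items()}

# Longest keys first, so that "ṭh" is tried before "ṭ" and "kh" before "k".
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(IAST_TO_SLP1, key=len, reverse=True))
)


def iast_to_slp1(text):
    """Convert lowercase IAST to SLP1; other characters are left untouched."""
    text = unicodedata.normalize("NFC", text)
    return _PATTERN.sub(lambda m: IAST_TO_SLP1[m.group(0)], text)


print(iast_to_slp1("ati-√vart (√vṛt) — 145"))
```

Symbols such as √, the hyphen, and the page numbers are not in the table, so they survive the conversion as they are.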

drdhaval2785 commented 9 years ago

How does the following look?

ati-√kram — 83, 134
ati-√muc — 106
ati-√ric — 85
ati-√vart (√vṛt) — 145
√arṣ — 102
√as — 44
vi-√ram — 125, 152
vi-√rah — 66, 120
vi-√ru — 167
vi-√sarp (√sṛp) — 114

This is the output of multi13.php without altering any code. If needed, we can amend the code.
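
For readers who want to reproduce this ordering without multi13.php, here is a minimal Python sketch of the idea. It is not the multi13.php code: `SLP1_ORDER`, `RANK` and `sort_key` are hypothetical names, the input is assumed to be SLP1, and the placement of anusvara and visarga right after the vowels is one convention among several.

```python
import re

# SLP1 letters in Devanagari alphabetical order (vowels, anusvara,
# visarga, then consonants by class).
SLP1_ORDER = "aAiIuUfFxXeEoOMHkKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"
RANK = {ch: i for i, ch in enumerate(SLP1_ORDER)}


def sort_key(entry):
    """Rank list for one 'prefix-√root (variant) — pages' line."""
    head = entry.split("—")[0]             # drop the page numbers
    head = re.sub(r"\([^)]*\)", "", head)  # ignore anything inside ()
    # Characters outside the SLP1 alphabet (-, ., √, spaces) are skipped.
    return [RANK[ch] for ch in head if ch in RANK]


entries = [
    "√arz — 102",
    "√as — 44",
    "ati-√kram — 83, 134",
    "ati-√muc — 106",
    "ati-√ric — 85",
    "ati-√vart (√vft) — 145",
    "vi-√rah — 66, 120",
    "vi-√ram — 125, 152",
    "vi-√ru — 167",
    "vi-√sarp (√sfp) — 114",
]

for line in sorted(entries, key=sort_key):
    print(line)
```

On the ten sample lines this gives the same order as the list above: the ati- entries first, then √arz and √as, then vi-√ram before vi-√rah.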

gasyoun commented 9 years ago

Would it be possible to have an option to sort by the root that follows √? For example, to list all the upasargas under √muc?

drdhaval2785 commented 9 years ago

A sample list and a sample sorted list, please.

gasyoun commented 9 years ago

Input:

ati-√kram — 83, 134
ati-√muc — 106
apa-√muc — 108
ati-√vart (√vṛt) — 145
√arṣ — 102
√as — 44
vi-√ram — 125, 152
vi-√rah — 66, 120
vi-√ru — 167
vi-√sarp (√sṛp) — 114
vy-ati-√muc — 106

Output:

√kram — 83
ati- 134
√muc
ati- 106
apa- 108
√vart (√vṛt)
ati- 145
√arṣ — 102
√as — 44
√ram
vi- 125, 152
√rah
vi- 66, 120
√ru
vi- 167
√sarp (√sṛp)
vi- 114
√muc
vy-ati- 106
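
The grouping asked for here can be sketched in Python as follows. This is an assumption-laden illustration, not code from this repository: `group_by_root` and `render` are hypothetical names, roots are kept in order of first appearance, and every entry of a root is merged into one group (so vy-ati- lands under the first √muc rather than in a separate block as in the sample).

```python
def group_by_root(entries):
    """Map each '√root' label to a list of (prefix, pages) pairs."""
    groups = {}
    for entry in entries:
        head, _, pages = entry.partition("—")
        # Everything before the first √ is the upasarga prefix (may be empty).
        prefix, _, root = head.strip().partition("√")
        groups.setdefault("√" + root, []).append((prefix, pages.strip()))
    return groups


def render(groups):
    """Turn the groups into output lines, one root heading per group."""
    lines = []
    for root, items in groups.items():
        bare = [pages for prefix, pages in items if not prefix]
        # A root used without a prefix keeps its pages on the heading line.
        lines.append(f"{root} — {', '.join(bare)}" if bare else root)
        lines.extend(f"{prefix} {pages}" for prefix, pages in items if prefix)
    return lines


sample = [
    "ati-√kram — 83, 134",
    "ati-√muc — 106",
    "apa-√muc — 108",
    "√as — 44",
    "vy-ati-√muc — 106",
]
print("\n".join(render(group_by_root(sample))))
```

Sorting the roots, or the prefixes under each root, in Devanagari order would be a separate step applied to the dictionary keys and to each list.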

gasyoun commented 9 years ago

Not satisfied with the sample?

drdhaval2785 commented 9 years ago

Not a high priority.