sanskrit-lexicon / Wil-YAT

Comparison of Wilson and Yates Headwords

Oddities in Yates Devanagari markup #3

Closed funderburkjim closed 7 years ago

funderburkjim commented 7 years ago

Two varieties of 'dot' in Devanagari

In the printed text, the Yates dictionary uses a 'dot' within Devanagari to indicate compound composition.

In the Cologne digitization, this 'dot' is coded in one of two ways: as a period or as a backslash.

funderburkjim commented 7 years ago

Suggest 'period' be changed to '-'

The period coding marks where two members of a compound are joined, usually where the joining involves no sandhi.

While we're making some IAST and meta-line changes to Yates, I think we should change these periods in key2 to hyphens. Reason: the hyphen is used by MW in much the same way this period is used by Yates. However, in MW and other European dictionaries, this markup is done for the IAST spelling of the word -- Yates seems unique in putting such markup in the Devanagari headword entry.
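
A minimal sketch of the proposed change, assuming key2 is available as a plain SLP1 string. The function name `period_to_hyphen` and the sample value `rAja.puruza` are made up for illustration and are not taken from yat.txt.

```python
def period_to_hyphen(key2: str) -> str:
    """Replace every compound-joining period in a key2 value with a hyphen."""
    return key2.replace(".", "-")


# Hypothetical key2 value, used only to show the effect of the replacement.
print(period_to_hyphen("rAja.puruza"))
```

This assumes the period has no other use inside key2; if it does, the replacement would need to be restricted to the compound-joining positions.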

Do others agree to change this period to hyphen?

funderburkjim commented 7 years ago

backslash for dot usually means vowel sandhi

There is a visible difference in the printed placement of the Yates dot in the cases that the digitization codes with a backslash. The general distinction appears to be that in such cases the Yates dot is placed definitely UNDER a Devanagari letter (usually a vowel), rather than between two Devanagari letters.

In MW, these examples are rendered (in IAST) with a circumflex over an 'a', which is MW's way of indicating a long vowel resulting from simple vowel sandhi (a + A -> A in the agAtmajA case). The digitization of MW (mw.xml) represents this with a special empty XML tag <srs/> (simple-replacement-sandhi, I think, is the acronym Peter came up with), as in agA<srs/>tmajA.


Why backslash is a poor choice

This is a technical reason: we represent Devanagari with the SLP1 transliteration; and in the SLP1 transliteration, a vowel+backslash indicates an accent (anudAtta, in particular). (/ = udAtta, ^ = svarita).
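
To make the clash concrete, here is a small sketch that flags backslashes which directly follow an SLP1 vowel and could therefore be read as anudAtta. The names `SLP1_ACCENTS`, `SLP1_VOWELS` and `risky_backslashes` are illustrative only, and the coded form `agA\tmajA` is an assumption about where the backslash sits in the digitization.

```python
# SLP1 accent marks, written after the vowel they modify.
SLP1_ACCENTS = {"/": "udAtta", "\\": "anudAtta", "^": "svarita"}

# SLP1 vowel letters (short/long a, i, u, vocalic r and l, then e, ai, o, au).
SLP1_VOWELS = set("aAiIuUfFxXeEoO")


def risky_backslashes(text: str) -> list[int]:
    """Return the positions of backslashes that immediately follow a vowel.

    In SLP1 such a backslash would be read as an anudAtta accent, so a
    sandhi marker coded the same way is ambiguous.
    """
    return [
        i
        for i, ch in enumerate(text)
        if ch == "\\" and i > 0 and text[i - 1] in SLP1_VOWELS
    ]


# Assumed coding of the agAtmajA example, with the backslash after the long A.
print(risky_backslashes("agA\\tmajA"))
```

Any position reported here is one where an SLP1-aware tool could mistake the sandhi marker for an accent.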

Maybe use underscore _ in place of backslash?

I don't think _ has any meaning in SLP1 (no example comes to mind). So SLP1 is neutral on this; and also, in yat.txt, there is currently no instance of _.
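
The "no instance of _" check is easy to repeat. A rough sketch, assuming yat.txt is a UTF-8 text file; the helper names `count_char` and `count_char_in_file` are made up here.

```python
from typing import Iterable


def count_char(lines: Iterable[str], ch: str) -> int:
    """Count the occurrences of a single character across all lines."""
    return sum(line.count(ch) for line in lines)


def count_char_in_file(path: str, ch: str) -> int:
    """Same count, reading the lines from a text file (UTF-8 assumed)."""
    with open(path, encoding="utf-8") as f:
        return count_char(f, ch)
```

A call such as `count_char_in_file("yat.txt", "_")` should return 0 before the change if the underscore is indeed unused.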

We could, in the conversion to yat.xml, convert this _ to the MW convention <srs/> or just leave it as underscore; and we could choose in displays to either ignore it or display it in some way - probably ignore it since there is no generally accepted way to display it. The Cologne MW displays use an 'asterisk' character to display <srs/>, but this is ugly, and certainly not intuitive.
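
The two options could look roughly like this. It is only a sketch: the function names are hypothetical, and the form `agA_tmajA` assumes the underscore sits where MW has <srs/>.

```python
def underscore_to_srs(key2: str) -> str:
    """For the yat.xml conversion: replace the underscore with MW's <srs/> tag."""
    return key2.replace("_", "<srs/>")


def strip_sandhi_marker(key2: str) -> str:
    """For displays: drop the marker, whether it is coded as _ or as <srs/>."""
    return key2.replace("<srs/>", "").replace("_", "")


print(underscore_to_srs("agA_tmajA"))
print(strip_sandhi_marker("agA_tmajA"))
```

Keeping the marker in the data and dropping it only at display time leaves the sandhi information available to anyone who wants it later.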

funderburkjim commented 7 years ago

Any opinions on these two arcane minutiae?

drdhaval2785 commented 7 years ago

Suggest 'period' be changed to '-'

I agree.

drdhaval2785 commented 7 years ago

Maybe use underscore _ in place of backslash?

Agree.

gasyoun commented 7 years ago

hyphen is used by MW in much the same way this period is used by Yates

Agree.

Yates dot is placed definitely UNDER a Devanagari letter

I thought Hertel invented the method in his HOS Panchatantra edition, but it was Yates 100 years before. Adore the method.

simple-replacement-sandhi

Oh, that's how to read it...

The Cologne MW displays use an 'asterisk' character to display <srs/>, but this is ugly, and certainly not intuitive.

Indeed.

funderburkjim commented 7 years ago

Have replaced the dots by hyphen and underscore, as described above.

gasyoun commented 7 years ago

Have replaced the dots by hyphen and underscore

And added a link to this issue inside the download folder readme, right? Changes need to be documented not only on the web; there should be a meta-list as well. We tend to forget them ourselves. Others are not even aware of most of them and can get confused.

funderburkjim commented 7 years ago

added a link to this issue

Mentioned it in yat-meta2.

Perhaps we should consider using the wiki feature of Github for such semi-permanent information about the dictionaries. e.g., in the Cologne repository wiki

gasyoun commented 7 years ago

Perhaps we should consider using the wiki feature of Github for such semi-permanent information about the dictionaries

Was thinking about a wider use of the wiki, exactly. Even I do not know which .txt one should read to understand how things are built. So much time is spent on documentation, and it still reaches 5-10 people at best.