Closed funderburkjim closed 7 years ago
The period character coding indicates where two compound words are joined, and usually where the joining involves no sandhi.
While we're making some IAST and meta-line changes to Yates, I think we should change these periods in key2 to hyphens. Reason: the hyphen is used by MW in much the same way this period is used by Yates. However, in MW and other European dictionaries, this markup is done for the IAST spelling of the word -- Yates seems unique in putting such markup in the Devanagari headword entry.
Do others agree to change this period to hyphen?
There is a visible difference in the printed placement of the Yates dot for those cases which the digitization codes with a backslash. The general distinction appears to be that in such cases the Yates dot is placed definitely UNDER a Devanagari letter (usually a vowel), rather than between two Devanagari letters.
In MW, these examples are rendered (in IAST) with a circumflex over an 'a', which is MW's way of indicating a long vowel resulting from simple vowel sandhi (a + A -> A in the agAtmajA case); and the digitization of mw (mw.xml) represents this with a special empty xml tag <srs/> (simple-replacement-sandhi I think is the acronym Peter came up with), e.g. agA<srs/>tmajA.
This is a technical reason: we represent Devanagari with the SLP1 transliteration; and in the SLP1 transliteration, a vowel+backslash indicates an accent (anudAtta, in particular). (/ = udAtta, ^ = svarita).
Maybe use underscore _ in place of backslash?
I don't think _ has any meaning in SLP1 (no example comes to mind). So SLP1 is neutral on this; and also, in yat.txt, there is currently no instance of _.
We could, in the conversion to yat.xml, convert this _ to the MW convention <srs/> or just leave it as underscore; and we could choose in displays to either ignore it or display it in some way - probably ignore it, since there is no generally accepted way to display it. The Cologne MW displays use an 'asterisk' character to display <srs/>, but this is ugly, and certainly not intuitive.
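The two options for the yat.xml conversion could be sketched roughly as follows. This is only an illustration, not the actual conversion code: the function names are hypothetical, and the sample string assumes the Yates headword is coded as agA_tmajA after the underscore change.

```python
def underscore_to_srs(key2: str) -> str:
    """Option 1: follow the MW convention and emit the empty <srs/> tag."""
    return key2.replace("_", "<srs/>")


def drop_sandhi_marker(key2: str) -> str:
    """Option 2 (display side): ignore the marker and show the plain word."""
    return key2.replace("_", "")


# Assumed sample headword, modeled on the MW example quoted above.
sample = "agA_tmajA"
print(underscore_to_srs(sample))   # agA<srs/>tmajA
print(drop_sandhi_marker(sample))  # agAtmajA
```

Leaving the underscore in yat.xml and applying something like drop_sandhi_marker only in displays would keep the information in the data while hiding it from readers.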
Any opinions on these two arcane minutiae?
Suggest 'period' be changed to '-'
I agree.
Maybe use underscore _ in place of backslash?
Agree.
hyphen is used by MW in much the same way this period is used by Yates
Agree.
Yates dot is placed definitely UNDER a Devanagari letter
I thought Hertel invented the method in his HOS Panchatantra edition, but it was Yates 100 years before. Adore the method.
simple-replacement-sandhi
Oh, that's how to read it...
The Cologne MW displays use an 'asterisk' character to display <srs/>, but this is ugly, and certainly not intuitive.
Indeed.
Have replaced the dots by hyphen and underscore, as described above.
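For the record, the replacement amounts to something like the sketch below. It is not the script that was actually run: the function name is hypothetical, the sample strings are illustrative rather than taken from yat.txt, and it assumes the function is applied only to the key2 field.

```python
def recode_yates_dots(key2: str) -> str:
    """Recode the two varieties of Yates dot in a key2 value.

    '.'  (dot between two letters, join without sandhi)  -> '-'
    '\\' (dot under a letter, join with vowel sandhi)    -> '_'
    """
    # yat.txt was reported to contain no '_' beforehand; fail loudly if
    # that assumption does not hold, so the new marker stays unambiguous.
    if "_" in key2:
        raise ValueError(f"unexpected underscore in key2: {key2!r}")
    return key2.replace(".", "-").replace("\\", "_")


# Illustrative inputs only (not verified against yat.txt).
print(recode_yates_dots(r"agA\tmajA"))   # agA_tmajA
print(recode_yates_dots("rAja.puruza"))  # rAja-puruza
```

The backslash has to go because in SLP1 a vowel followed by a backslash would be read as an anudAtta accent.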
Have replaced the dots by hyphen and underscore
And added a link to this issue inside the download folder readme, right? Changes need to be documented not only on the web; a meta-list should be there as well. We tend to forget them ourselves. Others are not even aware of most of them and can get confused.
added a link to this issue
Mentioned it in yat-meta2.
Perhaps we should consider using the wiki feature of Github for such semi-permanent information about the dictionaries. e.g., in the Cologne repository wiki
Perhaps we should consider using the wiki feature of Github for such semi-permanent information about the dictionaries
Was thinking about a wider use of the wiki, exactly. Even I do not know which .txt one should read to understand how things are built. So much time is spent on documentation, and it still remains for 5-10 people at best.
Two varieties of 'dot' in Devanagari
In the printed text, the Yates dictionary uses a 'dot' within Devanagari to indicate compound composition.
In the Cologne digitization, this 'dot' is coded in either as