sanskrit-lexicon / PWG

Boehtlingk und Roth Sanskrit Wörterbuch, 7 Bände Petersburg 1855-1875

single-letter italic #56

Closed funderburkjim closed 2 years ago

funderburkjim commented 2 years ago

There are several hundred instances in pwg.txt of single letters marked with italics. For instance, under headword 'antarA':

{#antarA/#}¦ ({#antar + A#}; vgl. u. {#antar#} 2, {%a%}, am Ende)

This is interpreted as "see under 'antar', at the end of section 2a". And indeed, when we look at the end of section 2a of the entry for antar, we see a usage of 'antarA':

<ls>ṚV. 9, 67, 23.</ls> {#trI za pa\vitrA^ hf\dya1^\ntarA da^De#}

Suggestion: At least for such cross references, we should remove the italic markup around the single letter. e.g.

OLD
{#antarA/#}¦ ({#antar + A#}; vgl. u. {#antar#} 2, {%a%}, am Ende) 
NEW
{#antarA/#}¦ ({#antar + A#}; vgl. u. {#antar#} 2, a, am Ende) 

Not all single-letter italics are like this. For example, I think the italic markup around 'r' should be retained in

so heisst der <is>saṃdhi</is>, wenn der {#rePin#} 
vor Vocalen und weichen Consonanten {%r%} wird,

Probably it is safe to remove the italic markup when the letter follows a number, such as

266 matches in 233 lines for "[0-9], {%.%}" in buffer: temp_pwg_2.txt
  EXAMPLE:  the antarA instance above
92 matches for "[0-9]) *{%.%}" in buffer: temp_pwg_2.txt
  EXAMPLE: `(u. {#Gawika#} 2) {%a%}),`
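
A minimal Python sketch of the removal described above, assuming the two Emacs patterns translate directly into Python regular expressions. The function name and the capture groups are illustrative only and are not taken from any existing script; the groups are added so that the letter is kept while the {%...%} wrapper is dropped.

```python
import re

# Python equivalents of the two Emacs search patterns quoted above.
# Group 1 keeps the number and separator, group 2 keeps the single character.
AFTER_COMMA = re.compile(r"([0-9], )\{%(.)%\}")
AFTER_PAREN = re.compile(r"([0-9]\) *)\{%(.)%\}")


def strip_single_letter_italics(line):
    """Return (new_line, number_of_changes) for one line of text."""
    line, n_comma = AFTER_COMMA.subn(r"\1\2", line)
    line, n_paren = AFTER_PAREN.subn(r"\1\2", line)
    return line, n_comma + n_paren


if __name__ == "__main__":
    old = "{#antarA/#}¦ ({#antar + A#}; vgl. u. {#antar#} 2, {%a%}, am Ende)"
    new, changes = strip_single_letter_italics(old)
    print(new)
    print(changes)
```

Because both patterns require a preceding digit, a line such as "vor Vocalen und weichen Consonanten {%r%} wird," is left unchanged.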

@Andhrabharati What do you think of this small proposed change in pwg?

Andhrabharati commented 2 years ago

Please retain those single-letter italics, since that is what the print has.

funderburkjim commented 2 years ago

OK. Will keep those for now.

Andhrabharati commented 2 years ago

@funderburkjim Just had a look again at the PWG data.

The italic letters refer to the meaning-number hierarchy, marked as <div n="n"> inside pwg.txt: 1 for Indo-Arabic, 2 for Roman and 3 for Greek.

I had made them all (2 and 3) plain italic letters, the '2' count being over 23000! So, it is not 266, but almost 100 times more.

Probably the simplified marking as done in my version could be implemented in the cdsl text too.

Also, the Greek tags could be removed in such reference places (count: 18):

<lang n="greek">α)</lang>
<lang n="greek">(α)</lang>
<lang n="greek">(α</lang>)
<lang n="greek">β)</lang> : 3
<lang n="greek">(β</lang>)
<lang n="greek">γ)</lang> : 3
<lang n="greek">δ)</lang>
<lang n="greek">(δ</lang>
<lang n="greek">ε)</lang>
<lang n="greek">(ε)</lang> : 2
<lang n="greek">(ζ)</lang> : 2
<lang n="greek">η)</lang>
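
A rough Python sketch of how the tag removal could look, assuming the forms listed above (a single lowercase Greek letter with an optional parenthesis on either side) are the only ones affected. The names GREEK_REF and untag_greek_refs are hypothetical. The pattern does not by itself tell a reference place apart from any other place where the same tagged form occurs, so the candidate lines would still have to be selected by hand.

```python
import re

# Matches <lang n="greek">...</lang> only when the content is a single
# lowercase Greek letter with optional "(" before and ")" after it.
GREEK_REF = re.compile(r'<lang n="greek">(\(?[α-ω]\)?)</lang>')


def untag_greek_refs(text):
    """Drop the <lang> wrapper and keep the letter and its parentheses."""
    return GREEK_REF.sub(r"\1", text)


if __name__ == "__main__":
    print(untag_greek_refs('<lang n="greek">(α</lang>)'))
    print(untag_greek_refs('<lang n="greek">β)</lang>'))
```

Longer tagged Greek text, such as a whole word, does not match the pattern and keeps its tags.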