sanskrit-lexicon / COLOGNE

Development of http://www.sanskrit-lexicon.uni-koeln.de/

sch.xml issues #122

Open drdhaval2785 opened 7 years ago

drdhaval2785 commented 7 years ago
<H1><h><key1>U</key1><key2>U1</key2></h><body>[Schµ9027] [Page122a.1] €1</body><tail><L>8908</L><pc>122-3</pc></tail></H1>

There is no content here?

drdhaval2785 commented 7 years ago

There are 7 such cases.

gasyoun commented 7 years ago

> There are 7 such cases.

I would delete such markup, @funderburkjim

funderburkjim commented 7 years ago

These seem to be letter headings, miscoded as headwords, for the letters U, E, O, C, J, B, S, z (SLP1).

Also 'w', whose body contains only the other cerebral consonants.

In some dictionaries, the digitization has a special notation (such as <H>U) for letter breaks.
But since this is not done in Schmidt, I agree it is best to just remove these pseudo headwords.
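
As a rough illustration of how such pseudo headwords could be listed for review before removal, here is a minimal sketch. It assumes one record per line in sch.xml, laid out like the example at the top of this issue. The function name `find_candidates`, the `LETTER_HEADINGS` set and the second sample record are hypothetical, made up for this sketch. A single-letter key1 could also be a genuine headword, so the output is only a list of candidates to check by hand.

```python
import re

# Letter headings named in this thread (SLP1). Assumption: this set is complete.
LETTER_HEADINGS = {"U", "E", "O", "C", "J", "B", "S", "z", "w"}

# Pull key1, body and the L number out of one record line.
RECORD = re.compile(
    r"<key1>(?P<key1>.*?)</key1>.*?"
    r"<body>(?P<body>.*?)</body>.*?"
    r"<L>(?P<L>.*?)</L>"
)


def find_candidates(lines, letters=LETTER_HEADINGS):
    """Return (L, key1, body) for records whose key1 is a bare letter heading."""
    hits = []
    for line in lines:
        m = RECORD.search(line)
        if m and m.group("key1") in letters:
            hits.append((m.group("L"), m.group("key1"), m.group("body")))
    return hits


sample = [
    # Record quoted at the top of this issue.
    "<H1><h><key1>U</key1><key2>U1</key2></h><body>[Schµ9027] [Page122a.1] €1"
    "</body><tail><L>8908</L><pc>122-3</pc></tail></H1>",
    # Hypothetical ordinary record, invented for contrast.
    "<H1><h><key1>Urmi</key1><key2>Urmi</key2></h><body>hypothetical entry text"
    "</body><tail><L>8909</L><pc>122-3</pc></tail></H1>",
]

for hit in find_candidates(sample):
    print(hit)
```

Matching on key1 alone is deliberately loose: the 'w' record has a non-empty body, so a check for an empty body would miss it.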

gasyoun commented 7 years ago

> But since this is not done in Schmidt, I agree it is best to just remove these pseudo headwords.

Or add a new tag? Do I understand correctly that you mean these places?

[image: ooa]

funderburkjim commented 7 years ago

Right -- those are the 'pseudo' headwords I was mentioning.

It doesn't seem worthwhile introducing a new tag for the letter breaks.

gasyoun commented 7 years ago

> It doesn't seem worthwhile introducing a new tag for the letter breaks.

But they do occur in other dictionaries as well. We sometimes have a tag for even lesser cases.