sanskrit-lexicon / PWK

Sanskrit-Wörterbuch in kürzerer Fassung, 7 Bände Petersburg 1879-1889

appending pwkvn to pwk #104

Closed funderburkjim closed 4 months ago

funderburkjim commented 4 months ago

This issue is devoted to an attempt to integrate pwkvn into pw in cdsl.

This integration has been advocated in other issues, such as [this comment](https://github.com/sanskrit-lexicon/PWK/issues/86#issuecomment-1113909425) from @Andhrabharati.

Guess it should go to the pw.txt appended at the end, with continuing L-numbers, if not at the end of each volume (resp. portions) as in pwg.txt.

There are at least 3 plausible supplement integration approaches:

You can use the cdsl displays to see how the examples appear.

funderburkjim commented 4 months ago

the append option

The MD approach is the simplest (but not simple) to implement for PWKVN supplements to PWK.

PW is a multi-volume work, and the supplements for a given volume are inserted at the end of the main entries of each volume (with the volume 7 supplements also including mention of supplements in volumes I-VI).

For volume 1 of pwkvn, there are about 1800 entries. The last entry of volume 1 of pw is L=23117 (Ozmya), and the first entry of volume 2 of pw is L=23118 (ka). The last entry of volume 7 of pw is L=135787 (hvAla).

Recall that one of the cdsl requirements for L-numbers is permanence. We have two options for where to append the entries from pwkvn vol 1

And continue similarly for inserting the other pwkvn.

The technically simplest option would be to insert all pwkvn entries after the end of volume 7 of pw, at L=200001 to L=222000 (since there are about 22000 pwkvn entries). This numbering should also make it easier to generate alternate headword entries from pwkvn. This is my choice.
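To make the numbering concrete, here is a minimal Python sketch of the append-at-the-end option. It assumes that each entry starts with a metaline of the form `<L>...<pc>...`, as in the pw metaline quoted elsewhere in this thread; the function name `renumber_metalines` and the sample input are hypothetical, and this is not the script actually used for the merge.

```python
import re

# Assumption: a metaline starts with <L>, followed by the L-number and a <pc> field.
META_RE = re.compile(r"^<L>([^<]+)(<pc>.*)$")

def renumber_metalines(lines, start=200001):
    """Assign consecutive L-numbers, beginning at `start`, to every metaline.

    Lines that are not metalines are copied unchanged.
    Returns (new_lines, last_L_used); last_L_used is start - 1 if no
    metaline was found.
    """
    next_l = start
    out = []
    for line in lines:
        m = META_RE.match(line)
        if m:
            # Keep everything from <pc> onward; replace only the L-number.
            out.append(f"<L>{next_l}{m.group(2)}")
            next_l += 1
        else:
            out.append(line)
    return out, next_l - 1

# Hypothetical sample: two supplement entries with placeholder body lines.
sample = [
    "<L>1<pc>1-001-a<k1>a<k2>a",
    "body of first entry",
    "<L>2<pc>1-001-a<k1>aMSa<k2>aMSa",
    "body of second entry",
]
renumbered, last_l = renumber_metalines(sample)
```

Because the existing pw L-numbers (up to 135787) are never touched, the permanence requirement is respected, and the gap between 135787 and 200001 keeps the supplement range easy to recognize.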

If there are objections to this choice, please mention promptly.

gasyoun commented 4 months ago

If there are objections to this choice, please mention promptly.

None. If L=23117.1800 can get us in trouble, let's leave it. The L=222000 approach loses some hints, but if it is quicker, let it come.

drdhaval2785 commented 4 months ago

No objections

Andhrabharati commented 4 months ago

I am glad that my work on pwk and pwkvn is finally getting combined. [This might bring me back into the 'mood' to resume my work at CDSL again.]

I also wish that the pwk 'header' portions (before the broken bar) for grouped entries be 'populated' into comma-separated lists in k2-field, and then expanded in xml file as multiple HWs, as done in GRA and pwkvn. [The unexpanded entry count totals just below 7k.]

@funderburkjim would you like to take this up, and do you need any additional work from me?

funderburkjim commented 4 months ago

comment on the 7k alt hws

Just for reference, to give a first understanding of where the 7k count comes from

A similar examination of pwkvn shows about 1.5k.

Inference: the handling of alternate headwords can be done AFTER the merge of pwkvn into pw.

drop the <e> item in pw?

The metalines in pw have an <e>N.
Example: <L>47454<pc>3-049-c<k1>tritva<k2>tritva<e>100

I am fairly certain these come from the original provided by @maltenth, and that N is a code for the 'entry type' (e.g. 100 = Substantive).

This information is not explicit in the printed text. AFAIK, there is no use of this information. This classification is implicit in the <lex> and markup of pw.

Proposal: drop this from pw.txt. A file containing a table of (L, N) pairs, e.g. (L=47454, N=100), should probably be saved before dropping.
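As an illustration of this proposal, the following Python sketch splits one metaline into its L-number, its `<e>` code, and the metaline with the `<e>` field removed. The function name `split_entry_type` is hypothetical; the only format assumed is the one visible in the tritva example above, and how the resulting (L, N) pairs are stored is left open.

```python
import re

# Assumption: metalines look like <L>...<pc>...<k1>...<k2>...<e>N,
# as in the tritva example quoted in this comment.
L_RE = re.compile(r"^<L>([^<]+)<pc>")
E_RE = re.compile(r"<e>([^<]*)")

def split_entry_type(metaline):
    """Return (L, N, metaline_without_e) for a single metaline.

    N is None, and the metaline is returned unchanged, when there is
    no <e> field.
    """
    l_match = L_RE.match(metaline)
    if l_match is None:
        raise ValueError(f"not a metaline: {metaline!r}")
    e_match = E_RE.search(metaline)
    if e_match is None:
        return l_match.group(1), None, metaline
    # Cut out exactly the <e>N span and keep everything around it.
    stripped = metaline[:e_match.start()] + metaline[e_match.end():]
    return l_match.group(1), e_match.group(1), stripped

l_num, e_code, cleaned = split_entry_type(
    "<L>47454<pc>3-049-c<k1>tritva<k2>tritva<e>100"
)
```

Collecting the `(l_num, e_code)` pairs into a side file before writing the cleaned metalines back would preserve the entry-type codes even after they are removed from pw.txt.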

funderburkjim commented 4 months ago

The appending is done. cf. the above commits.

The plan for next steps will be discussed in separate issues.

funderburkjim commented 4 months ago

Example display for a 'new' PW headword:

image

funderburkjim commented 4 months ago

Another sample display:

image