Closed funderburkjim closed 4 months ago
The MD approach is the simplest (but not simple) to implement for PWKVN supplements to PWK.
PW is a multi-volume work, and the supplements for a given volume are inserted at the end of the main entries of each volume (with the volume 7 supplements also including mention of supplements in volumes I-VI).
For volume 1 pwkVN, there are about 1800 entries. The last entry of volume 1 pw is L=23117 Ozmya, and the first entry of volume 2 pw is 23118 ka. And the last entry of volume 7 PW is L=135787, hvAla.
Recall that one of the cdsl requirements for L-numbers is permanence. We have two options for where to append the entries from pwkvn vol 1
And continue similarly for inserting the other pwkvn.
The technically simplest option would be to insert all PWKVN entries after the end of volume 7 of pw, at L=200001 to L=222000 (since there are about 22000 pwkvn entries). This numbering should also make it easier to generate alternate headword entries from pwkvn. This is my choice.
If there are objections to this choice, please mention promptly.
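A minimal sketch of the numbering proposed above, assuming the pwkvn headwords are already available as an ordered list. The function name `assign_l_numbers` and the sample headwords are hypothetical and used only for illustration; the constants come from the figures quoted in this issue.

```python
# Hypothetical sketch: give pwkvn entries consecutive L-numbers in a block
# that starts after the last pw entry, so existing pw L-numbers never change.

PW_LAST_L = 135787       # last entry of pw volume 7 (hvAla), as stated above
PWKVN_FIRST_L = 200001   # proposed first L-number for the appended pwkvn block


def assign_l_numbers(headwords, first_l=PWKVN_FIRST_L):
    """Pair each headword with a consecutive L-number, in input order."""
    if first_l <= PW_LAST_L:
        raise ValueError("appended block must start after the last pw entry")
    return [(first_l + i, hw) for i, hw in enumerate(headwords)]


# Placeholder headwords; the real list would come from the pwkvn digitization.
sample = ["aMSarUpiRI", "aMSu", "agni"]
numbered = assign_l_numbers(sample)
for l_num, hw in numbered:
    print(f"<L>{l_num}<k1>{hw}")
```

With about 22000 pwkvn entries, the same scheme would end near L=222000, which matches the range given above.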
None. If L=23117.1800 can get us in trouble, let's leave it. The L=222000 approach loses some hints, but if it is quicker, let it come.
No objections
I am glad that my work on pwk and pwkvn is finally getting combined. [This might bring me back into the 'mood' to resume my work at CDSL.]
I also wish that the pwk 'header' portions (before the broken bar) for grouped entries be 'populated' into comma-separated lists in k2-field, and then expanded in xml file as multiple HWs, as done in GRA and pwkvn. [The unexpanded entry count totals just below 7k.]
@funderburkjim would you like to take this up, and do you need any additional work from me?
Just for reference, to give a first understanding of where the 7k count comes from:
^<hom>[0-9]+\.</hom> [*]?{#.*?#}¦
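A rough sketch of how a pattern like the one above could be counted against the entry lines. The braces are escaped here for clarity, but the pattern is otherwise the same. The sample lines are invented for illustration, and `count_group_headers` is a hypothetical helper, not part of any existing cdsl script.

```python
import re

# The pattern quoted above, with the literal braces escaped.
GROUP_HEADER = re.compile(r"^<hom>[0-9]+\.</hom> [*]?\{#.*?#\}¦")


def count_group_headers(lines):
    """Count lines whose start matches the <hom>N.</hom> {#...#}¦ shape."""
    return sum(1 for line in lines if GROUP_HEADER.search(line))


# Invented sample lines (assumption: one entry line per list item).
sample_lines = [
    "<hom>1.</hom> {#a#}¦ text",
    "<hom>2.</hom> *{#b, c#}¦ text",
    "{#d#}¦ no hom prefix, so not counted",
    "<L>47454<pc>3-049-c<k1>tritva<k2>tritva<e>100",
]
header_count = count_group_headers(sample_lines)
print(header_count)
```

On a real run the list would be replaced by the lines of pw.txt (or of the pwkvn file), read with UTF-8 encoding so that the broken bar `¦` is matched correctly.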
A similar examination of pwkvn shows about 1.5k.
Inference: the handling of alternate headwords can be done AFTER the merge of pwkvn into pw.
<e> item in pw?
The metalines in pw have an <e>N.
Example: <L>47454<pc>3-049-c<k1>tritva<k2>tritva<e>100
I am fairly certain these come from the original provided by @maltenth, and that N is a code for the 'entry type' (e.g. 100 = Substantive).
This information is not explicit in the printed text.
AFAIK, there is no use of this information.
This classification is implicit in the <lex> and √ markup of pw.
Propose: drop this from pw.txt. A file with a table of (L, N) pairs, e.g. (L=47454, N=100), should probably be saved before dropping.
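The proposal could be sketched as follows, assuming each metaline starts with `<L>` and that the `<e>` value runs up to the next `<` or the end of the line. The helper name `split_e` is hypothetical; the example metaline is the one quoted above.

```python
import re

META_L = re.compile(r"^<L>([^<]+)")   # L-number at the start of a metaline
META_E = re.compile(r"<e>([^<]*)")    # <e>N item, wherever it occurs


def split_e(metaline):
    """Return (L, N, metaline without <e>N); N is None if <e> is absent."""
    l_match = META_L.match(metaline)
    if l_match is None:
        raise ValueError(f"not a metaline: {metaline!r}")
    e_match = META_E.search(metaline)
    if e_match is None:
        return l_match.group(1), None, metaline
    cleaned = metaline[:e_match.start()] + metaline[e_match.end():]
    return l_match.group(1), e_match.group(1), cleaned


example = "<L>47454<pc>3-049-c<k1>tritva<k2>tritva<e>100"
l_num, n_code, cleaned = split_e(example)
print(f"{l_num}\t{n_code}")   # one row of the saved (L, N) table
print(cleaned)                # metaline as it would appear after dropping <e>
```

Writing one tab-separated row per metaline before rewriting pw.txt would keep the N codes recoverable if a use for them turns up later.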
The appending is done. cf. the above commits.
Plan for next steps to be discussed in separate issues
<e>N
Example display for a 'new' PW headword:
Another sample display:
This issue is devoted to an attempt to integrate pwkvn into pw in cdsl.
This integration has been advocated in other issues, such as [this comment](https://github.com/sanskrit-lexicon/PWK/issues/86#issuecomment-1113909425) from @Andhrabharati.
There are at least 3 plausible supplement integration approaches:
<info n="sup"/> (additional headword) (example: aMSarUpiRI)
<info n="rev"/> (revision or addition to main entries) (examples: aMSu, L=52a ray, sunbeam; agni; yajYArTam)
You can use the cdsl displays to see how the examples appear.