Open funderburkjim opened 8 years ago
New horizons, too wide, too bright. Let's get back to correction of headwords :walking:
bibnew_disp2_edit.txt is intended to be a file that we edit to 'fill in the blanks' of the 'new' literary source references of PW.
Currently there are 287 of these, identified by the string 'title=' in the file.
Only one of these is determined at the moment (at line 47 of the file).
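A quick way to track progress on the blanks might look like the sketch below. It assumes that each entry puts 'title=' on one line and that anything after the marker on that line is the filled-in title; that layout is a guess, not a description of the actual file. The function name `title_progress` and the sample lines are invented for illustration.

```python
def title_progress(lines):
    """Return (total, filled) for the lines that contain the marker 'title='.

    Assumption: the title, if any, follows the marker on the same line.
    """
    total = filled = 0
    for line in lines:
        _, marker, rest = line.partition("title=")
        if not marker:
            continue  # no 'title=' on this line
        total += 1
        if rest.strip():
            filled += 1  # something has been typed after the marker
    return total, filled


# Hypothetical sample lines; the real layout of the file may differ.
sample = [
    "ABBR1 title=",
    "ABBR2 title=Some Title",
    "a line without the marker",
]
print(title_progress(sample))
```

To run it on the real file, pass in the lines of bibnew_disp2_edit.txt opened with `open(path, encoding="utf-8")`.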
A thesis by Jachertz (in pdf and digitized forms). Note - I had problems viewing the pdf via the browser. It displays properly with Adobe Reader.
Beginning at line 405 of the digitization, there is a list of works, probably collated from (pw = PWK), (PW=PWG), and maybe MW (?).
Very brief usage suggests that some of our missing cases are likely mentioned here.
For instance, at line 563 of the digitization there is '<p><b>Ar4g.</b> s. Arjunasama1gama', which provides a resolution for 'ARG4' at line 92 of bibnew_disp2_edit.txt.
This example also illustrates some of the problems of doing this collation: the Jachertz digitization misspells the abbreviation as 'Ar4g' instead of the pdf's 'Arg4'.
@funderburkjim So there are 287 cases to be checked in Jachertz?
@gasyoun The 287 cases are the unresolved literary source abbreviations in PWK.
One of the sources we might use to resolve these cases is Jachertz.
It might be that the references in MW will also help us to resolve the PWK unknowns. That is why the MW references are interspersed in bibnew_disp2_edit.txt. For instance, the two unresolved cases 'ANUPADA' and 'ANUPADAS' are likely to be 'anupada-sūtra', which appears as an MW literary source.
Thanks. Just today I had a fight. One man told me that there are no unknown abbreviations in PW(G-K). I had to laugh.
The man can easily prove us wrong by filling in the blanks in bibnew_disp2_edit.txt :)
This issue documents some work which may be of use in connection with #56. The work is done in the pwbib_new_work directory of this repository.
The redo.sh script computes the various results.
mergebibnew.txt: This file merges pwbib1 (the PW bibliography references) and pwbib_new (the references which are needed to resolve literary source references appearing in the digitization pw.txt of the dictionary).
The records are sorted by abbreviation (ignoring Anglicized-Sanskrit numbers and capitalization). If some of the unknown abbreviations of pwbib_new are spelling variants of references already in the bibliography, this ordering may make that apparent. Also, the handful of duplicates in the bibliography (which occur in different volumes of the text) will be identified.
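The ordering can be illustrated with a small sketch. It is not the code used by redo.sh; it assumes that "ignoring Anglicized-Sanskrit numbers" simply means dropping digits before comparing, and the helper names `abbrev_key` and `group_by_key` are made up for the example. The abbreviations are the ones mentioned in this thread.

```python
import re
from itertools import groupby


def abbrev_key(abbrev):
    """Sort key: drop digits (assumed to be the AS numbers) and fold case."""
    return re.sub(r"\d", "", abbrev).lower()


def group_by_key(abbrevs):
    """Sort by the normalized key, then collect abbreviations sharing a key."""
    ordered = sorted(abbrevs, key=abbrev_key)
    return {key: list(group) for key, group in groupby(ordered, key=abbrev_key)}


groups = group_by_key(["ARG4", "ANUPADAS", "Ar4g", "ANUPADA", "Arg4"])
for key, members in groups.items():
    print(key, members)
```

With this key, 'ARG4', 'Ar4g' and 'Arg4' all reduce to 'arg' and so land next to each other, which is how a misplaced digit like the one in the Jachertz digitization would show up. 'ANUPADA' and 'ANUPADAS' keep distinct keys but still sort as neighbours.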
There is a lot of data to examine here. Again, the main focus is to provide clues to what the 'title' should be for the 'new' PW bibliographic entries that have been uncovered by our matching regimen. Preliminary examination of bibnew_disp2 suggests that some fraction of the unknown titles may be readily inferred.
Also, some of the unknown abbreviations may be suggestive to those of us familiar with the Sanskrit corpus.