sanskrit-lexicon / PWK

Sanskrit-Wörterbuch in kürzerer Fassung, 7 Bände Petersburg 1879-1889

PW bib new work #61

Open funderburkjim opened 8 years ago

funderburkjim commented 8 years ago

This issue documents some work which may be of use in connection with #56. The work is done in the pwbib_new_work directory of this repository.

The redo.sh script computes the various results.

There is a lot of data to examine here. Again, the main focus is to provide clues as to what the 'title' should be for the 'new' PW bibliographic entries that our matching regimen has uncovered. A preliminary examination of bibnew_disp2 suggests that some fraction of the unknown titles can be readily inferred.

Also, some of the unknown abbreviations may be suggestive to those of us familiar with the Sanskrit corpus.

gasyoun commented 8 years ago

New horizons, too wide, too bright. Let's get back to correction of headwords :walking:

funderburkjim commented 8 years ago

bibnew_disp2_edit.txt is intended to be a file that we edit to 'fill in the blanks' of the 'new' literary source references of PW.

Currently there are 287 of these, identified by the string title= in the file.

Only one of these is determined at the moment (at line 47 of the file).
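
A minimal sketch of how that count could be reproduced, assuming only what is stated above: that each 'new' entry is marked by the string title= somewhere on a line. The function name and the sample lines are hypothetical; the real layout of bibnew_disp2_edit.txt is not shown in this thread.

```python
def count_title_markers(lines, marker="title="):
    """Count the lines that contain the marker string."""
    return sum(1 for line in lines if marker in line)


# Hypothetical sample lines, invented only to exercise the function.
sample = [
    "ARG4 title=",
    "an interspersed MW reference line",
    "ANUPADA title=",
]

print(count_title_markers(sample))
```

To run it on the real file, the same function could be given an open file handle, for example open("bibnew_disp2_edit.txt", encoding="utf-8"); the encoding is an assumption.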

funderburkjim commented 8 years ago

A thesis by Jachertz (in pdf and digitized forms). Note - I had problems viewing the pdf via the browser. It displays properly with Adobe Reader.

Beginning at line 405 of the digitization, there is a list of works, probably collated from pw (= PWK), PW (= PWG), and maybe MW (?).

A very brief look suggests that some of our missing cases are likely mentioned here.

For instance, line 563 of the digitization reads <p><b>Ar4g.</b> s. Arjunasama1gama, which provides a resolution for 'ARG4' at line 92 of bibnew_disp2_edit.txt.

This example also illustrates some of the problems of doing this collation, for the Jachertz digitization misspells the abbreviation as 'Ar4g' instead of the pdf's 'Arg4'.
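
One way to make the collation tolerant of this kind of slip is to compare abbreviations by a loose key rather than by exact spelling. The sketch below is only an illustration, not the method used in redo.sh: the function abbrev_key is hypothetical, and it assumes that case, a trailing period, and the position of a digit within the abbreviation can all be ignored when looking for candidates.

```python
def abbrev_key(abbrev):
    """Return a loose comparison key: (letters in order, sorted digits).

    Case and a trailing period are ignored, and digits are pulled out of
    position, so a misplaced digit does not prevent a match.
    """
    text = abbrev.strip().rstrip(".").lower()
    letters = "".join(ch for ch in text if ch.isalpha())
    digits = "".join(sorted(ch for ch in text if ch.isdigit()))
    return letters, digits


# The three spellings mentioned above all reduce to the same key.
print(abbrev_key("Ar4g.") == abbrev_key("Arg4") == abbrev_key("ARG4"))
```

A key this loose can also pair up abbreviations that are genuinely different, so any match it produces is a candidate for manual checking against the pdf, not a resolution.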

gasyoun commented 8 years ago

This example also illustrates some of the problems of doing this collation, for the Jachertz digitization misspells the abbreviation as 'Ar4g' instead of the pdf's 'Arg4'.

@funderburkjim So there are 287 cases to be checked in Jachertz?

funderburkjim commented 8 years ago

@gasyoun The 287 cases are the unresolved literary source abbreviations in PWK.

One of the sources we might use to resolve these cases is Jachertz.

It might be that the references in MW will also help us to resolve the PWK unknowns. That is why the MW references are interspersed in bibnew_disp2_edit.txt. For instance, the two unresolved cases 'ANUPADA' and 'ANUPADAS' are likely to be 'anupada-sūtra', which appears as an MW literary source.
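
The 'ANUPADA' guess amounts to a prefix match once case, hyphens, and diacritics are ignored. A rough sketch of that idea follows; the helpers fold and candidate_titles are hypothetical, the title list is a stand-in rather than the real MW source list, and the folding rule is an assumption that will not suit abbreviations whose digits encode diacritics.

```python
import unicodedata


def fold(text):
    """Lowercase the text and drop diacritics, hyphens and spaces."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = (ch for ch in decomposed
            if ch.isalnum() and not unicodedata.combining(ch))
    return "".join(kept).lower()


def candidate_titles(abbrev, titles):
    """Return the titles whose folded form starts with the folded abbreviation."""
    key = fold(abbrev)
    return [t for t in titles if key and fold(t).startswith(key)]


# Only 'anupada-sūtra' comes from this thread; the second entry is a placeholder.
mw_titles = ["anupada-sūtra", "hypothetical-title"]

for abbrev in ("ANUPADA", "ANUPADAS"):
    print(abbrev, candidate_titles(abbrev, mw_titles))
```

Short abbreviations will match many titles under a rule like this, so the output is best read as a list of suggestions to confirm by hand.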

gasyoun commented 8 years ago

Thanks. Just today I had an argument: one man told me that there are no unknown abbreviations in PW(G-K). I had to laugh.

funderburkjim commented 8 years ago

The man can easily prove us wrong by filling in the blanks in bibnew_disp2_edit.txt :)