sanskrit-lexicon / PWG

Boehtlingk und Roth Sanskrit Wörterbuch, 7 Bände Petersburg 1855-1875

Links to MBH Calcutta edition #48

Closed funderburkjim closed 2 years ago

funderburkjim commented 2 years ago

The aim here is to prepare links to pdf pages for the MBH literary source references in pwg dictionary.

We hope to make use of material developed by @Andhrabharati as described here.
The material includes

From these, for a given reference in pwg, we should be able to provide a link to a pdf of the MBH page.

funderburkjim commented 2 years ago

reality check.

The first MBH reference in pwg is <ls>MBH. 1, 2523.</ls> under headword aMSa. The relevant line of index file is 1 92 I 2513 2541 29 which says in volume 1, page 92 contains parvan 'I' (= 1) verses from 2513 to 2541 (29 verses).

And here is a snip of the relevant page: image

And indeed the word aMSa occurs where expected. Hurray!

Second example.

pwg reference <ls>MBH. 18, 157.</ls> under headword agADa (अगाध).

The index shows 4 437 XVIII 138 166 29, indicating that verse 157 of 18th parvan occurs on page 437 of volume 4. And indeed we do find अगाध where expected: image
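A minimal sketch of the lookup these two checks perform by hand, using only the two index rows quoted above. The function names are made up for illustration, and the column order (volume, page, parvan, first verse, last verse, count) is read off the examples rather than taken from any documented format.

```python
from typing import List, Optional, Tuple

# Two index rows quoted in the checks above:
# volume, page, parvan (roman), first verse, last verse, verse count.
INDEX_LINES = [
    "1 92 I 2513 2541 29",
    "4 437 XVIII 138 166 29",
]

ROMAN = {"I": 1, "V": 5, "X": 10}


def roman_to_int(numeral: str) -> int:
    """Convert a roman numeral built from I, V and X (enough for parvans 1-19)."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN[ch]
        # A smaller value before a larger one is subtracted (IV = 4, IX = 9).
        if i + 1 < len(numeral) and value < ROMAN[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


def parse_index(lines: List[str]) -> List[Tuple[int, int, int, int, int]]:
    """Turn index lines into (parvan, first, last, volume, page) tuples."""
    rows = []
    for line in lines:
        vol, page, parvan, first, last, _count = line.split()
        rows.append((roman_to_int(parvan), int(first), int(last), int(vol), int(page)))
    return rows


def find_page(rows, parvan: int, verse: int) -> Optional[Tuple[int, int]]:
    """Return (volume, page) for the first row whose verse range contains the verse."""
    for p, first, last, vol, page in rows:
        if p == parvan and first <= verse <= last:
            return (vol, page)
    return None


rows = parse_index(INDEX_LINES)
print(find_page(rows, 1, 2523))   # (1, 92)
print(find_page(rows, 18, 157))   # (4, 437)
```

A plain range test like this would need extra care on pages where the printed verse numbering jumps, since the first and last verse on such a page span numbers that were never printed.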

ok to proceed

We are ok to go in preparing the pdf pages and appropriate hosting page for the links. From 28747 matches in 28741 lines for "<ls>MBH. [0-9]+, [0-9]+.</ls>" in buffer: pwg.txt, all of these should resolve into links when the necessary infrastructure is developed.
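For anyone wanting to reproduce a count like this outside an editor, here is a rough Python sketch. The sample text is invented from references mentioned in this thread; it is not an excerpt of pwg.txt. Note that the dots in the quoted pattern are unescaped, so they match any character; the second pattern is a stricter variant with literal dots.

```python
import re

# The pattern as quoted above; its unescaped dots match any character.
QUOTED = re.compile(r"<ls>MBH. [0-9]+, [0-9]+.</ls>")
# Stricter variant with literal dots.
STRICT = re.compile(r"<ls>MBH\. [0-9]+, [0-9]+\.</ls>")


def count_matches(pattern, text):
    """Return (number of matches, number of lines with at least one match)."""
    matches = 0
    lines = 0
    for line in text.splitlines():
        n = len(pattern.findall(line))
        matches += n
        if n:
            lines += 1
    return matches, lines


# Invented sample, not real dictionary text.
SAMPLE = "\n".join([
    "<ls>MBH. 1, 2523.</ls> and <ls>MBH. 18, 157.</ls>",
    "<ls>MBH. 3, 3098.</ls>",
    "no reference on this line",
])

print(count_matches(QUOTED, SAMPLE))  # (3, 2)
print(count_matches(STRICT, SAMPLE))  # (3, 2)
```

The two numbers differ whenever a line carries more than one reference, which would explain 28747 matches on 28741 lines.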

gasyoun commented 2 years ago

all of these should resolve into links when the necessary infrastructure is developed

Hurray! Hope Harivamsa follows and Ramayana is waiting in the line. That's amazing to see it come alive.

funderburkjim commented 2 years ago

@Andhrabharati
You mention some mistakes in the mbh_calc scans (Reference)

The other two noted mistakes to be discussed in separate comments below.

Andhrabharati commented 2 years ago
* In vol.3, p.727 is badly cut at the left side.

  * confirmed.  Do you have a replacement from some other source?

you may take this- https://archive.org/details/in.ernet.dli.2015.326584/page/n743/mode/1up

And you would get the missing pp. 498-9 as well from this archive book.

Andhrabharati commented 2 years ago

Hurray! Hope Harivamsa follows and Ramayana is waiting in the line. That's amazing to see it come alive.

@gasyoun

Now that a format is (kind of) known, you may put your 'new' Russian 'team' on Harivamsa and Ramayana indexing. [Let others also learn doing small works like these; of course, I had already done the Harivamsa after finishing the MBh indexing last month itself.]

Andhrabharati commented 2 years ago

From 28747 matches in 28741 lines for "<ls>MBH. [0-9]+, [0-9]+.</ls>" in buffer: pwg.txt,

I have 1659 occurrences of <ls>MBH\.</ls> <[0-9]+,[0-9]+> in the VN pages and 28825 in the Main pages, in my 'worked' file.

And there are ~7800 series of MBH. x1,y1. x2,y2. occurrences in the main text and ~100 in the VN pages that need to be separately linked.

funderburkjim commented 2 years ago

Vana Parva verses on p. 517

The above reference notes: In vol.1, p.517 v.3100 onward is numbered as v.4000 onward. The relevant portion of the index:

1   515 III 3030    3059    30
1   516 III 3060    3087    28
1   517 III 3088    3114    27   <<<<  3088 - 3099, 4000 - 4014
1   518 III 4015    4042    28

image

consistency with PWG

<s>upAkftya samAhftya</s> <ls>MBH. 3, 3098.</ls> under kar And consistent with page 517 3088 - 3099 (see previous image)

<s>mOrvIkftakiRO (BujO)</s> <ls>MBH. 3, 4008.</ls> under kiRa. And consistent with page 517, verses 4000 - 4014.

image

These examples suggest that the verse numbering in PWG (Vana Parva) is consistent with that in this mbh_calc_1.pdf. But see next for likely exception.

funderburkjim commented 2 years ago

Three variances to above

3 matches for "<ls>MBH. [3], 3[1-9][0-9]\{2\}.</ls>" in buffer: pwg.txt. These verses (in 3rd parvan) would be in the range 3100-3999. By the above, there should be none. So these are a puzzle. One turns out to be a typo. The other two are unresolved.

Under ftu, p. 1-1053, 11 lines from bottom <ls>MBH. 3, 3402.</ls> {#f\tuH svABAvikaH strIRAM rAtrayaH zoqaSa#}

Under garhaRIya, p. 2-0708, 8 lines from bottom <ls>MBH. 3, 3888.</ls> {#garhaRIyAnyaTA Bavet#}

Under x, p. 6-0401 <ls>MBH. 3, 3995.</ls> This is a typo: should be 9995.
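The same check can be sketched in Python in a slightly more general form: flag any reference whose verse number falls inside a range the printed edition skips. The gap table below holds only the one gap discussed so far (parvan 3, verses 3100-3999), and the sample string is stitched together from references quoted in this thread, so treat it as an illustration rather than a tested tool.

```python
import re

# A two-number MBH reference as it appears in pwg.txt, with literal dots.
REF = re.compile(r"<ls>MBH\. ([0-9]+), ([0-9]+)\.</ls>")

# Verse ranges that the printed numbering skips, keyed by parvan number.
# Only the gap discussed above is listed here.
GAPS = {3: [(3100, 3999)]}


def refs_in_gaps(text, gaps=GAPS):
    """Return (parvan, verse) pairs whose verse lies in a known numbering gap."""
    hits = []
    for m in REF.finditer(text):
        parvan, verse = int(m.group(1)), int(m.group(2))
        if any(lo <= verse <= hi for lo, hi in gaps.get(parvan, [])):
            hits.append((parvan, verse))
    return hits


# References quoted in this thread, joined into one sample string.
SAMPLE = (
    "<ls>MBH. 3, 3098.</ls> <ls>MBH. 3, 4008.</ls> "
    "<ls>MBH. 3, 3402.</ls> <ls>MBH. 3, 3888.</ls> <ls>MBH. 3, 3995.</ls>"
)

print(refs_in_gaps(SAMPLE))  # [(3, 3402), (3, 3888), (3, 3995)]
```

Every hit is a candidate print or typing error, since no page of the scan can carry those verse numbers.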

Andhrabharati commented 2 years ago

Just to note: the Excel file has one additional column, Remarks, with some details.

This column is not copied in the txt file.

[And I seem to have made a few 'unwanted' corrections in the 'To' verse numbers (trying to 'correct' them properly); all such can be reverted to the 'book numbers' by looking at the Remarks column, or might even be completely ignored (as they can be derived from the starting verse number of the next page).]

Andhrabharati commented 2 years ago

These verses (in 3rd parvan) would be in the range 3100-3999. By the above, there should be none. So these are a puzzle. One turns out to be a typo. The other two are unresolved.

Under ftu, p. 1-1053, 11 lines from bottom <ls>MBH. 3, 3402.</ls> {#f\tuH svABAvikaH strIRAM rAtrayaH zoqaSa#}

This is a printo for 1,3402. And the quoted text does not refer to MBH, but to M. 3,46.

Under garhaRIya, p. 2-0708, 8 lines from bottom <ls>MBH. 3, 3888.</ls> {#garhaRIyAnyaTA Bavet#}

This is a printo for 5,3888.

gasyoun commented 2 years ago

printo

print error?

I had already done the Harivamsa after finishing the MBh indexing last month itself.

So Ramayana left?

Andhrabharati commented 2 years ago

Just did a quick search, to see if there are any more in the range MBH. 3,3100-3999; and found that <L>44247 (pAtAla) has two such.

<ls>MBH.</ls> <3,3547>. fgg. <3552>

Andhrabharati commented 2 years ago

And here are the pages, where some 'jumping' of numbering is seen and marked in my excel file-

1.517   III,3088-4014     27   31xx > 40xx : 900
1.520   III,4073-5002     30   41xx > 50xx : 900
1.524   III,5092-6021     30   51xx > 60xx : 900
1.527   III,6082-7011     30   61xx > 70xx : 900
1.530   III,7071-8099     29   71xx > 80xx : 900
1.561   III,8870-9999     30   89xx > 99xx : 1000
1.569   III,10086-11114   29   101xx > 110xx : 900
1.572   III,11071-10294   24   111xx > 102xx : -900
2.36    IV,958-986        29   94x > 95x : 10
2.84    IV,2323-2360      28   233x > 234x : 10
2.124   V,990-1019        30   995 > 990 : -5
2.149   V,1578-1598       21   1579 > 1578 : -1
2.307   V,6083-7012       30   61xx > 70xx : 900
2.480   VI,4278-4312      30   4286 > 4290 : 5
2.657   VII,3493-3527     30   3505 > 3510 : 10
3.34    VIII,927-956      30   928 > 928 : -1
3.250   IX,2015-2049      29   2035 > 2040 : 5
3.374   XII,225-253       29   221 > 225 : 4

The data is slightly reformatted, for quick comprehension.

Andhrabharati commented 2 years ago

printo

print error?

I remember seeing this 'printo' used somewhere long back.

As it rhymed with the popular 'typo' (typographical error in typed documents), I've been using this 'printo' as a short form for 'print error' (for errors in printed books) since then.

funderburkjim commented 2 years ago

Regarding the other 'jumps', I also made a list of 'verse gaps', based on your pdf remarks. I haven't tried to identify pwg instances of the other verse gaps, except for the one at vol. 1, p. 517.

gaps in verses. From MBh.Calc.ed.index.Vols.1-4.xlsx

1   517 III 3088    3114    27
<<<<  3088 - 3099, 4000 - 4014  verse gap: 3100 - 3999
1   518 III 4015    4042    28

1   520 III 4073    4102    30
<<<< 4073 - 4099, 5000 - 5002   verse gap: 4100 - 4999
1   521 III 5003    5032    30

1   524 III 5092    5121    30
<<<< 5092 - 5099, 6000 - 6021   verse gap: 5100 - 5999
1   525 III 6022    6051    30

1   527 III 6082    6111    30
<<<< 6082 - 6099, 7000 - 7011   verse gap: 6100 - 6999
1   528 III 7012    7041    30

1   530 III 7071    7099    29
<<<< 7071 - 7099, ---           verse gap: 7100 - 7999
1   531 III 8000    8029    30

1   561 III 8870    8899    30
<<<< 8870 - 8899                verse gap: 8900 - 9899
1   562 III 9900    9927    28

2   35  IV  919 947 29
<<<< 919 - 947,                 verse gap: 948 - 957
2   36  IV  958 986 29

2   83  IV  2293    2322    30
Correction: 2322 -> 2321
2   84  IV  2323    2360    28
Correction: 2323 -> 2322
233x > 234x : 10
Verse gap: 2330 - 2339

2   148 V   1556    1578    23
2   149 V   1578    1598    21
1579 >1578 : -1
Page 148 ends with 1575 and 3 more verses, thus 1575-1578,
Page 149 begins with 2 verses then 1580, thus 1578-1580.
So there is confusion over verse 1578.

2   307 V   6083    7012    30
61xx > 70xx : 900
<<<< 6083 - 6099, 7000 - 7012   verse gap: 6100 - 6999

2   480 VI  4278    4312    30
4286 > 4290 : 5
<<<< 4278 - 4285, 4290 - 4312.  verse gap: 4286 - 4289

2   657 VII 3493    3527    30
(+ 3493 30) = 3523
3505 > 3510 :10
<<<< 3493 - 3504, 3510 - 3527  verse gap: 3505 - 3509

3   250 IX  2015    2049    29
2035 > 2040 : 5
<<<< 2015 - 2034, 2040 - 2049.   verse gap 2035-2039
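Several of the gaps above can be recomputed from the index itself: if a page starts at verse F and holds N verses, the next page should start at F + N, and any surplus is the size of the jump in the printed numbering. A small sketch under that assumption, using rows copied from the list above (the tuple layout and function names are made up here):

```python
# Each entry: (volume, page, parvan, first verse, verse count,
# first verse of the next page), taken from the index rows listed above.
PAGES = [
    (1, 517, "III", 3088, 27, 4015),
    (1, 520, "III", 4073, 30, 5003),
    (1, 524, "III", 5092, 30, 6022),
    (1, 527, "III", 6082, 30, 7012),
    (1, 530, "III", 7071, 29, 8000),
    (1, 561, "III", 8870, 30, 9900),
    (2, 35, "IV", 919, 29, 958),
]


def numbering_jump(first, count, next_first):
    """Return how far the printed numbering runs ahead of a continuous count."""
    return next_first - (first + count)


def find_jumps(pages):
    """Return (volume.page, parvan, jump) for every page followed by a jump."""
    jumps = []
    for vol, page, parvan, first, count, next_first in pages:
        jump = numbering_jump(first, count, next_first)
        if jump != 0:
            jumps.append((f"{vol}.{page}", parvan, jump))
    return jumps


for entry in find_jumps(PAGES):
    print(entry)
```

This relies on the verse count column rather than the 'To' column, since the 'To' numbers are the ones that were hand-corrected in places. It cannot tell where on the page the jump occurs; that still needs a look at the scan.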
funderburkjim commented 2 years ago

Page changes

Added 3.498, 3.499 from archive.org version. Replaced 3.727 from archive.org version.

Also corrected a file-name error in pdf page renaming.

mv pdfpages/mbhcalc_3.502.pdf pdfpages/mbhcalc_3.500.pdf
 . . .
mv pdfpages/mbhcalc_3.861.pdf pdfpages/mbhcalc_3.859.pdf
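A dry-run sketch of how such a batch of renames could be generated. It assumes, and this is only an assumption since the middle of the list is elided above, that every page from 502 to 861 in volume 3 shifts down by the same two, and that page numbers in file names are zero-padded to three digits. Nothing is renamed; the function only returns the planned pairs.

```python
def rename_plan(volume, first_old, last_old, shift):
    """Return (old name, new name) pairs for a uniform page-number shift.

    Pairs are produced in ascending order, so with a negative shift each
    target name has already been vacated by an earlier rename.
    """
    plan = []
    for old in range(first_old, last_old + 1):
        new = old + shift
        plan.append((
            f"mbhcalc_{volume}.{old:03d}.pdf",
            f"mbhcalc_{volume}.{new:03d}.pdf",
        ))
    return plan


# Hypothetical parameters matching the first and last commands shown above.
plan = rename_plan(3, 502, 861, -2)
print(plan[0])    # ('mbhcalc_3.502.pdf', 'mbhcalc_3.500.pdf')
print(plan[-1])   # ('mbhcalc_3.861.pdf', 'mbhcalc_3.859.pdf')
print(len(plan))  # 360
```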
funderburkjim commented 2 years ago

Links now active for PWG

Example: https://sanskrit-lexicon.uni-koeln.de/simple/pwg/guru image

And clicking on the link: image

Andhrabharati commented 2 years ago

glad to see this.

you may link the pwk citations also, along with mw.

funderburkjim commented 2 years ago

implementation

The url form of the display is https://sanskrit-lexicon-scans.github.io/mbhcalc/?p.v where p is the parvan number and v is the verse number.

This displays the pdf of the page containing the given verse for the given parvan.

All the 3000+ page pdfs are in the repository https://github.com/sanskrit-lexicon-scans/mbhcalc. The display is generated by the web page index.html.
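As an illustration, a link of this form could be built as follows, reading p as the parvan number and v as the verse number, which is what the sentence about 'the given verse for the given parvan' suggests. That both are written as plain decimal integers is an assumption, and the helper name is invented.

```python
BASE_URL = "https://sanskrit-lexicon-scans.github.io/mbhcalc/"


def mbh_url(parvan: int, verse: int) -> str:
    """Build a ?p.v link, assuming plain decimal parvan and verse numbers."""
    if parvan < 1 or verse < 1:
        raise ValueError("parvan and verse must be positive")
    return f"{BASE_URL}?{parvan}.{verse}"


print(mbh_url(3, 3098))  # https://sanskrit-lexicon-scans.github.io/mbhcalc/?3.3098
```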

funderburkjim commented 2 years ago

MW citations

There are at least some MW citations of the form <ls>MBh. q,v</ls>, where 'q' is a roman numeral for the parvan (one such is under 'guru'). Such MW citations also have active links in the displays.
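A hedged sketch of how a citation with a roman-numeral parvan could be reduced to the same (parvan, verse) pair used for PWG. The exact spacing, letter case and final dot of the MW markup are guesses here, so the pattern tolerates an optional space and an optional trailing period; this is not the code the display actually uses.

```python
import re
from typing import Optional, Tuple

# Assumed shape: <ls>MBh. q,v</ls> with q a roman numeral (i, v, x only).
MW_REF = re.compile(r"<ls>MBh\. ([ivx]+), ?([0-9]+)\.?</ls>", re.IGNORECASE)

ROMAN = {"I": 1, "V": 5, "X": 10}


def roman_to_int(numeral: str) -> int:
    """Convert a roman numeral made of I, V and X, in either letter case."""
    letters = numeral.upper()
    total = 0
    for i, ch in enumerate(letters):
        value = ROMAN[ch]
        # Subtract a smaller value that precedes a larger one (iv, ix).
        if i + 1 < len(letters) and value < ROMAN[letters[i + 1]]:
            total -= value
        else:
            total += value
    return total


def parse_mw_ref(ref: str) -> Optional[Tuple[int, int]]:
    """Return (parvan, verse) for a matching citation, else None."""
    m = MW_REF.fullmatch(ref)
    if m is None:
        return None
    return roman_to_int(m.group(1)), int(m.group(2))


print(parse_mw_ref("<ls>MBh. iii,3098</ls>"))    # (3, 3098)
print(parse_mw_ref("<ls>MBh. xviii, 157</ls>"))  # (18, 157)
```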

funderburkjim commented 2 years ago

PW(K) citations

AFAIK, the PWK MBH citations are to the BOMBAY edition of Mahabharata. They have the form 'MBH. x,y,z' (Three numbers). These are NOT resolved as links to Calcutta edition.

Not sure if there are other PWK citations to Calcutta edition of MBH.

funderburkjim commented 2 years ago

known work to do

I'll open other issues for the 2nd and 3rd items. Hope @gasyoun crew will be able to help with the verse-page index for Harivamsa. Probably best for me to do the 3rd item.

KateRusse commented 2 years ago

reality check.

The first MBH reference in pwg is <ls>MBH. 1, 2523.</ls> under headword aMSa. The relevant line of index file is 1 92 I 2513 2541 29 which says in volume 1, page 92 contains parvan 'I' (= 1) verses from 2513 to 2541 (29 verses).

Hello! Thank you, but could you please explain how the number of the parvan is established?

funderburkjim commented 2 years ago

parvan-number correspondence

1 Ādiparva
2 Sabhāparva
3 Vanaparva
4 Virāṭparva
5 Udyogaparva
6 Bhīṣmaparva
7 Droṇaparva
8 Karṇaparva
9 Śalyaparva
10 Sauptikaparva
11 Strīparva
12 Śāntiparva
13 Anuśāsanaparva
14 Āśvamedhikaparva
15 Āśramavāsikamparva
16 Mausalaparva
17 Mahāprasthānikaparva
18 Svargārohanikaparva

We are using a 4-volume pdf for this Calcutta edition. The parvan names and numbers can be seen on title page.
* [volume 1](https://github.com/sanskrit-lexicon-scans/mbhcalc/blob/main/pdfpages/mbhcalc_1.000a.pdf)  ([alternate link to volume 1 titlepage](https://sanskrit-lexicon-scans.github.io/mbhcalc/pdfpages/mbhcalc_1.000a.pdf))
* [volume 2](https://github.com/sanskrit-lexicon-scans/mbhcalc/blob/main/pdfpages/mbhcalc_2.000a.pdf)
* [volume 3](https://github.com/sanskrit-lexicon-scans/mbhcalc/blob/main/pdfpages/mbhcalc_3.000a.pdf)
* [volume 4](https://github.com/sanskrit-lexicon-scans/mbhcalc/blob/main/pdfpages/mbhcalc_4.000a.pdf)

Note that in the 4th volume, there is a 19th parvan for Harivansha.

Note that this numbering is consistent with https://en.wikipedia.org/wiki/Mahabharata.

@KateRusse Do these comments answer your question?

KateRusse commented 2 years ago

We are using a 4-volume pdf for this Calcutta edition. The parvan names and numbers can be seen on title page.

* [volume 1](https://github.com/sanskrit-lexicon-scans/mbhcalc/blob/main/pdfpages/mbhcalc_1.000a.pdf)  ([alternate link to volume 1 titlepage](https://sanskrit-lexicon-scans.github.io/mbhcalc/pdfpages/mbhcalc_1.000a.pdf))

* [volume 2](https://github.com/sanskrit-lexicon-scans/mbhcalc/blob/main/pdfpages/mbhcalc_2.000a.pdf)

* [volume 3](https://github.com/sanskrit-lexicon-scans/mbhcalc/blob/main/pdfpages/mbhcalc_3.000a.pdf)

* [volume 4](https://github.com/sanskrit-lexicon-scans/mbhcalc/blob/main/pdfpages/mbhcalc_4.000a.pdf)

Note that in the 4th volume, there is a 19th parvan for Harivansha.

Note that this numbering is consistent with https://en.wikipedia.org/wiki/Mahabharata.

@KateRusse Do these comments answer your question?

Yes, thank you very much!

gasyoun commented 2 years ago

And here are the pages, where some 'jumping' of numbering is seen and marked in my excel file-

In the original printed book?

I remember seeing this 'printo' used somewhere long back.

Must be Indian English ))

Example: https://sanskrit-lexicon.uni-koeln.de/simple/pwg/guru

Hurray, the Mahabharata links are working! @drdhaval2785 what about the highlighter script that show us the line we actually search for?

may link the pwk citations also, along with mw.

Yes, hope we are heading there.

AFAIK, the PWK MBH citations are to the BOMBAY edition of Mahabharata.

@Andhrabharati ?

https://sanskrit-lexicon-scans.github.io/mbhcalc/?p.v

I would propose we get rid of the ? with an Apache rewrite?

crew will be able to help with the verse-page index for Harivamsa

Sure, let @KateRusse take an eye on it. Do you understand what is to be done?

funderburkjim commented 2 years ago

get rid of the ? with an Apache rewrite?

This could be done on the Cologne server, but the current hosting on Github does not support Apache rewrites.

funderburkjim commented 2 years ago

let @KateRusse take an eye on it.

Great! I'll explain what to do in next issue (#49).

Andhrabharati commented 2 years ago

And here are the pages, where some 'jumping' of numbering is seen and marked in my excel file-

In the original printed book?

Yes. It was Hermann Jacobi who first commented on this point, back in 1903, explaining the difference in the total verse count of the 3rd parva with respect to other editions, which comes to over 5000.

The same (count difference) has also been mentioned by the BORI editor in the critical edition.

I remember seeing this 'printo' used somewhere long back.

Must be Indian English ))

could be, as I started looking at "foreign" books only recently.

And this is not listed anywhere 'publicly'.

Example: https://sanskrit-lexicon.uni-koeln.de/simple/pwg/guru

Hurray, the Mahabharata links are working! @drdhaval2785 what about the highlighter script that show us the line we actually search for?

That script works only on the digital texts, not on images.

may link the pwk citations also, along with mw.

Yes, hope we are heading there.

There are only a few in pwk, as @funderburkjim correctly noticed. My point was to link them in the same spree.

AFAIK, the PWK MBH citations are to the BOMBAY edition of Mahabharata.

@Andhrabharati ?

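Nothing to differ; the pwk biblio entry (vol. 1) itself clearly mentions it. [¤MBh. = ¤Mahābhārata, citirt nach Parvan, Adhyāya und Śloka der Bomb. Ausg. Die ältere Calcuttaer Ausg. mit zwei Zahlen wird nur dann angeführt, wenn sie eine abweichende Lesart bietet.] (In English: MBh. = Mahābhārata, cited by Parvan, Adhyāya and Śloka of the Bombay edition. The older Calcutta edition, with two numbers, is cited only when it offers a variant reading.)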

And once the Bomb. ed. got published, attention was shifted to it, as it contained a short (?) commentary as well by Nilakantha.

It won't be out of context to note that MW has heavily picked up all his citations from pwg and hence it has majority of Calc. ed. MBh.

funderburkjim commented 2 years ago

There are only a few in pwk,

@Andhrabharati how do we recognize these? Would you give an example where PWK references the Calcutta edition?

Andhrabharati commented 2 years ago

PW(K) citations

AFAIK, the PWK MBH citations are to the BOMBAY edition of Mahabharata. They have the form 'MBH. x,y,z' (Three numbers). These are NOT resolved as links to Calcutta edition.

Not sure if there are other PWK citations to Calcutta edition of MBH.

Even you yourself had the clue, @funderburkjim !

There are 346 places without a dot ending x,y and 137 places with a dot ending x,y., as against 198 of x,y,z and 2327 of x,y,z.

[pl. look at the pw_AB_08.txt file made by you from my file, in November.]

funderburkjim commented 2 years ago

cleanup of pwg MBH instances

The work mentioned in a previous comment.

funderburkjim commented 2 years ago

The cleanup work was done in the mwg_ls2/mbh directory.

Before the work, 29000+ MBH verses had active links and there were 8700+ malformed MBH references. After the changes, 55000+ verses have active links, and there are 27 known malformed links (these are listed in the readme). About 9000 lines of pwg.txt were changed (less than 1% of the current lines).

The table of 'NORMAL' link types in the readme file shows what are considered as 'normal' ls references to MBH.

Next related improvements to do:

Andhrabharati commented 2 years ago

great; so this largest cited MBh. in PWG can now be taken as fully linked.

and let me see what those 27 malformed ones are, mentioned to be in the readme.

Andhrabharati commented 2 years ago

Here is the file to check and act on the 27 cases- PWG MBh. abnormal cases.txt

And probably some of these could be linked to the resp. pages of the particular volumes, even if they are not pointing to any text verses. If decided to do so and want my help, they (page numbers) could be provided in no time.

gasyoun commented 2 years ago

not pointing to any text verses

Yes, they should be linked as well.

they (page numbers) could be provided in no time.

I ask, not sure if that's enough.

Andhrabharati commented 2 years ago

Need Jim's opinion, not yours @gasyoun !

funderburkjim commented 2 years ago

PWK Calc. ed. links resolved

As noted above, the links to the Calcutta edition of Mahabharata in PWK are identifiable by their two-number form, for example <ls>MBH. 6,352.</ls>, and are already handled properly: cf. https://sanskrit-lexicon.uni-koeln.de/simple/pw/aDivAjyakulAdya

There are roughly 500 such references. By contrast, there are 2500+ references to the Bombay edition, e.g. <ls>MBH. 14,38,2.</ls>: https://sanskrit-lexicon.uni-koeln.de/simple/pw/akArpaRya/

Thus nothing more to do in that regard.
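The two-number versus three-number distinction is easy to express as a pair of patterns. The sketch below is only an illustration built on the two example references above; the final dot is treated as optional because both dotted and undotted forms were reported earlier in this thread, and the labels are invented.

```python
import re

# Two numbers: parvan, verse (Calcutta edition).
CALCUTTA = re.compile(r"<ls>MBH\. ([0-9]+),([0-9]+)\.?</ls>")
# Three numbers: parvan, adhyaya, shloka (Bombay edition).
BOMBAY = re.compile(r"<ls>MBH\. ([0-9]+),([0-9]+),([0-9]+)\.?</ls>")


def classify(ref: str) -> str:
    """Label a PWK MBH reference by the edition its number count implies."""
    if BOMBAY.fullmatch(ref):
        return "bombay"
    if CALCUTTA.fullmatch(ref):
        return "calcutta"
    return "other"


print(classify("<ls>MBH. 6,352.</ls>"))    # calcutta
print(classify("<ls>MBH. 14,38,2.</ls>"))  # bombay
```

Only references labelled 'calcutta' would be candidates for the ?p.v links.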

funderburkjim commented 2 years ago

PWG.MBh.abnormal.cases.txt

Only 1 of 27 is identified as 'linkable'.
<ls>MBH. Bd. III, S. 818, Z. 5. u. 4, v. u.</ls> under headword aTarvaSiras is identified as '3, 12864'

An attempted 'obvious' change (e.g. <ls>MBH.</ls> <ls n="MBH. 3, 12864">Bd. III, S. 818, Z. 5. u. 4, v. u.</ls>) was not properly converted to a link in the display program.

Current opinion: Not worth the trouble to modify the display program to handle this isolated case.

funderburkjim commented 2 years ago

Remains to deal with Calcutta edition Hariv. links in PWG, discussed in #49

Andhrabharati commented 2 years ago

PWG.MBh.abnormal.cases.txt

Only 1 of 27 is identified as 'linkable'. <ls>MBH. Bd. III, S. 818, Z. 5. u. 4, v. u.</ls> under headword aTarvaSiras is identified as '3, 12864'

An attempted 'obvious' change (e.g. <ls>MBH.</ls> <ls n="MBH. 3, 12864">Bd. III, S. 818, Z. 5. u. 4, v. u.</ls>) was not properly converted to a link in the display program.

Current opinion: Not worth the trouble to modify the display program to handle this isolated case.

@funderburkjim

Sorry for coming back to a closed issue and posting.

Pl. see my observation as posted at https://github.com/sanskrit-lexicon/PWG/issues/49#issuecomment-1031585162

Just spent a few minutes and found ~1500 places where [a-z].</ls> is at the end and the MBh. link is NOT active.

So I would venture saying that you need to look at these and do some jugglery, instead of leaving as isolated case(s). [The above abnormal case is part of this lot]

And here is the list I got, with a quick regex search- PWG missing MBh links.txt
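For reference, a quick search of this kind might look roughly like the following in Python. This is a guess at the idea, not the regex actually used for the attached list: it collects plain <ls> elements that begin with MBH. and whose text ends in a lowercase letter followed by a period. The sample string reuses two references quoted in this thread.

```python
import re

# A plain <ls> element whose text starts with "MBH." (no attributes handled).
LS_MBH = re.compile(r"<ls>(MBH\.[^<]*)</ls>")
# Text that ends in a lowercase letter followed by a period.
LETTER_FINAL = re.compile(r"[a-z]\.$")


def letter_final_refs(text):
    """Return MBH reference texts that end in '<lowercase letter>.'."""
    return [
        m.group(1)
        for m in LS_MBH.finditer(text)
        if LETTER_FINAL.search(m.group(1))
    ]


SAMPLE = (
    "<ls>MBH. Bd. III, S. 818, Z. 5. u. 4, v. u.</ls> "
    "<ls>MBH. 1, 2523.</ls>"
)

print(letter_final_refs(SAMPLE))
# ['MBH. Bd. III, S. 818, Z. 5. u. 4, v. u.']
```

Elements written with an attribute, such as <ls n="...">, are deliberately not matched by this simple pattern.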

funderburkjim commented 2 years ago

@Andhrabharati Yes, I've noticed similar MBH misses while working with HARIV. Will focus on missing MBH in a separate issue.