sanskrit-lexicon / PWG

Boehtlingk und Roth Sanskrit Wörterbuch, 7 Bände Petersburg 1855-1875
HarivaMSa links via MBH Calcutta #49

Open funderburkjim opened 2 years ago

funderburkjim commented 2 years ago

Since the Mahabharata Calcutta edition contains the HarivaMSa as a 19th parvan, it seems likely that PWG links to HarivaMSa can be resolved in a manner similar to that used for references to the first 18 parvans (#48).

The last page of mbhcalc for 18th parvan is 443. Page 444 is a blank page.

The pages of mbhcalc devoted to Harivamsha run from internal page number 445 through internal page number 1007.

Andhrabharati commented 2 years ago

I would suggest using the separately printed Harivamsa volume, instead of this continuation in Vol. 4 of MBh. [Guess no need to give the reason (which indeed does exist)!]

https://github.com/sanskrit-lexicon/COLOGNE/issues/371#issuecomment-971753271

funderburkjim commented 2 years ago

Reality check

There are 11263 matches in 11255 lines for "<ls>HARIV" in pwg.txt, and another 350 or so HARIV references are improperly marked.

One instance is <ls>HARIV. 708.</ls> occurring under headword akfSASva.

If verse 1 occurs on page 445 and there are about 30 verses per page, then verse 708 should occur about 708/30 ≈ 24 pages after 445, or on page 469. And in fact verse 708 does occur on page 469 (478 is the external page number in the volume 4 pdf): mbh_calc_4 478.pdf

and we see that our word akfSASva अकृशाश्व is found in the line for verse 708.

Based on this example, we can conclude that the verse references in mbhcalc for parvan 19 are consistent with the pwg verse references with HARIV.

So, mbhcalc can serve as link target for source HARIV in pwg.
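
The page estimate used in this reality check can be written as a small calculation. This is only a sketch: the function name is made up for illustration, and the figure of 30 verses per page is the rough assumption from the reasoning above, not a property of the edition.

```python
def estimate_page(verse, first_page=445, verses_per_page=30):
    """Rough estimate of the internal mbhcalc page that holds a Harivamsa verse.

    Assumes verse 1 sits on `first_page` and that every page holds about
    `verses_per_page` verses (both taken from the comment above).
    """
    return first_page + round(verse / verses_per_page)


print(estimate_page(708))  # 469
```

The estimate is only good enough for a spot check; exact linking needs a per-page index of first and last verses.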

funderburkjim commented 2 years ago

@Andhrabharati Yes, please give your reason for preferring another edition. At this stage, it would be easier to use mbh-calcutta, so would want switching to another print version to be worth the trouble.

Also, have you checked the verse-numbering compatibility with pwg?

funderburkjim commented 2 years ago

The Other Haribansa

Have downloaded and looked at the other version suggested above.
To my eye, this 'haribansa' pdf looks virtually identical to the 19th parvan in mbhcalc. The biggest difference noticed is the internal page number. For comparison, here is the page from haribansa containing the verse 708. It can be compared to the pdf mbh_calc_4 478.pdf above.

haribansa 36.pdf

gasyoun commented 2 years ago

[Guess no need to give the reason (which indeed does exist)!]

Guessed wrongly.

To my eye, this 'haribansa' pdf looks virtually identical to the 19th parvan in mbhcalc.

If such is the case, why bother @Andhrabharati ?

funderburkjim commented 2 years ago

Since the two pdfs are so similar, I can base instructions for @KateRusse on the haribansa version preferred by @Andhrabharati, and will thus proceed. Instructions will be developed soon.

Andhrabharati commented 2 years ago

Whatever my reason is, the PWG refers to the 4th Vol. of the Calc. ed. MBh. itself!

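¤Hariv. = ¤Harivam̃śa im 4ten Bande des ¤MBh. Aus diesem Werke haben wir die Nomina propria mit Benutzung des Index in der ¤Langlois'-schen Uebersetzung (¤Gild. Bibl. 122) aufgenommen. [In English: Hariv. = Harivam̃śa in the 4th volume of the MBh. From this work we have taken the proper names, making use of the index in Langlois's translation (Gild. Bibl. 122).]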

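And the pwk mentions thus: ¤Hariv. = ¤Harivam̃śa. Mit einer Zahl die ältere Calc. Ausg. gemeint, mit drei Zahlen die neuere lithographirte. [In English: with one number the older Calcutta edition is meant, with three numbers the newer lithographed one.] [The Lithographed ed. is nothing but the Bomb. ed.]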

gasyoun commented 2 years ago

[The Lithographed ed. is nothing but the Bomb. ed.]

Wow, what a research. Yes, please advise @katerusse.

funderburkjim commented 2 years ago

@Andhrabharati Is your position now that we should use images from the 4th volume of MBHcalc?

Andhrabharati commented 2 years ago

No, I was just mentioning what PWG said.

I still go with my original suggestion, which spans all across the Sanskrit Literature.

funderburkjim commented 2 years ago

OK. But I am still curious why you prefer the separate Haribansa pdfs.

One difference I can see is that the size of each pdf page of Haribans is about 1MB, whereas the size of each pdf page in 4th volume of mbhcalc is about half that (0.5MB).

gasyoun commented 2 years ago

Haribans is about 1MB

Size is not an issue, and it says nothing about the quality of the scan either. It only reflects the level of compression.

funderburkjim commented 2 years ago

index instructions: get pdf

@gasyoun @KateRusse Here are instructions for creating an index of the pages in Harivansa. This follows the model of the Index file created by @Andhrabharati for the Mahabharata calcutta edition.

Get pdf, option 1

Get Haribansa download at https://opacplus.bsb-muenchen.de/Vta2/bsb10219661/bsb:BV001652965?page=11

This reference gives screenshots that will be helpful in actually getting the pdf downloaded. The size is about 600MB.

Then you can view the pdf locally with a browser or your favorite pdf viewer.

Get pdf, option 2

You can view the pdf pages one at a time in the browser.

The individual pages have been uploaded to a repository: https://github.com/sanskrit-lexicon-scans/hariv. The pdfs of each page are in the pdfpages directory. If you click on one, it will be displayed. For example, page 1 comes up at url https://github.com/sanskrit-lexicon-scans/hariv/blob/main/pdfpages/hariv_001.pdf.

If neither of the above works for you, we'll find another way in comments in this issue.

funderburkjim commented 2 years ago

Index instructions: the index file

The main task is to create a table of information, with one line in the table for each page of the pdf, from page 1 to page 563. The format of this table will be similar to the format used for the Mahabharata calcutta edition; here is mbhcalindex. The difference is that there is no need for the Vol. (Volume) and Parva columns in the Haribansa index. So the hariv_index file you create will have the remaining columns, as in the page 3 example below.

There are a few subtleties in deciding what the Start, End, and Count fields should be. Once you get started, we can discuss questions as they arise.

Your hariv_index file can be created as a text file or a spreadsheet file. If a text file, then separate the fields either with a tab character or with a colon character.

Example of page 3

page 3 pdf

page = 3, start = 49, end = 77, count = 29
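
For illustration, here is a minimal sketch of how a line of such an index file could be read, with fields separated by a tab or a colon as described above. The column order (page, start, end, count, optional remark) is an assumption based on the page 3 example, and `PageEntry` and `parse_index_line` are hypothetical names.

```python
import re
from typing import NamedTuple


class PageEntry(NamedTuple):
    page: int
    start: int   # first verse on the page
    end: int     # last verse on the page
    count: int   # number of verses on the page
    note: str = ""


def parse_index_line(line):
    """Parse one index line whose fields are separated by a tab or a colon.

    Assumed column order: page, start, end, count, optional remark.
    """
    fields = re.split(r"[\t:]", line.strip(), maxsplit=4)
    page, start, end, count = (int(f) for f in fields[:4])
    note = fields[4].strip() if len(fields) > 4 else ""
    return PageEntry(page, start, end, count, note)


entry = parse_index_line("3\t49\t77\t29")
print(entry.page, entry.start, entry.end, entry.count)  # 3 49 77 29
```

Splitting at most four times keeps any colon inside a trailing remark intact.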

funderburkjim commented 2 years ago

Not every line is a verse

In the page 3 example, the body of the page (i.e., excluding the top line containing the page number) actually has 30 lines. But the 6th line (the one ending ॥ १ ॥) is not counted as a verse. That's why 'Count = 29' for page 3 index.

Question for @Andhrabharati : What is such a non-verse line?

Possible 'verse gaps'

In mbhcalc, there were several pages where there was a 'gap' in the verse numbering. If you notice such a gap in a Harivansa page, please note this in a comment.

@KateRusse When you've done the index for the first few pages, upload your hariv_index file so I can review.

Andhrabharati commented 2 years ago

Question for @Andhrabharati : What is such a non-verse line?

They are called 'colophons' and are considered unanimously, by all literati, not to be a 'part' of the main text.

Andhrabharati commented 2 years ago

Like to see how long KateRusse would take, to finish the task.

(I had done it in just about two hours.)

gasyoun commented 2 years ago

(I had done it in just about two hours.)

What do you mean done? She can do harder tasks, if this one is done, no need to redo, as there are no other tasks, requiring these skills @Andhrabharati

funderburkjim commented 2 years ago

@KateRusse I hope you will undertake this indexing. Please let us know your intention in this regard.

KateRusse commented 2 years ago

@KateRusse I hope you will undertake this indexing. Please let us know your intention in this regard.

Is there anything left to do? I can continue this work

funderburkjim commented 2 years ago

@KateRusse I do not have an index for Harivansa. So construction of that index remains to be done.

KateRusse commented 2 years ago

I have done an index for the first 20 pages. If everything is alright, I can continue. Harivansha-1.txt

gasyoun commented 2 years ago

I have done an index for the first 20 pages

Perfect. I've sent you a piece of software for recording of how you do it, thanks.

funderburkjim commented 2 years ago

@KateRusse I spot-checked several of the first 20 lines, and everything looks fine! Ok to proceed.

KateRusse commented 2 years ago

Here is an index of 150 pages.

Harivansha-1.txt

gasyoun commented 2 years ago

59 1697 1725 29 After the verse 1713 the line is not a verse, the given numeration goes wrong from this place.

and

144 4195 4224 30 One more mistake in the given numeration

@Andhrabharati agree?

funderburkjim commented 2 years ago

The display is now available for the first 150 pages. Example: https://sanskrit-lexicon-scans.github.io/hariv/?4224

funderburkjim commented 2 years ago

@KateRusse I agree there is a verse gap on page 144, but believe the First and Last verses need to be changed (I request that you make the change in the next file you post). I identify the verse gap as 1425-1429.

OLD:
144 4195    4224    30  One more mistake in the given numeration
145 4225    4253    29

NEW:
144 4195    4230    30  One more mistake in the given numeration
145 4231    4253    29

Compare link above that goes to page 144, and Example: https://sanskrit-lexicon-scans.github.io/hariv/?4231 which goes to page 145. Note this change has been made in display.
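
To make the effect of the First and Last verse values concrete, here is a hedged sketch of a verse-to-page lookup over the two NEW rows. It is not the actual display program; `INDEX` and `page_for_verse` are hypothetical names, and only the start and end columns are used.

```python
from bisect import bisect_right

# (page, first verse, last verse): the two NEW rows from this comment.
INDEX = [
    (144, 4195, 4230),
    (145, 4231, 4253),
]
STARTS = [start for _, start, _ in INDEX]


def page_for_verse(verse):
    """Return the page whose [start, end] range contains `verse`, or None."""
    i = bisect_right(STARTS, verse) - 1
    if i < 0:
        return None
    page, start, end = INDEX[i]
    return page if start <= verse <= end else None


print(page_for_verse(4224))  # 144
print(page_for_verse(4231))  # 145
```

A lookup of this kind depends only on the boundaries, so a wrong Last value on one page sends every verse near the page break to the wrong scan.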

funderburkjim commented 2 years ago

Re. verse 1713

https://sanskrit-lexicon-scans.github.io/hariv/?1713

There are 4 verses between the labeled 1710 and 1715, but the 3rd intermediate verse has the appearance of a 'colophon numbered 31'.
So there does appear to be a verse gap with 1 verse missing, either 1713 or 1714.

Andhrabharati commented 2 years ago

The Calc. ed. has only a single 'danda' for the verse endings; the double 'danda' is always used for the colophone ending.

This may be taken as a clue; and those lines are not to be treated as verses.

Incidentally, @KateRusse has skipped many such in the first 150 pages, but somehow marked only these two places!!

Every adhyAya and every parva (HV has two parvas) & any upaparva (if any) will have its own colophone. So it may be guessed how many of them exist in these 150 pages.
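
As a rough illustration of the danda clue, the sketch below counts verses in a list of transcribed lines by skipping any line that contains a double danda. It assumes a text transcription of the page exists (the scans themselves are images); the function names and the sample lines are placeholders, not text from the edition.

```python
DANDA = "\u0964"         # । single danda, closes a verse in the Calc. ed.
DOUBLE_DANDA = "\u0965"  # ॥ double danda, closes a colophon


def is_colophon(line):
    """Treat any line containing a double danda as a colophon."""
    return DOUBLE_DANDA in line


def count_verses(lines):
    """Count non-empty lines that are not colophons."""
    return sum(1 for line in lines if line.strip() and not is_colophon(line))


# Placeholder lines, not actual text from the edition.
sample = [
    "verse line " + DANDA,
    "colophon line " + DOUBLE_DANDA + " १ " + DOUBLE_DANDA,
    "verse line " + DANDA,
]
print(count_verses(sample))  # 2
```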

gasyoun commented 2 years ago

The Calc. ed. has only a single 'danda' for the verse endings; the double 'danda' is always used for the colophone ending.

Thanks for making us aware of that fact.

Incidentally, @KateRusse has skipped many such in the first 150 pages, but somehow marked only these two places!!

Can you give those skipped ones, please?

Every adhyAya and every parva (HV has two parvas) & any upaparva (if any) will have its own colophone.

So how many total?

Andhrabharati commented 2 years ago

Incidentally, @KateRusse has skipped many such in the first 150 pages, but somehow marked only these two places!!

Can you give those skipped ones, please?

@KateRusse I agree there is a verse gap on page 144, but believe the First and Last verses needs to be changed (request you to make change in the next file you post).

I meant the colophones; now I see that she was just (re)marking the places where the jumps in verse numbers are seen. She did identify them properly at these two places, but has taken a wrong step in marking all the following 'start' & 'end' numbers as per theory (continuous running numbers), instead of giving the book numbers, starting from p.59. This would lead to wrong page linking for many verses esp. at those page cross-overs.

I identify the verse gap as 1425-1429.

I marked the gap as 4221-4225 in my file, as 4228 citation is in PWG.

Every adhyAya and every parva (HV has two parvas) & any upaparva (if any) will have its own colophone.

So how many total?

The last page of Harivamsa says there are 326 adhyAyas.

Andhrabharati commented 2 years ago

So there does appear to be a verse gap with 1 verse missing, either 1713 or 1714.

It is 1713 that is to be considered missing, as 1714 citation is in PWG (as I had marked in my file).

With these two examples, one can safely consider that the 'old' series (lesser number) ends before the colophone, and the 'new' series (higher number) starts after the colophone, in case of a jump at that juncture.

Andhrabharati commented 2 years ago

(I had done it in just about two hours.)

What do you mean done? She can do harder tasks, if this one is done, no need to redo, as there are no other tasks, requiring these skills @Andhrabharati

You may look at my post https://github.com/sanskrit-lexicon/PWG/issues/48#issuecomment-1020755849 reg. the same.

I thought getting more people involved in the task might be beneficial, hence waiting for others to groom-up! [Of course, I don't have a least doubt that there would be ANYONE matching me in speed or understanding things.]

There is no shortage for the pdf-linkable targets across the CDSL dictionaries; so more the 'skilled people', faster would be the work done.

Speaking of this, Jim might probably consider making a count of ls citations by "work name", like he has made a comparative list of verb (dhAtu) occurrences across the dictionaries, and every work occurring more than 5000 times (or may even be 3000) could be considered a worthy pdf-linkable target.
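
A count of this kind could be sketched as follows. This is an assumption-laden illustration: it takes the work name to be whatever precedes the first digit inside an `<ls>` tag, ignores citations that start with a number, and uses a made-up sample string rather than real dictionary text.

```python
import re
from collections import Counter

LS_TAG = re.compile(r"<ls>(.*?)</ls>")


def work_name(citation):
    """Assumption: the work name is the text before the first digit."""
    return re.match(r"\D*", citation).group(0).strip()


def count_by_work(text):
    """Count <ls> citations per work name, skipping bare-number citations."""
    counts = Counter()
    for citation in LS_TAG.findall(text):
        name = work_name(citation)
        if name:
            counts[name] += 1
    return counts


# Illustrative sample, not copied from pwg.txt.
sample = "<ls>HARIV. 708.</ls> x <ls>MBH. 1,3722.</ls> y <ls>HARIV. 1714.</ls>"
print(count_by_work(sample).most_common())  # [('HARIV.', 2), ('MBH.', 1)]
```

Works above a chosen threshold (5000 or 3000 in the suggestion above) could then be read off the top of `most_common()`.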

KateRusse commented 2 years ago

Should I give numeration of verses according to the book or to their real order? Or should I make one more column for the given numeration?

Andhrabharati commented 2 years ago

The verse numbers should be according to the book; as the exercise is to generate links for the citations to the book verses, and not to correct the errors in the book numbers.

Noting the gaps (jumps) in the book numbers is just an additional academic exercise.

Even the verse count in each page is used just to identify the jumps, (while I was doing it); it is really not required for indexing, which just needs the beginning and ending verses in each page.

And I was dynamically computing the same [simple difference value], as I was working in Excel, not manually counting them (which would take too much of a time); that's how I could do it so quickly!
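
The same difference value can be computed outside Excel. The sketch below is only an illustration: the function names are hypothetical, and the limit of 30 is an assumption based on the roughly 30 lines in the body of a page. A page whose numbered span exceeds the limit probably contains a jump in the book's numbering.

```python
def verse_span(start, end):
    """Number of verse numbers covered by a page, from its first and last verse."""
    return end - start + 1


def pages_with_jumps(rows, max_lines=30):
    """Return pages whose span is larger than a page can physically hold.

    `max_lines` is an assumed upper bound on verses per page.
    """
    return [page for page, start, end in rows if verse_span(start, end) > max_lines]


# (page, start, end) values quoted earlier in this thread.
rows = [(3, 49, 77), (144, 4195, 4230)]
print(verse_span(49, 77))      # 29
print(pages_with_jumps(rows))  # [144]
```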

KateRusse commented 2 years ago

I have created a new file according to the book numeration. First 250 pages are done. Harivansha-2.txt

Andhrabharati commented 2 years ago

@KateRusse Just to give you an example, your file has v. 1756 as the starting verse in p. 61; thus when Jim makes the linking active, the entry words "jahnu" & "nIla" (both in SLP1) in PWG link to the p. 61 where the verse 1756 containing jahnu (or nIla) cannot be seen at all (it actually being in the prev. page). So, this should be marked as the ending verse of p. 60. Hope, you understand the necessity now.

It is the responsibility of us, the humans, to give correct data to the computer programs to work correctly; they just act as per the data provided to them. [of course the AI is a different field altogether, and none here are into it, I guess!]

BTW, @funderburkjim, I've just seen that the jahnu entry in PWG has NO link to the "Mbh. 1,3722. fgg.", but the following verses 12,1717. 13,202. 13,7680 are properly linked.

So you still need to work on some more MBh. links; I'm sure you would look for other types of such pending combinations with just this clue.

KateRusse commented 2 years ago

@KateRusse Just to give you an example, your file has v. 1756 as the starting verse in p. 61; thus when Jim makes the linking active, the entry words "jahnu" & "nIla" (both in SLP1) in PWG link to the p. 61 where the verse 1756 containing jahnu (or nIla) cannot be seen at all (it actually being in the prev. page). So, this should be marked as the ending verse of p. 60. Hope, you understand the necessity now.

Please look through my new file, this mistake is already corrected there.

Andhrabharati commented 2 years ago

Yes, seen it already.

I was typing my above message, while you had updated your file and posted.

So my message is actually addressed to your Harivansha-1 file, not the revised Harivansha-2 file.

KateRusse commented 2 years ago

I have corrected it one more time: Harivansha-2.txt

gasyoun commented 2 years ago

[Of course, I don't have a least doubt that there would be ANYONE matching me in speed or understanding things.]

Yes, we can't beat you.

There is no shortage for the pdf-linkable targets across the CDSL dictionaries; so more the 'skilled people', faster would be the work done.

Yes, for years to come.

Speaking of this, Jim might probably consider making a count of ls citations by "work name", like he has made a comparative list of verb (dhAtu) occurrences across the dictionaries, and every work occurring more than 5000 times (or may even be 3000) could be considered a worthy pdf-linkable target.

There was already such a list and the biggest link targets soon will be closed.

[of course the AI is a different field altogether, and none here are into it, I guess!]

Wrong guessing again - into AI since 1999.

Andhrabharati commented 2 years ago

Speaking of this, Jim might probably consider making a count of ls citations by "work name", like he has made a comparative list of verb (dhAtu) occurrences across the dictionaries, and every work occurring more than 5000 times (or may even be 3000) could be considered a worthy pdf-linkable target.

There was already such a list and the biggest link targets soon will be closed.

@gasyoun could you get me the link to this list, so that I may help identifying the 'sources' to link?

funderburkjim commented 2 years ago

Improvements made to PWG HARIV links

This work done in pwg_ls2/hariv folder. Before the improvements, 9639 well-formed links to Harivamsa Calcutta edition were present in PWG. At the end of the changes, 15595 such well-formed links were present.

Also, 26 links were identified as abnormal (see file change_abnormal.txt). The changes made to markup appear in files change_01.txt and change_02.txt.

funderburkjim commented 2 years ago

display program revision

The display program component (basicadjust.php) has been adjusted to provide active links to Harivamsa (see revisions to csl-websanlexicon and csl-apidev above).

The link target is currently https://sanskrit-lexicon-scans.github.io/hariv/.

These links are present for PWG and PW (with literary source abbreviation HARIV.) and for MW (with abbreviation Hariv.).

funderburkjim commented 2 years ago

You can confirm the HARIV links from these dictionary entries:

https://sanskrit-lexicon.uni-koeln.de/simple/pwg/aMSa

https://sanskrit-lexicon.uni-koeln.de/simple/pw/renu

https://sanskrit-lexicon.uni-koeln.de/simple/mw/galita

Andhrabharati commented 2 years ago

Here is the "resolved" Hariv. abnormal cases file, for perusal- PWG Hariv. abnormal cases.txt

My file has <1331. 5185. 10995> at the BAsvant entry, which would be properly resolved as a link.

On second thought, the S. xxx citations could be linked as https://sanskrit-lexicon-scans.github.io/hariv/?xxx -- is this possible to do?
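
For what it is worth, a list such as <1331. 5185. 10995> could be turned into one link per number along these lines. This is a sketch only: it reuses the `?number` URL form shown earlier in this thread, `hariv_links` is a hypothetical name, and it treats every number alike, so it does not decide how S. xxx citations or three-number citations should be handled.

```python
import re

BASE_URL = "https://sanskrit-lexicon-scans.github.io/hariv/"


def hariv_links(citation):
    """Return one link per number found in a citation string."""
    return [f"{BASE_URL}?{n}" for n in re.findall(r"\d+", citation)]


for url in hariv_links("1331. 5185. 10995"):
    print(url)
```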

gasyoun commented 2 years ago

gen. MBh. xv, 463 [C] inf. cyavitum), Mn. vii, 98 ; MBh. iii ; both are still missing @funderburkjim

KateRusse commented 2 years ago

400 pages Harivansha-2.txt

Andhrabharati commented 2 years ago

@KateRusse Did you observe that 110xx block of verses is repeated at two places-- pp. 345-8 and pp. 376-9? You need to mark them somehow, so that Jim would pay attention to it; otherwise the program may give wrong result, or even hang-up
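
One way such repeated blocks could be flagged automatically is to look for pages whose verse ranges overlap. The sketch below is an illustration only: `find_overlaps` is a hypothetical name, and the row values are placeholders chosen to imitate a repeated 110xx block, not figures taken from the index file.

```python
def find_overlaps(rows):
    """Return pairs of pages whose [start, end] verse ranges overlap."""
    overlaps = []
    for i, (page_a, start_a, end_a) in enumerate(rows):
        for page_b, start_b, end_b in rows[i + 1:]:
            if start_a <= end_b and start_b <= end_a:
                overlaps.append((page_a, page_b))
    return overlaps


# Placeholder values (page, start, end), NOT from the real index.
rows = [
    (344, 10970, 10999),
    (345, 11000, 11029),
    (376, 11000, 11029),
]
print(find_overlaps(rows))  # [(345, 376)]
```

The pairwise comparison is quadratic, which is acceptable for an index of a few hundred pages.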