Open funderburkjim opened 2 years ago
I would suggest using the separately printed Harivamsa volume, instead of this continuation in Vol. 4 of MBh. [Guess no need to give the reason (which indeed does exist)!]
https://github.com/sanskrit-lexicon/COLOGNE/issues/371#issuecomment-971753271
11263 matches in 11255 lines for "<ls>HARIV" in buffer: pwg.txt
and another 350 or so HARIV improperly marked.
One instance is <ls>HARIV. 708.</ls> occurring under headword akfSASva.
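The match counts above look like editor search output. A minimal sketch of how the same two numbers (total matches and number of matching lines) could be recomputed, assuming pwg.txt is read as plain text lines; the function name and the sample lines are made up for illustration:

```python
import re

def count_ls_matches(lines, prefix="<ls>HARIV"):
    """Return (total matches, number of lines with at least one match)."""
    pattern = re.compile(re.escape(prefix))
    total = 0
    matching_lines = 0
    for line in lines:
        n = len(pattern.findall(line))
        if n:
            total += n
            matching_lines += 1
    return total, matching_lines

# Made-up sample lines, not taken from pwg.txt.
sample = [
    "<ls>HARIV. 708.</ls> and <ls>HARIV. 1714.</ls>",
    "<ls>MBH. 1,3722.</ls>",
    "<ls>HARIV. 4228.</ls>",
]
print(count_ls_matches(sample))  # (3, 2)
```

For the real file one would pass the open file object instead of `sample`. The "improperly marked" cases would need a looser pattern, which is not attempted here.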
If verse 1 occurs on page 445 and there are about 30 verses per page, then verse 708 should occur about 708/30 = 24 pages after 445, or on page 469. And in fact verse 708 does occur on page 469: (478 is the external page number in volume 4 pdf). mbh_calc_4 478.pdf
and we see that our word akfSASva अकृशाश्व is found in the line for verse 708:
Based on this example, we can conclude that the verse references in mbhcalc for parvan 19 are consistent with the pwg verse references with HARIV.
So, mbhcalc can serve as link target for source HARIV in pwg.
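The page estimate used in the example above can be written as a one-line calculation. This is only a rough sketch under the same assumptions (verse 1 on internal page 445, about 30 verses per page); the function name is hypothetical and the result is a starting point for looking at the scan, not an exact lookup:

```python
def estimate_page(verse, first_page=445, verses_per_page=30):
    """Rough internal page number for a verse, assuming evenly filled pages."""
    return first_page + round(verse / verses_per_page)

print(estimate_page(708))  # 469
```

An exact lookup needs a per-page index of first and last verses, since colophons and numbering jumps shift the real page boundaries.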
@Andhrabharati Yes, please give your reason for preferring another edition. At this stage, it would be easier to use mbh-calcutta, so would want switching to another print version to be worth the trouble.
Also, have you checked the verse-numbering compatibility with pwg?
Have downloaded and looked at the other version suggested above.
To my eye, this 'haribansa' pdf looks virtually identical to the 19th parvan in mbhcalc. The biggest difference noticed is the internal page number. For comparison, here is the page from haribansa containing the verse 708. It can be compared to the pdf mbh_calc_4 478.pdf above.
[Guess no need to give the reason (which indeed does exist)!]
Guessed wrongly.
To my eye, this 'haribansa' pdf looks virtually identical to the 19th parvan in mbhcalc.
If such is the case, why bother @Andhrabharati ?
Since the two pdfs are so similar, I can base instructions for @KateRusse on the haribansa version preferred by @Andhrabharati, and will thus proceed. Instructions will be developed soon.
Whatever my reason is, the PWG refers to the 4th Vol. of the Calc. ed. MBh. itself!
¤Hariv. = ¤Harivam̃śa im 4ten Bande des ¤MBh. Aus diesem Werke haben wir die Nomina propria mit Benutzung des Index in der ¤Langlois'-schen Uebersetzung (¤Gild. Bibl. 122) aufgenommen.
And the pwk mentions thus- ¤Hariv. = ¤Harivam̃śa. Mit einer Zahl die ältere Calc. Ausg. gemeint, mit drei Zahlen die neuere lithographirte. [The Lithographed ed. is nothing but the Bomb. ed.]
[The Lithographed ed. is nothing but the Bomb. ed.]
Wow, what a research. Yes, please advise @katerusse.
@Andhrabharati Is your position now that we should use images from the 4th volume of MBHcalc?
No, I was just mentioning what PWG said.
I still go with my original suggestion, which spans all across the Sanskrit Literature.
OK. But I am still curious why you prefer the separate Haribansa pdfs.
One difference I can see is that the size of each pdf page of Haribans is about 1MB, whereas the size of each pdf page in 4th volume of mbhcalc is about half that (0.5MB).
Haribans is about 1MB
Size is not an issue, and it says nothing about the quality of the scan either. It only reflects the level of compression.
@gasyoun @KateRusse Here are instructions for creating an index of the pages in Harivansa. This follows the model of the Index file created by @Andhrabharati for the Mahabharata calcutta edition.
Get Haribansa download at https://opacplus.bsb-muenchen.de/Vta2/bsb10219661/bsb:BV001652965?page=11
This reference gives screenshots that will be helpful in actually getting the pdf downloaded. The size is about 600MB.
Then you can view the pdf locally with a browser or your favorite pdf viewer.
You can view the pdf pages one at a time in the browser.
The individual pages have been uploaded to a repository: https://github.com/sanskrit-lexicon-scans/hariv. The pdfs of each page are in the pdfpages directory. If you click on one, it will be displayed. For example, page 1 comes up at url https://github.com/sanskrit-lexicon-scans/hariv/blob/main/pdfpages/hariv_001.pdf.
If neither of the above works for you, we'll find another way in the comments of this issue.
The main task is to create a table of information, with one line in the table for each page of the pdf, from page 1 to page 563. The format of this table will be similar to the format used for the Mahabharata calcutta edition; here is mbhcalindex. The difference is that there is no need for the Vol. (Volume) and Parva columns in Haribansa index. So the hariv_index file you create will have columns
There are a few subtleties in deciding what the Start, End, and Count fields should be. Once you get started, we can discuss questions as they arise.
Your hariv_index file can be created as a text file or a spreadsheet file. If a text file, then separate the fields either with a tab character or with a colon character.
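For the text-file variant, here is a minimal sketch of how a line of such an index could be read back, accepting either a tab or a colon as separator. The field order (page, start, end, count, optional note) and the names `PageEntry` and `parse_index_line` are assumptions for illustration, not an agreed format:

```python
import re
from typing import NamedTuple

class PageEntry(NamedTuple):
    page: int
    start: int
    end: int
    count: int
    note: str

def parse_index_line(line):
    """Split one index line on tabs or colons; a 5th field is kept as a note."""
    fields = re.split(r"[\t:]", line.rstrip("\n"), maxsplit=4)
    page, start, end, count = (int(f) for f in fields[:4])
    note = fields[4].strip() if len(fields) > 4 else ""
    return PageEntry(page, start, end, count, note)

print(parse_index_line("3\t49\t77\t29"))
print(parse_index_line("59:1697:1725:29:After the verse 1713 the line is not a verse"))
```

Because of `maxsplit=4`, a note may itself contain colons without breaking the parse.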
page 3 pdf page = 3 start = 49 end = 77 count = 29
In the page 3 example, the body of the page (i.e., excluding the top line containing the page number) actually has 30 lines. But the 6th line (the one ending ॥ १ ॥) is not counted as a verse. That's why 'Count = 29' for page 3 index.
Question for @Andhrabharati : What is such a non-verse line?
In mbhcalc, there were several pages where there was a 'gap' in the verse numbering. If you notice such a gap in a Harivansa page, please note this in a comment.
@KateRusse When you've done the index for the first few pages, upload your hariv_index file so I can review.
Question for @Andhrabharati : What is such a non-verse line?
They are called 'colophons' and are unanimously considered by all literati not to be a 'part' of the main text.
Like to see how long KateRusse would take, to finish the task.
(I had done it in just about two hours.)
(I had done it in just about two hours.)
What do you mean done? She can do harder tasks, if this one is done, no need to redo, as there are no other tasks, requiring these skills @Andhrabharati
@KateRusse I hope you will undertake this indexing. Please let us know your intention in this regard.
@KateRusse I hope you will undertake this indexing. Please let us know your intention in this regard.
Is there anything left to do? I can continue this work
@KateRusse I do not have an index for Harivansa. So construction of that index remains to be done.
I have done an index for the first 20 pages. If everything is alright, I can continue. Harivansha-1.txt
I have done an index for the first 20 pages
Perfect. I've sent you a piece of software for recording of how you do it, thanks.
@KateRusse I spot-checked several of the first 20 lines, and everything looks fine! Ok to proceed.
Here is an index of 150 pages.
59 1697 1725 29 After the verse 1713 the line is not a verse, the given numeration goes wrong from this place.
and
144 4195 4224 30 One more mistake in the given numeration
@Andhrabharati agree?
The display is now available for the first 150 pages. Example: https://sanskrit-lexicon-scans.github.io/hariv/?4224
@KateRusse I agree there is a verse gap on page 144, but believe the First and Last verses need to be changed (request you to make change in the next file you post). I identify the verse gap as 1425-1429.
OLD:
144 4195 4224 30 One more mistake in the given numeration
145 4225 4253 29
NEW:
144 4195 4230 30 One more mistake in the given numeration
145 4231 4253 29
Compare link above that goes to page 144, and Example: https://sanskrit-lexicon-scans.github.io/hariv/?4231 which goes to page 145. Note this change has been made in display.
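To make the effect of the page boundary concrete, here is a small sketch of a verse-to-page lookup over the two NEW rows above, plus the link form used by the display (`?` followed by the verse number). The function names are hypothetical and this is not the code behind the actual display:

```python
from bisect import bisect_right

# (page, first verse, last verse), taken from the NEW rows above.
INDEX = [(144, 4195, 4230), (145, 4231, 4253)]

def page_for_verse(verse, index=INDEX):
    """Return the page whose verse range contains `verse`, or None."""
    starts = [start for _, start, _ in index]
    i = bisect_right(starts, verse) - 1
    if i < 0:
        return None
    page, start, end = index[i]
    return page if start <= verse <= end else None

def link_for_verse(verse, base="https://sanskrit-lexicon-scans.github.io/hariv/"):
    """Build a link in the same form as the examples in this thread."""
    return f"{base}?{verse}"

print(page_for_verse(4224), page_for_verse(4231))  # 144 145
print(link_for_verse(4231))
```

The lookup assumes the rows are sorted by first verse and do not overlap.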
Re. verse 1713
https://sanskrit-lexicon-scans.github.io/hariv/?1713
There are 4 verses between the labeled 1710 and 1715, but the 3rd intermediate verse has the appearance of a 'colophon numbered 31'.
So there does appear to be a verse gap with 1 verse missing, either 1713 or 1714.
The Calc. ed. has only a single 'danda' for the verse endings; the double 'danda' is always used for the colophon ending.
This may be taken as a clue that those lines are not to be treated as verses.
Incidentally, @KateRusse has skipped many such in the first 150 pages, but somehow marked only these two places!!
Every adhyAya and every parva (HV has two parvas) & any upaparva (if any) will have its own colophon. So it may be guessed how many of them exist in these 150 pages.
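If a transcription of a page were available (the scans themselves are images, so this is an assumption), the danda clue could be applied mechanically. A minimal sketch with a hypothetical `classify_line` helper; the sample strings are placeholders, not text from the edition:

```python
DOUBLE_DANDA = "\u0965"  # ॥
SINGLE_DANDA = "\u0964"  # ।

def classify_line(line):
    """Guess whether a transcribed line is a verse or a colophon from its ending."""
    text = line.strip()
    if text.endswith(DOUBLE_DANDA):
        return "colophon"
    if text.endswith(SINGLE_DANDA):
        return "verse"
    return "unknown"

print(classify_line("... ॥ १ ॥"))  # colophon
print(classify_line("... ।"))      # verse
```

Lines that end with a printed verse number rather than a danda would come out as "unknown" and would need handling by eye.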
The Calc. ed. has only a single 'danda' for the verse endings; the double 'danda' is always used for the colophon ending.
Thanks for making us aware of that fact.
Incidentally, @KateRusse has skipped many such in the first 150 pages, but somehow marked only these two places!!
Can you give those skipped ones, please?
Every adhyAya and every parva (HV has two parvas) & any upaparva (if any) will have its own colophon.
So how many total?
Incidentally, @KateRusse has skipped many such in the first 150 pages, but somehow marked only these two places!!
Can you give those skipped ones, please?
@KateRusse I agree there is a verse gap on page 144, but believe the First and Last verses need to be changed (request you to make change in the next file you post).
I meant the colophons; now I see that she was just (re)marking the places where the jumps in verse numbers are seen. She did identify them properly at these two places, but has taken a wrong step in marking all the following 'start' & 'end' numbers as per theory (continuous running numbers), instead of giving the book numbers, starting from p.59. This would lead to wrong page linking for many verses, esp. at those page cross-overs.
I identify the verse gap as 1425-1429.
I marked the gap as 4221-4225 in my file, as 4228 citation is in PWG.
Every adhyAya and every parva (HV has two parvas) & any upaparva (if any) will have its own colophon.
So how many total?
The last page of Harivamsa says there are 326 adhyAyas.
So there does appear to be a verse gap with 1 verse missing, either 1713 or 1714.
It is 1713 that is to be considered missing, as 1714 citation is in PWG (as I had marked in my file).
With these two examples, one can safely consider that the 'old' series (lesser number) ends before the colophon, and the 'new' series (higher number) starts after the colophon, in case of a jump at that juncture.
(I had done it in just about two hours.)
What do you mean done? She can do harder tasks, if this one is done, no need to redo, as there are no other tasks, requiring these skills @Andhrabharati
You may look at my post https://github.com/sanskrit-lexicon/PWG/issues/48#issuecomment-1020755849 reg. the same.
I thought getting more people involved in the task might be beneficial, hence waiting for others to groom-up! [Of course, I don't have a least doubt that there would be ANYONE matching me in speed or understanding things.]
There is no shortage for the pdf-linkable targets across the CDSL dictionaries; so more the 'skilled people', faster would be the work done.
Speaking of this, Jim might probably consider making a count of ls citations by "work name", like he has made a comparative list of verb (dhAtu) occurrences across the dictionaries, and every work occurring more than 5000 times (or may even be 3000) could be considered a worthy pdf-linkable target.
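A rough sketch of such a count, assuming citations are wrapped in `<ls>...</ls>` and that the work name is the text up to the first period (a heuristic that will not fit every abbreviation). The function names and sample lines are made up for illustration:

```python
import re
from collections import Counter

LS_RE = re.compile(r"<ls>([^<]*)</ls>")

def work_name(citation):
    """Heuristic: leading text up to and including the first period."""
    m = re.match(r"\s*([^\d.]+\.)", citation)
    return m.group(1).strip() if m else citation.strip()

def count_works(lines):
    """Count <ls> citations per work name."""
    counts = Counter()
    for line in lines:
        for citation in LS_RE.findall(line):
            counts[work_name(citation)] += 1
    return counts

def frequent_works(counts, threshold=5000):
    """Work names cited at least `threshold` times, most frequent first."""
    return [w for w, n in counts.most_common() if n >= threshold]

# Made-up sample lines.
sample = ["<ls>HARIV. 708.</ls> <ls>MBH. 1,3722.</ls>", "<ls>HARIV. 4228.</ls>"]
counts = count_works(sample)
print(counts["HARIV."], counts["MBH."])  # 2 1
```

With a real dictionary file, `frequent_works(counts)` with the default threshold would give the candidate list described above.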
Should I give numeration of verses according to the book or to their real order? Or should I make one more column for the given numeration?
The verse numbers should be according to the book; as the exercise is to generate links for the citations to the book verses, and not to correct the errors in the book numbers.
Noting the gaps (jumps) in the book numbers is just an additional academic exercise.
Even the verse count in each page is used just to identify the jumps, (while I was doing it); it is really not required for indexing, which just needs the beginning and ending verses in each page.
And I was dynamically computing the same [simple difference value], as I was working in Excel, not manually counting them (which would take too much of a time); that's how I could do it so quickly!
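The spreadsheet approach described here can be sketched in a few lines: compute end - start + 1 for each page and flag pages where that span is larger than a page can plausibly hold, or where a page does not start right after the previous one ends. The threshold of 32 and the function names are assumptions for illustration:

```python
def span(start, end):
    """Number of verse numbers covered by a page, both ends included."""
    return end - start + 1

def suspicious_pages(rows, max_expected=32):
    """Pages whose span exceeds what fits on a page: a likely numbering jump."""
    return [page for page, start, end in rows if span(start, end) > max_expected]

def boundary_breaks(rows):
    """Consecutive rows where the next page does not start at previous end + 1."""
    breaks = []
    for (p1, _, e1), (p2, s2, _) in zip(rows, rows[1:]):
        if s2 != e1 + 1:
            breaks.append((p1, p2))
    return breaks

# Rows (page, start, end) quoted earlier in this thread.
rows = [(144, 4195, 4230), (145, 4231, 4253)]
print(span(49, 77))            # 29
print(suspicious_pages(rows))  # [144]
print(boundary_breaks(rows))   # []
```

A flagged page still has to be checked against the scan to see where the jump actually sits.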
I have created a new file according to the book numeration. First 250 pages are done. Harivansha-2.txt
@KateRusse Just to give you an example, your file has v. 1756 as the starting verse in p. 61; thus when Jim makes the linking active, the entry words "jahnu" & "nIla" (both in SLP1) in PWG link to the p. 61 where the verse 1756 containing jahnu (or nIla) cannot be seen at all (it actually being in the prev. page). So, this should be marked as the ending verse of p. 60. Hope, you understand the necessity now.
It is the responsibility of us, the humans, to give correct data to the computer programs to work correctly; they just act as per the data provided to them.
[of course the AI is a different field altogether, and none here are into it, I guess!]
----------------------
BTW, @funderburkjim, I've just seen that the jahnu entry in PWG has NO link to the "Mbh. 1,3722. fgg.", but the following verses 12,1717. 13,202. 13,7680 are properly linked.
So you still need to work on some more MBh. links; I'm sure you would look for other types of such pending combinations with just this clue.
@KateRusse Just to give you an example, your file has v. 1756 as the starting verse in p. 61; thus when Jim makes the linking active, the entry words "jahnu" & "nIla" (both in SLP1) in PWG link to the p. 61 where the verse 1756 containing jahnu (or nIla) cannot be seen at all (it actually being in the prev. page). So, this should be marked as the ending verse of p. 60. Hope, you understand the necessity now.
Please look through my new file, this mistake is already corrected there.
Yes, seen it already.
I was typing my above message, while you had updated your file and posted.
So my message actually addresses your Harivansha-1 file, not the revised Harivansha-2 file.
I have corrected it one more time: Harivansha-2.txt
[Of course, I don't have a least doubt that there would be ANYONE matching me in speed or understanding things.]
Yes, we can't beat you.
There is no shortage for the pdf-linkable targets across the CDSL dictionaries; so more the 'skilled people', faster would be the work done.
Yes, for years to come.
Speaking of this, Jim might probably consider making a count of ls citations by "work name", like he has made a comparative list of verb (dhAtu) occurrences across the dictionaries, and every work occurring more than 5000 times (or may even be 3000) could be considered a worthy pdf-linkable target.
There was already such a list and the biggest link targets soon will be closed.
[of course the AI is a different field altogether, and none here are into it, I guess!]
Wrong guessing again - into AI since 1999.
Speaking of this, Jim might probably consider making a count of ls citations by "work name", like he has made a comparative list of verb (dhAtu) occurrences across the dictionaries, and every work occurring more than 5000 times (or may even be 3000) could be considered a worthy pdf-linkable target.
There was already such a list and the biggest link targets soon will be closed.
@gasyoun could you get me the link to this list, so that I may help identifying the 'sources' to link?
This work was done in the pwg_ls2/hariv folder. Before the improvements, 9639 well-formed links to Harivamsa Calcutta edition were present in PWG. At the end of the changes, 15595 such well-formed links were present.
Also, 26 links were identified as abnormal (see file change_abnormal.txt). The changes made to markup appear in files change_01.txt and change_02.txt.
The display program component (basicadjust.php) has been adjusted to provide active links to Harivamsa (see revisions to csl-websanlexicon and csl-apidev above).
The link target is currently https://sanskrit-lexicon-scans.github.io/hariv/.
These links are present for PWG and PW (with literary source abbreviation HARIV.) and for MW (with abbreviation Hariv.).
You can confirm the HARIV links from these dictionary entries:
https://sanskrit-lexicon.uni-koeln.de/simple/pwg/aMSa
Here is the "resolved" Hariv. abnormal cases file, for perusal- PWG Hariv. abnormal cases.txt
My file has <1331. 5185. 10995> at the BAsvant entry, which would be properly resolved as a link.
On second thought, the S. xxx citations could be linked as https://sanskrit-lexicon-scans.github.io/hariv/?xxx -- is this possible to do?
gen. MBh. xv, 463 [C] inf. cyavitum), Mn. vii, 98 ; MBh. iii ;
both are still missing @funderburkjim
400 pages Harivansha-2.txt
@KateRusse Did you observe that 110xx block of verses is repeated at two places-- pp. 345-8 and pp. 376-9? You need to mark them somehow, so that Jim would pay attention to it; otherwise the program may give wrong result, or even hang-up
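One way such repeats could be found automatically is to check every page's verse range against the ranges of earlier pages. A minimal sketch; the function name is hypothetical and the verse numbers in the sample are invented (only the page numbers 345 and 376 come from the comment above):

```python
def find_repeats(rows):
    """Return (page, earlier_page) pairs where a page's range overlaps an earlier page."""
    seen = []
    repeats = []
    for page, start, end in rows:
        for earlier_page, s0, e0 in seen:
            if start <= e0 and s0 <= end:  # the two ranges intersect
                repeats.append((page, earlier_page))
                break
        seen.append((page, start, end))
    return repeats

# Invented verse numbers, for illustration only.
rows = [(345, 11000, 11029), (346, 11030, 11059), (376, 11000, 11028)]
print(find_repeats(rows))  # [(376, 345)]
```

How the display should choose between two pages carrying the same verse number is a separate decision that this check does not make.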
Since the Mahabharata Calcutta edition contains the HarivaMSa as a 19th parvan, it seems likely that PWG links to HarivaMsa can be resolved in a manner similar to that used for references to the first 18 parva (#48).
The last page of mbhcalc for 18th parvan is 443. Page 444 is a blank page.
The pages of mbhcalc devoted to Harivamsha are from internal page number 445 thru internal page number 1007.
The