sanskrit-lexicon / PWG

Boehtlingk und Roth Sanskrit Wörterbuch, 7 Bände Petersburg 1855-1875

Ramayana link markup in pwg #57

Open funderburkjim opened 2 years ago

funderburkjim commented 2 years ago

This issue describes work done to improve, I think, the markup of Ramayana references in PWG.

funderburkjim commented 2 years ago

Files related to this work are in the ramayana0 directory.

This table summarizes the changes made:

 OLD     NEW    category      
698067  714437  ALL           As of 2022-06-22
 36524   32387  NUMBER        ls starts with number
 02170   02161  UNKNOWN       ls is unknown

                Ramayana 
                abbrev        Current tooltip
 23287   37595  R.            RĀMĀYAṆA. Ohne eine nähere Angabe ist be
 04023   07076  R. GORR.      RĀMĀYAṆA, translation by Gaspare Gorresi
 00343   03287  GORR.         GORRESIO.
 00217   00328  SCHL.         ?  
 00194   00210  R. SCHL.      RĀMĀYAṆA. ? [Cologne addition]
 00269   00310  R. ed. Bomb.  RĀMĀYAṆA. ? [Cologne addition]

The OLD/NEW columns show number of instances before/after the markup changes.

As the table shows, several textual variations refer to various editions of Ramayana. The CDSL work has link targets for the Gorresio and Schlegel versions. Currently we do not have pdfs for the edition referred to as 'R. ed. Bomb.'.

The predominant abbreviation (just a simple 'R.') was found to be used for both the Gorresio and Schlegel versions. That is, a reference of form <ls>R. [1-2], y, z</ls> implies Schlegel edition, while <ls>R. [3-7], y, z,</ls> implies Gorresio edition. The CDSL display logic takes this observation into account.

In addition to the summary above, there are files showing all the links for each entry. There is one file for each of the 6 abbreviations. For instance, lsextract_v1_rgorr.txt shows all the 'R. GORR.' references.

Changes to the markup were made in multiple steps. The files change_(123).txt provide the changes, but these are probably not of much individual interest.

funderburkjim commented 2 years ago

'abnormal' references

The typical normal reference has three numbers after the abbreviation, such as <ls>R. 1, 2, 3.</ls> with 2 commas and a period (optional), and with a space between the numbers. When the printed reference is a sequence of references, with the first or first and second numbers implied, then the form of markup is <ls n="R. 1,">2, 3.</ls> or <ls n="R. 1, 2,">3.</ls> (so the implied numbers are part of the markup).
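As an illustration of the markup forms just described, here is a minimal Python sketch that rebuilds the full reference from an ls element (prefixing the implied numbers held in the n attribute) and tests it against the 'normal' three-number pattern. The names `full_references`, `LS_TAG` and `NORMAL` are hypothetical and are not taken from the CDSL code.

```python
import re

# <ls>...</ls> or <ls n="...">...</ls>; the n attribute holds the implied leading part.
LS_TAG = re.compile(r'<ls(?: n="([^"]*)")?>(.*?)</ls>')

# Abbreviation (no digits), then three numbers separated by ", ",
# with an optional final period.
NORMAL = re.compile(r'^(\D+?) (\d+), (\d+), (\d+)\.?$')

def full_references(text):
    """Yield (full_reference, has_three_numbers) for each ls element in text."""
    for implied, body in LS_TAG.findall(text):
        # Prefix the implied numbers, if any, to the printed part.
        full = f"{implied} {body}" if implied else body
        yield full, NORMAL.match(full) is not None

if __name__ == "__main__":
    sample = '<ls>R. 1, 2, 3.</ls> <ls n="R. 1,">2, 3.</ls> <ls n="R. 1, 2,">3.</ls>'
    for ref, ok in full_references(sample):
        print(ref, ok)
```

All three sample elements expand to the same full reference, R. 1, 2, 3., which matches the three-number pattern.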

But occasionally, there are references in the printed text that do not follow this 'normal' pattern. In the lsextract_v1_X.txt files, these are flagged with the '(ABNORMAL)' notation. All the ABNORMAL instances are extracted into the lsextract_abnormal.txt file. There are 98 instances.

funderburkjim commented 2 years ago

4 number abnormals

There is one other class of abnormal references, comprising about 200 cases. These references use a sequence of 4 numbers.
182 matches for "R. [0-9]+, [0-9]+, [0-9]+, [0-9]+" in buffer: lsextract_v1_r.txt and the first number is '7'.

For instance, the first such occurs under headword aBivikrama. <ls>R. 7, 59, 3, 21.</ls>

Note on display: The basicadjust.php display logic currently ignores the 4th number and generates a link from the first 3 numbers, e.g. to R. 7, 59, 3 (gorresio).
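The behaviour described in this note (use the first three numbers, ignore the fourth) could be sketched in Python as follows. This is only an illustration under that reading; it is not the code of basicadjust.php, and the function name `link_numbers` is hypothetical.

```python
import re

def link_numbers(ref):
    """Split the numbers of a reference into (numbers used for the link, ignored numbers)."""
    nums = [int(n) for n in re.findall(r'\d+', ref)]
    # Only the first three numbers (kanda, sarga, verse) drive the link.
    return tuple(nums[:3]), tuple(nums[3:])

if __name__ == "__main__":
    used, ignored = link_numbers("R. 7, 59, 3, 21.")
    print(used, ignored)
```

Returning the ignored numbers separately makes it easy to list every reference where information is being dropped.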

These 4-number references are a mystery -- what edition is Bohtlingk referencing?

funderburkjim commented 2 years ago

the no-number instances.

There are also instances where the markup shows no numbers at all, e.g. 134 matches for "<ls>R\.</ls>" in buffer: lsextract_v1_r.txt. These are not flagged as abnormal.

Andhrabharati commented 2 years ago

04023 07076 R. GORR. RĀMĀYAṆA, translation by Gaspare Gorresi
00343 03287 GORR. GORRESIO.
00217 00328 SCHL. ?

@funderburkjim,

if you're willing to make the tool-tips simpler and more meaningful (as in MW), I would suggest making these as

R. GORR. > [Rāmāyaṇa, Gorresio edition, 1843--1867] 4012 occurrences
GORR. > [Gorresio ed. of Rāmāyaṇa, 1843--1867] 1733 occurrences
R. SCHL. > [Rāmāyaṇa, Schlegel edition, 1829--1838] 201 occurrences
SCHL. > [Schlegel ed. of Rāmāyaṇa, 1829--1838] 198 occurrences
R. ed. Bomb. > [Rāmāyaṇa, Bombay ed., 1859-1864] 151 occurrences
ed. Bomb. > [Bombay ed. of Rāmāyaṇa, 1859-1864] apparently many hundreds of occurrences (mostly used to provide variant forms to Gorr. & Schl. texts)!

अग्निदायक (अग्नि + दायक) = अग्निद [R. Gorr. 2, 79, 19.] (Schl. 75, 32. °दापक).

Similarly for all other works referred in the PWG & pwk; giving the full text matter as in the "Erklärung and Abkürzungen" pages is a bit too much to read!

funderburkjim commented 2 years ago

next link target

lsextract_pwg.txt shows the latest counts of all current link markup in pwg.txt.

We have developed link targets for

66824   MBH.    MAHĀBHĀRATA, ed. Calc. (GILD. Bibl. 93).
55640   ṚV. ṚGVEDA. Es wird nach Maṇḍala, Sūkta und 
37595   R.  RĀMĀYAṆA. Ohne eine nähere Angabe ist be
24914   P.  PĀṆINI'S acht Bücher grammatischer Regel
16150   AV. ATHARVAVEDASAM̃HITĀ, herausg. von R. ROT
15636   HARIV.  HARIVAṂŚA im 4ten Bande des MBH.
07310   Spr. (II)   Indische Sprüche. Sanskrit und Deutsch. 
07076   R. GORR.    RĀMĀYAṆA, translation by Gaspare Gorresi
03287   GORR.   GORRESIO.
00328   SCHL.   ?  
00210   R. SCHL.    RĀMĀYAṆA. ? [Cologne addition]

What should be next for link target? Based on frequency:

19039   BHĀG. P.    BHĀGAVATAPURĀṆA, nach Anführungen im VP.
16031   H.  HEMACANDRA'S ABHIDHĀNACINTĀMAṆI, ein sys
14915   KATHĀS. KATHĀSARITSĀGARA, ed. BROCKHAUS (GILD. B
14467   AK. AMARAKOṢA nach der Ausgabe von COLEBROOK

Based on target availability:

02984   HIT.    HITOPADEŚA, ed. SCHLEGEL und LASSEN (GIL
   I have a digitization of Kale's version of hitopadesha
   https://github.com/funderburkjim/hitopadesha-kale/blob/master/hitopadesha-slp1/hitokale_slp1_p.txt
02141   BHAG.   BHAGAVADGĪTĀ, Ausg. von SCHLEGEL (GILD. 
   Probably sanskrit-documents site has a version.  
02360   DHĀTUP. DHĀTUPĀṬHA in WESTERGAARD'S Radices (GIL
   We have pdfs of Westergaard, which are used in mw display. 

No investigations have yet been done on how the numbering in these versions relates to the numbering in pwg.txt.

funderburkjim commented 2 years ago

tooltip adjustment

Let's use the simplified tooltips as suggested above, with one refinement.

The 'ed. Bomb.' item can sometimes refer to Mahabharata: refer mahAmeGa <ls>MBH. 7, 1899.</ls> {#meGavega#} ed. Bomb.

Suggest tooltip to mention both:
ed. Bomb. > Bombay ed. of Rāmāyaṇa, 1859-1864 OR of Mahābhārata, 18xx-18yy

@Andhrabharati Do you have the dates xx,yy?

Andhrabharati commented 2 years ago

Of course, quite many works were published in Bombay, apart from R. and MBh.

One has to look at the context to know the actual book title being referred to. That's why I did not give exact count for {R.} ed. Bomb. above!!

Yes, I do know the dates mentioned above.

funderburkjim commented 2 years ago

Well, please share those dates so tooltip may be written.

Andhrabharati commented 2 years ago

R. ed. Bomb. 1859--1864
MBh. ed. Bomb. 1863--1877

Andhrabharati commented 2 years ago

Let's use the simplified tooltips as suggested above

Glad to see my suggestion being considered so faassst. And, would you be making all the <ls> expansions thus?

gasyoun commented 2 years ago

We have developed link targets for

One of top five tasks done in such a short timeframe all because of Jim's hard labour.

No investigations yet done on how the numbering in these versions relates to the numbering in pwg.txt.

Even so, I would like you to go the target-availability way.

Kale's edition will not be of much help for HITOPADEŚA, ed. SCHLEGEL und LASSEN, but still. WESTERGAARD'S Radices should be the easiest one, and so the first to be done? BHAGAVADGĪTĀ, Ausg. von SCHLEGEL should be without issues, as it has no major editions to be mixed with.

Only after that would I look at the frequency-based list

14467 AK. AMARAKOṢA nach der Ausgabe von COLEBROOK

AMARAKOṢA is the most commented Sanskrit book of all times as per NCC (= Aufrecht 2.0). I've got a reprint of the Colebrook book at my table and do not see any big issues; we can always count on @Andhrabharati's love for the greatest Buddhist lexicographer and HEMACANDRA'S ABHIDHĀNACINTĀMAṆI.

As per

14915 KATHĀS. KATHĀSARITSĀGARA, ed. BROCKHAUS

I'm planning to add a Russian translation of it to https://samskrtam.ru/parallel-corpus/, so linking it to Brockhaus would be a help for me, so I could get the 5th gem in my collection. By the way, @funderburkjim we've lately added Russian comments (in addition to the translations you integrated before) for both RV (https://samskrtam.ru/parallel-corpus/01_rigveda.html#chapter_1) and AV (https://samskrtam.ru/parallel-corpus/01_atharvaveda.html#chapter_1).

19039 BHĀG. P. BHĀGAVATAPURĀṆA, nach Anführungen im VP.

Is the biggest of them all. @Andhrabharati any clue what to do and how to deal with it?

Andhrabharati commented 2 years ago

@gasyoun ,

you ask for something (at the spur of a moment, mostly!!), and then lose track of it soon (for unknown reasons)--

19039 BHĀG. P. BHĀGAVATAPURĀṆA, nach Anführungen im VP.

Is the biggest of them all. @Andhrabharati any clue what to do and how to deal with it?

See my response at https://github.com/sanskrit-lexicon/PWG/issues/51#issuecomment-1038628589

Andhrabharati commented 2 years ago

@gasyoun

As per

14915 KATHĀS. KATHĀSARITSĀGARA, ed. BROCKHAUS

I'm planning to add a Russian translation of it to https://samskrtam.ru/parallel-corpus/, so linking it to Brockhaus would be a help for me, so I could get the 5th gem in my collection.

You were to get the index made for the Brockhaus ed., but there seems to be no subsequent update (progress) at all--
https://github.com/sanskrit-lexicon/PWG/issues/51#issuecomment-1039977044
https://github.com/sanskrit-lexicon/PWG/issues/51#issuecomment-1040768757
https://github.com/sanskrit-lexicon/PWG/issues/51#issuecomment-1042719692

Andhrabharati commented 2 years ago

@funderburkjim

You may see my further posts (and consider doing)--
https://github.com/sanskrit-lexicon/PWG/issues/51#issuecomment-1039980178
https://github.com/sanskrit-lexicon/PWG/issues/51#issuecomment-1039989501

Andhrabharati commented 2 years ago

Now, I have an important suggestion to @funderburkjim to consider.

Is it possible to provide addl. links to different sources for the same work here as well, like done for 'roots' in MW (Westergaard and Whitney)?

The works I like to be done under this task (parallel to the present links to text (with translations) provided by @gasyoun) are (1) Rosen's RV (which is the one used in PWG etc.) and Max Müller's RV with Sāyaṇa Comm. (2) Roth & Whitney ed. of AV (which is the one used in PWG etc.), and which at many places has diff. readings as compared to the (presently linked) version provided by @gasyoun https://github.com/sanskrit-lexicon/MWS/issues/121#issuecomment-946027956

and probably (3) the Boethlingk ed. of Pāṇini (with all its appendices), parallel to the presently used ashtadhyayi.com links

Incidentally, this AV (ed. of Ro. & Wh.) reminds me of the 'simplest' task still pending with @funderburkjim for many months now-- https://github.com/sanskrit-lexicon/MWS/issues/121#issuecomment-954076062 https://github.com/sanskrit-lexicon/MWS/issues/121#issuecomment-1111307586

Andhrabharati commented 2 years ago

Now, on the other large (>10k) citation candidates-

20121 ŚKDR. ŚABDAKALPADRUMA The ed. used by PWG etc. is the original Bengali script edition by Rādhākāntadeva, which is different (and concise) at many places as compared to the presently used revised (and enlarged) ed. (in Devanagari script) by Vasu brothers (Varadāprasadavasu & Haricaraṇavasu)

16031 H. HEMACANDRA'S ABHIDHĀNACINTĀMAṆI 14467 AK. AMARAKOṢA nach der Ausgabe von COLEBROOK 12976 MED. MEDINĪKOṢA, ed. Calc.

All the 3 of these are available as good scans, and it is a very quick task to index them (I prefer using these old ed. scans as compared to the digital texts, based on other editions, available with me for over 5-6 years and with Dhaval for 2-3 years)

Andhrabharati commented 2 years ago

For instance, the first such occurs under headword aBivikrama. <ls>R. 7, 59, 3, 21.</ls>

Note on display: The basicadjust.php display logic currently ignores the 4th number and generates a link from the first 3 numbers, e.g. to R. 7, 59, 3 (gorresio).

These 4-number references are a mystery -- what edition is Bohtlingk referencing?

You yourself know the source very well, @funderburkjim!

Didn't you post this elsewhere?-

RĀMĀYAṆA. Das 1ste und 2te Kāṇḍa nach der Ausg. von SCHLEGEL, das 3--6te nach der von GORRESIO, das 7te nach der Bomb. Ausg., wenn nicht ausdrücklich eine andere Ausgabe genannt ist. Eine eingeklammerte Zahl bezieht sich auf ed. Bomb.

(English, adapted from Google Translate) The 1st and 2nd Kāṇḍa are cited according to the edition of SCHLEGEL, the 3rd--6th according to that of GORRESIO, the 7th according to the Bombay edition, unless another edition is explicitly mentioned. A number in parentheses refers to ed. Bomb.

image

It is the Bombay ed. of Rāmāyaṇa, as the screenshot above clearly shows. [The अभिविक्रम word occurs in the 3rd प्रक्षिप्त सर्ग verse 21, after the regular 59th सर्ग in the 7th काण्ड (उत्तरकाण्ड).]

Andhrabharati commented 2 years ago

@funderburkjim

About other 4 level R. numbers--

I presume most of them would be typo errors. One has to look at them individually and resolve.

For example, the first occurrence "<ls>R. 2, 50, 8, 9. 91, 60.</ls>" under the entry इ [ID=9938] should be <ls>R. 2,50,8. 9. 91,60.</ls> (in my style without unwanted extraneous spaces).

image

The word उपेत occurs at R. 2,50,8. (as धान्यधनोपेत), R. 2,50,9. (as उद्यानाम्रवणोपेतान्) and then at R. 2,91,60 (as माल्योपेताः), all from the Schlegel ed.
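The expansion just spelled out (a sequence in which later items omit their leading numbers) can be sketched in Python. This is an illustration only: it assumes items are separated by periods, numbers within an item by commas, and that a short item inherits its missing leading numbers from the previous item. The name `expand_sequence` is hypothetical.

```python
import re

def expand_sequence(ref):
    """Expand e.g. 'R. 2,50,8. 9. 91,60.' into an abbreviation and full 3-number tuples."""
    m = re.match(r'^(\D+?)\s*(\d.*)$', ref)
    if m is None:
        return ref, []          # no numbers at all, e.g. 'R.'
    abbrev, rest = m.groups()
    results, prev = [], []
    # Each item is a run of comma-separated numbers; periods separate items.
    for item in re.findall(r'\d+(?:,\s*\d+)*', rest):
        nums = [int(n) for n in re.findall(r'\d+', item)]
        if len(nums) < 3:
            # Inherit the missing leading numbers from the previous item.
            nums = prev[:3 - len(nums)] + nums
        results.append(tuple(nums))
        prev = nums
    return abbrev, results

if __name__ == "__main__":
    print(expand_sequence("R. 2,50,8. 9. 91,60."))
    print(expand_sequence("R. 2, 50, 8, 9. 91, 60."))
```

With the corrected text the result is the three references named above; with the comma typo the first item comes out as a 4-number tuple, which is how such a case would surface in a check.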

The typo error in this case is a comma instead of a dot between 8 and 9! --------- BTW, there is one unrelated (yet important) observation while I was looking at it--

I've been using your https://sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/pwkvn/03/ link from its inception, and this is the first time that I wanted to click on the page link [Page1-0769].

When I tried to click on the PWG page link, it gave a "Page not found" error (https://sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/pwkvn/03/servepdf.php?dict=pwg&page=Page1-0769).

However, same link at the PWG individual search link (https://www.sanskrit-lexicon.uni-koeln.de/scans/PWGScan/2020/web/webtc2/index.php) did the job properly giving the link (https://www.sanskrit-lexicon.uni-koeln.de/scans/PWGScan/2020/web/webtc/servepdf.php?dict=pwg&page=Page1-0769).

Pl. see why this error popped up at the combined search link.

[The link, in the combined search page, indicates that the page is being searched within the pwkvn folder ONLY, instead of 'considering' the actual "window" where the click took place.]

Andhrabharati commented 2 years ago

SORRY, I was looking at the old version of pwg.txt; just got the latest one from csl-orig/v02, and seen that all the remaining 4-numbered R. links are from the 7th Kāṇḍa only.

From the above work, we can presume that all those are referring to the prakṣipta sarga verses therein.

Andhrabharati commented 2 years ago

I had missed your statement at the beginning itself "and the first number is '7'."; hence wasted some time.

I need to be more vigilant!!

funderburkjim commented 2 years ago

Have now added ls markup for 'ed. Bomb.' and 'ed. Calc.' For instances, there are now

And a few other ls references involved changes. See the history of lsextract_pwg.txt for more details.

funderburkjim commented 2 years ago

@Andhrabharati You show a very clear print of a page from the uttarakanda of 'Bombay edition' of Ramayana.

How can CDSL get pdfs for a link target for Bombay edition, at least for those 'prakzipta sarga' references?

Andhrabharati commented 2 years ago

@funderburkjim There are 14 "<ls>" (i.e. space following the tag), and at two places

<L>50508<pc>4-1126<k1>prAjApatya

image

and <L>81815<pc>6-0097<k1>yA

image

it makes the link 'missing'! [Other 12 places seem to have no side-effect.]

And there are two dangling ">" places, but without any adverse impact.

Andhrabharati commented 2 years ago

How can CDSL get pdfs for a link target for Bombay edition, at least for those 'prakzipta sarga' references?

@funderburkjim Here are the extracted portions of all the 13 prakzipta sargas in Vol. 7--

after 23, 1-5 of them extract-1.pdf

after 37, 1-5 of them extract-2.pdf

and after 59, 1-3 of them extract-3.pdf

These are from a different publisher, though still from Bombay, at a later date.

You may notice that the intended word 'aBivikrama' is not present in this edition, but instead 'ativikrama' is there. [As you did not respond to my suggestion about "using (having)" the actual sources used in PWG etc. above (for other works), I resorted to giving these pages from a different source than the one mentioned in PWG. Of course, you (or someone else) can get the "actual" Bombay ones as well, if seriously interested, by spending a little time for the task.]

Andhrabharati commented 2 years ago

There are 3 <ls n="?"> places in the whole text.

At line 473400, it is to be the prior to prior <ls> item, i.e. "Ind. St."

At the lines 734614 & 734615, it has to be "Bhāg. P."

1,11,37 [Burnouf ed. of Bhāg. P.] image

10,61,4 [Another source for Bhāg. P.] स्मायावलोकलवदर्शितभावहारिभ्रूमण्डलप्रहितसौरतमन्त्रशौण्डै: । पत्न्‍यस्तु षोडशसहस्रमनङ्गबाणैर्यस्येन्द्रियं विमथितुं करणैर्न शेकु: ॥ ४ ॥

@gasyoun did you notice these citations from Bhāg. P.? (in continuation to https://github.com/sanskrit-lexicon/PWG/issues/57#issuecomment-1165077877 and corroborating the point therein)

Andhrabharati commented 2 years ago

Further on "R. 7":

Gorresio ed. appeared in 1867, so any reference to R. 7 seen in the PWG Tomes 1-4 (1855-1865) cannot be indicating it; there is a chance of this only in the later volumes.

In any case, unless clearly mentioned otherwise, "R. 7" always has to be taken to be from the "Bomb. ed."

[@funderburkjim, hope you would "notice" this post and take corrective action in the linking.]

Andhrabharati commented 2 years ago

  1. 22 of the 98 'Abnormal cases' listed are the R. 7 items, and could be presumed to be 'Normal', if Jim acts for changing the marking as mentioned in the prev. post.

  2. And there are 4 cases of "R. ed. Ser." (I see 5 of them in the file, one not tagged with <ls>!!), which indicate the Serampur ed. of Rāmāyaṇa [all these are taken from the works of Bopp, Benfey and Westergaard, which were published earlier to Gorresio's volumes]; and it is not out of place to mention that these Serampur volumes are supposed to have been the "source" for Gorresio. Incidentally there is a Serampur ed. of MBh. as well. [Calcutta, Serampur and Burdwan were three places, where the book-printing was concentrated in Bengal those days.]

  3. In the line 868415, <ls n="R. SCHL. 2,">1, 26, 1.</ls> <ls n="R. SCHL. 2,">45, 5</ls> (<ls n="GORR. 2,">46, 5</ls> <ls>GORR.</ls>), which has the abnormal case of <ls n="R. SCHL. 2,">1, 26, 1.</ls> has to be <ls n="R. SCHL.">1, 26, 1.</ls> <ls n="R. SCHL. 1,">45, 5</ls> (<ls n="GORR. 1,">46, 5</ls> <ls>GORR.</ls>)

Notice the larger size "1," in this image image

funderburkjim commented 2 years ago

@Andhrabharati Probably you have url for downloading the Bombay edition of Ramayana relevant to PWG. Please share this url.

funderburkjim commented 2 years ago

Implemented several changes per suggestions above. For details, see the notes in ramayana0/readme.txt at temp_pwg_6

funderburkjim commented 2 years ago

Need for bombay edition

There are quite a few volume 7 references to Ramayana. For instance, with the basic 'R.' abbreviation, I find (using lsextract_v1_r.txt)

Using a small sample of those 1461, I tried the Gorresio volume 7, and concluded that the references were not resolved in Gorresio. e.g. under the first one 'aSezas', we find reference <ls>R. 7, 1, 11.</ls>. But at https://sanskrit-lexicon-scans.github.io/ramayanagorr/?7,1,11, we find no usage of 'aSezas' in the 11th verse.

This supports the observation above that the volume 7 references (whether with 3 or 4 numbers) need to be linked elsewhere, presumably to the Bombay edition.

To resolve all the required Bombay edition links, we will need

Andhrabharati commented 2 years ago

Seen that a majority of changes in "PWG: Minor ls markup revisions." are introducing a space between dot and a number "\.[0-9]".

  1. There are nearly 6k of them (5788) still in the data that can globally be replaced, except at 5 places (in metalines) <L>13188.1, <L>51684.1, <L>87082.1, <L>96987.1 and <L>102721.1

  2. Also there are 31777 ",[0-9]" places, where a space between comma and a number can blindly be introduced. There are just 2 places containing "regular numbers" that might be retained with commas, namely {%den 54,675,000sten Theil eines%} and <ls>4,320,000</ls> {%Jahre%}. It may be noted that the second one above is not to be tagged <ls>, being a regular number. [There are more such places that need untagging.]
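A minimal Python sketch of the two replacements in this list follows. It makes two assumptions that are not in the thread: lines starting with `<L>` are treated as metalines and skipped, and both replacements are restricted to ls elements so that ordinary numbers in running text are untouched. The names `add_spaces` and `LS_ELEMENT` are hypothetical.

```python
import re

# An ls element, with or without attributes.
LS_ELEMENT = re.compile(r'<ls\b[^>]*>.*?</ls>')

def add_spaces(line):
    """Insert a space after '.' or ',' when a digit follows, inside ls elements only."""
    if line.startswith('<L>'):
        # Assumption: metalines such as <L>13188.1 must not be changed.
        return line

    def fix(match):
        return re.sub(r'([.,])(?=\d)', r'\1 ', match.group(0))

    return LS_ELEMENT.sub(fix, line)

if __name__ == "__main__":
    print(add_spaces('<ls>R. 2,50,8. 9. 91,60.</ls>'))
    print(add_spaces('{%den 54,675,000sten Theil eines%}'))
```

A tagged regular number such as `<ls>4,320,000</ls>` would still be altered by this sketch, so such places need untagging (or an explicit exception list) before a global run.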

Andhrabharati commented 2 years ago

@Andhrabharati Probably you have url for downloading the Bombay edition of Ramayana relevant to PWG. Please share this url.

I do not like to re-iterate, but here is my response

[As you did not respond to my suggestion about "using (having)" the actual sources used in PWG etc. above (for other works), I resorted to giving these pages from a different source than the one mentioned in PWG. Of course, you (or someone else) can get the "actual" Bombay ones as well, if seriously interested, by spending a little time for the task.]

Of course, if you give your opinion about my above suggestion, I would surely extend my support.

Andhrabharati commented 2 years ago

Need for bombay edition

To resolve all the required Bombay edition links, we will need

Another reason: all pw citations of R. are from the Bomb. ed. only.

As I had already mentioned elsewhere, the Bomb. ed. of any book came out with such a quality (or value addition, in terms of formatting or commentaries) that got them first place as a reference, dethroning all other earlier editions, be it for MBh. or R. or Pancat. or .... (the list goes on).

Andhrabharati commented 2 years ago

Here is the list of the earliest prints of Ramayana, for whatever worth it has-

image

[Probably @gasyoun might have some use of this]

1859 and 1864 are by Gujarati Press and 1888 is by NirnayaSagar Press (both in Bombay). [Both these went into multiple reprints/revisions later on].

Andhrabharati commented 2 years ago

Here is another list, with more details-

image

funderburkjim commented 2 years ago

@Andhrabharati If we can identify the 'actual' source of a reference in PWG, then we should do so. As we know, the 'R. x,y,z' abbreviation in pwg is normally used to refer to Schlegel edition when x is 1 or 2, and Gorresio edition when x is 3-6, and Bombay edition when x is 7. In the display program (basicadjust.php), I have implemented this for x=1 to 6. Currently, x=7 also links to Gorresio, which is incorrect and should instead link to Bombay edition. Unfortunately, we do not yet have a link target for Bombay edition. Your help with the Bombay edition used by PWG would be appreciated. If you do supply links to good downloads of Bombay edition pdfs, I hope you will also provide the necessary 'verse-page' maps.
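The rule stated here for a bare 'R. x,y,z' reference can be written as a short Python sketch. It is an illustration, not the code of basicadjust.php; the function name `ramayana_edition` and the edition labels are hypothetical.

```python
def ramayana_edition(kanda):
    """Return the edition implied by the first number of a bare 'R. x,y,z' reference."""
    if kanda in (1, 2):
        return "Schlegel"
    if 3 <= kanda <= 6:
        return "Gorresio"
    if kanda == 7:
        return "Bombay"     # no link target available yet, per the discussion above
    raise ValueError(f"unexpected kanda number: {kanda}")

if __name__ == "__main__":
    for x in range(1, 8):
        print(x, ramayana_edition(x))
```

Raising an error for any other first number keeps misparsed references from being silently linked to the wrong edition.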

funderburkjim commented 2 years ago

introducing a space between period and digit

Carried out this for the remaining cases.

Did not similarly introduce a space between comma and digit. Although I have done this in the cases of link targets (such as Ramayana), I'm not sure of this choice.

gasyoun commented 2 years ago

you ask for something (at the spur of a moment, mostly!!), and then lose track of it soon (for unknown reasons)

It's good to see that there are more systematic members of the team around. That's a good catch!

You were to get the index made for the Brockhaus ed., but there seems to be no subsequent update (progress) at all

Exactly, and for the upcoming year I hardly believe I will get there.

Is it possible to provide addl. links to different sources for the same work here as well, like done for 'roots' in MW (Westergaard and Whitney)?

A nice idea, but is the time fit?

(1) Rosen's RV (which is the one used in PWG etc.) and Max Müller's RV with Sāyaṇa Comm.

Rosen's RV is a bit outdated, we could say. Sāyaṇa Comm. remain non-digitized?

(2) Roth & Whitney ed. of AV (which is the one used in PWG etc.), and which at many places has diff. readings as compared to the (presently linked) version provided by @gasyoun

I'm thinking of adding it to my version. At least I do not see anything for Jim here right now. But I agree Whitney's AV is a good one.

(3) the Boethlingk ed. of Pāṇini (with all its appendices), parallel to the presently used ashtadhyayi.com links

Is my reference book on table, there are good scans. But linking to it means we need to make a table of contents first.

All the 3 of these are available as good scans, and it is a very quick task to index them (I prefer using these old ed. scans as compared to the digital texts, based on other editions, available with me for over 5-6 years and with Dhaval for 2-3 years)

Agree, but only if you are there to take them up. I agree that old scans are valued higher than not doublechecked digital editions.

The typo error in this case is a comma instead of a dot between 8 and 9!

How do you see them so quickly and easily?

From the above work, we can presume that all those are referring to the prakṣipta sarga verses therein.

So should not remain an issue any longer for us?

can get the "actual" Bombay ones as well, if seriously interested, by spending a little time for the task.

You mean looking at links given by you earlier?

did you notice these citations from Bhāg. P.? (in continuation to #57 (comment) and corroborating the point therein)

No, I did not

Calcutta, Serampur and Burdwan were three places, where the book-printing was concentrated in Bengal those days

Nothing I knew of Burdwan before. Can you tell the story behind it?

[There are more such places that need untagging.]

I'm a fan of how you work.

Bomb. ed. of any book came out with such a quality (or value addition, in terms of formatting or commentaries) that got them first place as a reference, dethroning all other earlier editions

Interesting thought. So the Calcutta standard never got so high?

[Probably @gasyoun might have some use of this]

Lovely, lovely, lovely

Did not similarly introduce a space between comma and digit. Although I have done this in the cases of link targets (such as Ramayana), I'm not sure of this choice.

Why still not sure?

If you do supply links to good downloads of Bombay edition pdfs, I hope you will also provide the necessary 'verse-page' maps.

Hope @Andhrabharati sees how badly we need him and value his hard labour

Andhrabharati commented 2 years ago

Exactly, and for the upcoming year I hardly believe I will get there.

We all know that you won't be doing the work, but delegate to others. This work is supposed to be done by KateRusse (through you); is she also occupied or is it that the present situation in Russia is making things difficult there?

A nice idea, but is the time fit?

If there is a will, there is a way. And I suggested this to Jim, as he is at the linking task/spree now.

Rosen's RV is a bit outdated,

It is not the question of being outdated or current still. It is what is used in the citations here!

But linking to it means we need to make a table of contents first.

Any linking to scanned book(s) needs this task to be done first.

I agree that old scans are valued higher than not doublechecked digital editions.

I see not many errors in my texts, and guess Dhaval also agrees to it. Even otherwise, my version and Dhaval's could be compared together once, to weed out any errors remaining in our digital texts. [He did so for some works already!] But the main reason for my argument is the sources used are different (and have v.l.s at places)

How do you see them so quickly and easily?

Again, if there is a WILL, one would find a way out to get/do what is needed.

So should not remain an issue any longer for us?

Jim has already concluded the point.

You mean looking at links given by you earlier?

I did not give any link for this, but just passed on the clues (!!).

Nothing I knew of Burdwan before. Can you tell the story behind it?

Just giving a small piece of info, though much could be said on this-- Have a look at my earlier post reg. Vardhaman ed. of MBh. in pwk repo

Andhrabharati commented 2 years ago

Agree, but only if you are there to take them up.

Let Jim decide his priority, and give the sequence; for me all these are very minute tasks, that could be done practically in 'no time'.

Andhrabharati commented 2 years ago

@funderburkjim

Did not similarly introduce a space between comma and digit. Although I have done this in the cases of link targets (such as Ramayana), I'm not sure of this choice.

I interpreted this as your concern that it might impact the places where 'not to be done'.

I've checked the latest file again now, and seen that (you had corrected the two places I mentioned above and) all the remaining occurrences are within <ls ... /ls> tags, either already linked or to-be-linked yet. So, I see no harm doing this correction at once. [Now the count is 31738, instead of the earlier 31777 count.]

You're the final judge, of course.

funderburkjim commented 2 years ago

Did not similarly introduce a space between comma and digit

The reason is aesthetic: (a) "R. 1, 2, 3. 4, 5, 6. 7, 8, 9." vs. (b) "R. 1,2,3. 4,5,6. 7,8,9.". Current practice is (a). But maybe I should have done (b) for ease of reading. Obviously a relatively minor point. Either way is probably acceptable.
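For comparison, converting style (a) to style (b) amounts to removing the space after a comma that sits between two digits. A minimal Python sketch follows; the function name `to_style_b` is hypothetical.

```python
import re

def to_style_b(ref):
    """Turn 'R. 1, 2, 3. 4, 5, 6.' into 'R. 1,2,3. 4,5,6.' (spaces kept after periods)."""
    # Lookarounds leave the digits themselves out of the match,
    # so consecutive commas are all handled in a single pass.
    return re.sub(r'(?<=\d), (?=\d)', ',', ref)

if __name__ == "__main__":
    print(to_style_b("R. 1, 2, 3. 4, 5, 6. 7, 8, 9."))
```

Since either direction is a one-line substitution, the choice between (a) and (b) can be deferred without much cost, as long as the data is consistent before the conversion.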

Andhrabharati commented 2 years ago

Pl. recall that you had done so [(b) style] in pwk earlier, looking at my file.

So, why not do the same in PWG as well?

Andhrabharati commented 2 years ago

Sorry, it is a hasty reply from me, without looking at my prev. post.

What I had mentioned earlier is "exactly opposite", to introduce the spaces where absent, so that the full text will be of same '(a) style'.

As compared to those 31k places, over 500k places have the space.

Either way, the data should have a consistent single style throughout.

Andhrabharati commented 2 years ago

The very reason for my raising this point is that those 31k places would not get the 'link' status (presently those not being under RV., AV., MBh., R., ... tags)!! [Of course, you'd be resolving them when they are linked, but it would take a long time for all those <ls> works to be finished.]

Andhrabharati commented 2 years ago

We've one such issue noticed in MW already-

https://github.com/sanskrit-lexicon/MWS/issues/130#issuecomment-1159573548

and some variant case at https://github.com/sanskrit-lexicon/PWG/issues/23#issuecomment-1164775531

Andhrabharati commented 2 years ago

Also another case to be handled in PWG yet-

https://github.com/sanskrit-lexicon/MWS/issues/130#issuecomment-1063022767

gasyoun commented 2 years ago

(b) for ease of reading.

@funderburkjim agree b is easier to read.

gasyoun commented 1 year ago

@funderburkjim Indische Sprüche : sanskrit und deutsch All 3 volumes at great quality https://www.deutsche-digitale-bibliothek.de/item/OQKJRI3ZMXUFQ4U33MADZBIRDIP5TF62?isThumbnailFiltered=true&query=Sanskrit&rows=20&offset=40&viewType=list&firstHit=OVCLS6W7ANAP2F7EU7HLE6I7LMV3QAAJ&lastHit=lasthit&hitNumber=47 - @Andhrabharati seen it?