Open gasyoun opened 3 years ago
Yes, I do understand, Marcis.
I was looking at PW few days back, incidentally.
What kind of help do you need?
You ask others for a list; now you have to give a specific list!!
@Andhrabharati The '?' in pwgbib_input.txt file above are where help is needed. We don't know what the abbreviation tooltips should be, as we have not found the abbreviations among those in PWG printed list.
Let me see it, once I am done with Apte present session.
@funderburkjim would you pl. make a Devanagari/unicode version of the pw.txt & pwg.txt files, for me to start looking into the data and go further?
And is it PWG (1855) or PW (1879) that you want me to look at? [Guess the latter one should be having some more addl. biblio entries than the former one!]
PWG (1855)
First the older one.
@funderburkjim My first look is at the last 100pp. in the 7th volume of PWK-
General - Index zu den sechs Nachträgen und letzte Nachträge.
(Eine arabische Ziffer bezeichnet den Band, in welchem das nachgetragene Wort steht, eine voranstehende römische, dass das Wort auch in dem und dem Haupttheile des Werkes sich befindet.)
This clearly says it includes addenda, each either a new entry or an addition/revision to an existing entry.
Just looked for a few of those, and they are MISSING in the digitisation.
I understand this work was taken up in 2003 by Thomas, and wonder if the Addenda content is in some other file or missing altogether.
Is this issue noted earlier by some (or any)one, in the last 18 years of its life?
Now looked at the PWG vol.7.
The last page has-
The very last entry हेवाकिन् is missed in the typed file.
Incidentally look at the prefixed verbal words marking with the preceding dash (-) in the above image [under हन् and हृ (marked हर् in the print)].
As it is clearly marked in a separate line, it is duly typed as such in digitisation.
And this is exactly how I did in my splitting (even without looking at PW books till now) in AP57, AP90 (now sent to Github for Cologne), and so in others like MW (lying privately) and CAP/CAE (already there at Andhrabharati site since the beginning, 2015).
A screenshot given below.
The same entry at Cologne is displayed as-
[Why are the commas missed in the Cologne display in the Devanagari text in the very first line?]
Just opened the pwgbib_input file, and am trying to understand its structure.
And the last entry in it,
c.2116 Антарабида Антарабида Антарабида = ? [Cologne Addition]
does not seem appropriate in the bibliography of persons and works.
It appears to be a place name from the context
[btw, why are N. pr. & Vgl. remaining without tooltip expansion here? are the abbr.s not yet handled in this work?]
and also as seen in the cited ref. (p. 55 of Wassiljew), the text of which is given below.
Incidentally the (अन्तरर्वेध) in the matter here is an obvious printo for (अन्तरवेध).
Having the text in Devanagari makes it quick, for me to see the entries this way; it is always convenient doing such works on local computer with text file and pdf (or the book) opened side-by-side, than through browser (in my opinion). For this very reason, I have asked for the converted file yesterday.
Looked at the pwg-meta2 file, and my comments are-
It has no mention of the m̃ and M̃ characters that were discussed in one of the issues in the forum (to be without unicode equivalents initially and then seemingly finding them as composite characters). [I had posted a message myself in that issue yesterday, with their unicode numbers]
<VN>...</VN>
Only 3 instances. Usage not known.
`<VN>` stands for Verbesserungen Nachträge, and all the three places having it are tallied with the material at the end of vol.1 (after p.1142) running into 3pp. (6 columns).
And many (rather, almost all) others in those VN pages (that I randomly checked) are missing in the text.
Only Thomas should be able (if at all!!) to give the reason; probably they might've thought of stopping incorporating them in the main text, just like in the PWK case where they did not touch the Addenda (making their life easy; no need for putting the brains in locating the intended text placement/correction)- but just digitized the material in SCH (who seems to have integrated the addenda into PWK main text) in another project.
[Guess this speaks out the way I work (when I work), looking all around without restrictions/limitations.]
Continued further on the VN material and found that-
There could be some missing entries in the whole of the typed text (in the main or VN pages); but that is quite understandable. [human error!!]
Looked inside the pwgbib_input file, and tried to separate ‘c-’ marked entries, which are probably culled out from the text matter in the book, from the listed ones.
Seen that the first ‘c-’ occurred within the Vol. 4 list, where the actual entry is missed.
Then probed for any missing numbers in the pwgbib_input file and found the missings listed in the volumes-
1.102 gaṇa S.
2.037 Ratnam.
4.004 Halāy.
4.015 Siddhāntaśir. (ā marked in the book as ‘a’)
[This missing could've happened while converting the original German file from Thomas into IAST etc. later.]
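The probe for missing numbers can be sketched roughly as below. This is only an illustration: it assumes the listed entries carry ids of the form `volume.number` with a three-digit number starting at 001 (as in `1.102` or `4.015`), and it skips anything else, such as the `c.` entries. The function name `missing_ids` is hypothetical, not part of any Cologne script.

```python
import re
from collections import defaultdict

# Assumed id shape for listed entries: "<volume>.<three-digit number>", e.g. "4.015".
LISTED_ID = re.compile(r"(\d+)\.(\d{3})")

def missing_ids(ids):
    """Return the ids absent from each volume's 1..max run of numbers."""
    seen = defaultdict(set)
    for entry_id in ids:
        match = LISTED_ID.fullmatch(entry_id)
        if match is None:
            continue  # skip "c." additions and any other id shape
        seen[int(match.group(1))].add(int(match.group(2)))
    gaps = []
    for volume in sorted(seen):
        numbers = seen[volume]
        for number in range(1, max(numbers) + 1):
            if number not in numbers:
                gaps.append(f"{volume}.{number:03d}")
    return gaps

if __name__ == "__main__":
    print(missing_ids(["1.001", "1.003", "c.2116", "2.002"]))
```

A gap at the very end of a volume's list cannot be detected this way, so the result would still need a check against the print pages.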
Need to see if any listed item is missed in the original file itself, looking at the print pages all over.
I wish @funderburkjim could spend some time generating a file (programmatically) with some extra details, in a different format to search in the typed text for the bib. entries.
If he is willing, I can give the format I want to have.
[The input file has IAST names and the pwg.txt matter is all in German spellings as in the book; so it is difficult to correlate both of them manually.]
-----------------
Interesting that just two works contributed one-third (168) of the listed 560 entries [including the a-z tagged ones within the 1-xxx to 4-xxx block, which probably are added by Cologne team]; they are Gildemeister (102) and ŚKDR (66).
So posting the GILD. here, as it might be of some use (to find some more from it).
Bibliothecae Sanskritae sive recensus librorum Sanskritorum - Gildemeister.pdf
It appears to be a place name from the context
Agree that the Russian name of a kingdom is not abbreviation.
digitized the material in SCH (who seems to have integrated the addenda into PWK main text) in another project.
We do not know if all. Guess - no.
If he is willing, I can give the format I want to have.
@Andhrabharati please show it here.
[after Vol. 7 main text; no integration done]
Means only a small fraction of addenda is typed in PWG.
No, the majority of it is typed; just about 6-7 pages in the first 3 vol.s are missed.
You didn't properly go through my counts reported, it seems.
It is PWK that has all the 100 pp. missed, not PWG.
Just thought I should look at all PWK volumes for the VN content, and here is the summary-
For Vol. 1 1281-1 to 1299-3 (19 pp)
For Vol. 1-2 2285-1 to 2301-3 (16 pp)
For Vol. 1-3 3246-2 to 3265-3 (20 pp.) [Cologne (bookmarked) scan copy ended with just 3256-3; are the further pages missing with them?]
For Vol. 1-4 4290-3 to 4302-3 (11 pp.)
For Vol. 1-5 5240-2 to 5264-3 (25 pp.)
For Vol. 1-6 6292-3 to 6306-3 (11 pp.)
For Vol. 1-7 7289-1 to 7390-3 (101 pp.) [As these pages contain General Index as well, we can take some 15-20 pp. to have been missed here for VN entries]
Glad that all these pages are consistently missing in the Cologne text. (No partiality/inconsistency as in PWG!!)
[The input file has IAST names and the pwg.txt matter is all in German spellings as in the book; so it is difficult to correlate both of them manually.]
Looked again in the pwg.txt; it is with IAST spellings only, not with Germanish spellings. So I can start looking at the bibl. entries, as asked by @gasyoun and @funderburkjim. [No need for getting additional details through Jim, as I mentioned yesterday; I can handle the required changes myself.]
Will start with checking the content of first part, as noticed few typo/conversion errors; and also to identify if all the listed ones in the print are covered.
A majority of the second part (with 'c-' entries) should mostly be correlating with the first part, and then to get the details of the newer entries.
Meanwhile, I look for Jim's support in giving me the Devanagari files.
Just looked at the Schmidt book as well.
As I presumed earlier, Schmidt did not integrate the VN entries into the PWK body, but just put them all into a single volume and also added more material gathered by him and others over the years.
So as this is supposedly available as a separate consolidated (and updated) text, integrating it into the main text of PWK could be considered.
This is exactly what Schmidt has opined-
Aber mögen sie doch diese Zugabe einfach als eine Materialsammlung betrachten, die vorläufig niemand zu benutzen braucht, wenn Ärgernis daran genommen wird: erst der Herausgeber eines neuen pw mag es tun. Wenn er nur recht bald käme!
[Probably a digital edition of integrated pw(k) is possible to do, if not a print version.]
Wonder why no one has attempted to do so (the integration work); not even the Halle Univ. people who are on a NWS project, who say it is a cumulative addendum; it is by no means a cumulative (consolidated) one, but just one more addendum after Schmidt!
Am I wrong in thinking of integrating the addenda/annexures into the main text(s), like the way I started with MW data? [But that's what is the intent of having the additions and corrections being issued, to be read appropriately along with the main text]
What else a lonely person could say than this?-
Schließlich noch eine Bemerkung, die zu unterlassen gegen meine innerste Überzeugung wäre: ich habe mir eine neue Ausgabe des pw ganz anders vorgestellt! Es wäre an der Zeit gewesen, Böhtlingks Arbeit mit Hilfe des gesamten, seit 1889 erschienenen neuen Materials zu ergänzen. Wie reich die Ausbeute gewesen wäre, habe ich besonders an den von mir gelesenen bhāṇa's gesehen, von denen manche trotz ihres geringen Umfangs mehrere hundert Nova ergeben haben. Aber natürlich müßten sich, wenn gründliche Arbeit geleistet werden soll, alle Indologen der Welt zusammenfinden! Daß dies bald geschehen möchte, ist mein aufrichtiger Wunsch.
But even after about a century, nothing of the sort is happening.
Pune Dictionary project started off with an ambitious (but impossible) plan; but it won't be crossing even the vowels part any time soon (for over a score of generations). [Day-by-day, interest of people and institutions is shifting away from the literary activities.]
Mußte ich doch auch bei ganz geläufigen Vokabeln immer erst im pw nachschlagen - bisweilen an vier Stellen!
[Schmidt is saying just 4 places; I see that Vol. 1 has addenda in all 7 vol.s (as listed above), so one might need to see all 7 places for some word!!]
This is overcome by integrating the parts/pieces (one-time effort by one or two persons), thus saving good amount of scholars' time, who can spend it in more beneficial operations. [But are there any scholars whose time is to be saved now? All of them spend the time in activities for monetary or administrative reasons, not for academic purposes. Gone are the days of scholarly affairs!]
No, the majority of it is typed; just about 6-7 pages in the first 3 vol.s are missed.
Found VN pages in Vol. 4 (2 pp.) and Vol. 6 (1 p.) as well, that are not typed. [PWG] Both these are at the beginning of the books, as against the other volumes that have VN pages at the end.
The 2nd VN page in Vol. 4 also has bibl. data (½ p.), so it is effectively 1½ pp. that need typing.
@funderburkjim Can you get these missed pages (about 10 of them) typed through Thomas? This will make PWG set completely typed. [I thought I would do them, but I do not follow the Cologne/Thomas syntax; so it would be in a different style to be along with all other matter.]
Here are the scans of above mentioned pages.
PWG V.1 VN pages.pdf PWG V.2 VN pages.pdf PWG V.3 VN pages.pdf PWG V.4 VN pages.pdf PWG V.6 VN page.pdf
Here are the ending pages in PWK V.3 that are missing in the Cologne bookmarked scan (JIC they are needed).
@funderburkjim I got a discrepancy at the very beginning!
<ls>ṚV.</ls> <ls>PRĀTIŚ. 2, 19. 21</ls>
There are 9 places where the ṚV. and PRĀTIŚ. are separated, whereas they are supposed to go together and listed in the book also as such.
Should such occurrences also to be seen?
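Such split references could be located with a small script along these lines. It is a sketch under assumptions: the tagging is taken to be exactly as quoted above (two adjacent `<ls>` elements separated by whitespace), and the merged form `<ls>ṚV. PRĀTIŚ. ...</ls>` is only one possible way to join them, not an agreed convention. The names `find_split_refs` and `merge_split_refs` are hypothetical.

```python
import re

# Matches "<ls>ṚV.</ls> <ls>PRĀTIŚ. ...</ls>" as quoted in this comment.
SPLIT_RV_PRATIS = re.compile(r"<ls>ṚV\.</ls>\s*<ls>(PRĀTIŚ\.[^<]*)</ls>")

def find_split_refs(lines):
    """Return (line number, matched text) for every split ṚV./PRĀTIŚ. pair."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for match in SPLIT_RV_PRATIS.finditer(line):
            hits.append((line_no, match.group(0)))
    return hits

def merge_split_refs(line):
    """Join each split pair into one <ls> element (assumed target form)."""
    return SPLIT_RV_PRATIS.sub(r"<ls>ṚV. \1</ls>", line)

if __name__ == "__main__":
    sample = "x <ls>ṚV.</ls> <ls>PRĀTIŚ. 2, 19. 21</ls> y"
    print(find_split_refs([sample]))
    print(merge_split_refs(sample))
```

If the file stores Ṛ or Ś as decomposed characters, the pattern would need Unicode normalisation first; that is not handled here.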
No, the majority of it is typed; just about 6-7 pages in the first 3 vol.s are missed.
Understood, thanks for clarification.
It is PWK that has all the 100 pp. missed, not PWG.
Wonder how many ghost-words we have because of not fixing errors.
@gasyoun, you were asking about the Spr. (II) entry in the biblio list sometime earlier in another thread. It is the c- entry that landed in the main list that I mentioned above (lying between 4.014 and 4.016, in place of 4.015)
c.2117 Spr. (II) SPR. (II) Indische Sprüche. Sanskrit und Deutsch. Herausgegeben von Otto Böhtlingk. Zweite vermehrte und verbesserte Auflage. 1870-1873
Wonder why no one has attempted to do so (the integration work)
In 2012 https://www.sanskrit-lexicon.uni-koeln.de/scans/PWScan/disp2/index.php
by no means a cumulative (consolidated) one, but just one more addendum after Schmidt
Seems I have to agree.
Should such occurrences also to be seen?
Looking at the screenshot still not sure what you mean. That there should be not 2, but 1 abbreviation?
Spr. (II)
Have made new scans of both editions lately, see https://vk.com/samskrtamru?w=wall-88831040_11452
Should such occurrences also to be seen?
Looking at the screenshot still not sure what you mean. That there should be not 2, but 1 abbreviation?
I thought I gave full info by putting
<ls>ṚV.</ls> <ls>PRĀTIŚ. 2, 19. 21</ls>
There are 9 places where the ṚV. and PRĀTIŚ. are separated, whereas they are supposed to go together and listed in the book also as such.
after the screenshot.
Do I need to elaborate more like this?
PRĀTIŚĀKHYA zum ṚGVEDA, citirt nach Paṭala und Versen. Hdschr. S. ROTH in der Einl z. NIR. S. XLVII.
became
ṚGVEDA. Es wird nach Maṇḍala, Sūkta und Ṛc citirt. ROSEN zu ṚV. verweist auf die Anmerkungen in: Rigveda-Sanhita, liber primus, sanskritè et latinè; edidit FRIDERICUS ROSEN. London 1838.
with the
PRĀTIŚĀKHYA.
hanging!
Spr. (II)
Have made new scans of both editions lately, see https://vk.com/samskrtamru?w=wall-88831040_11452
Good to know this.
Also every person of some fame did his own Selections (Chrestomathie) those days, and it would be a good project to compile them together.
I started kind of liking this German font, though some glyphs are quite different than the regular (later) fonts, like दु, at which a reader of non-European base (I take Russia also to belong to Europe) might be surprised at first look.
Wonder why no one has attempted to do so (the integration work)
In 2012 https://www.sanskrit-lexicon.uni-koeln.de/scans/PWScan/disp2/index.php
This is good to show them side-by-side, though it is not the kind of integration I meant (incorporating the addition/correction into the main text itself).
But why is this display method discontinued at Cologne now?
@funderburkjim
There are over 9500 `<ls>?` occurrences which need to be properly marked again.
The present position of it is making the understanding erroneous many a times.
In majority of the cases it is to be shifted somewhere after few letters or after the punctuation or braces; or is to be duplicated (split into parts) many times.
The 2012 Cologne style display is at the NWS now, and it is more evenly placed than Cologne's.
Will start with checking the content of first part, as noticed few typo/conversion errors; and also to identify if all the listed ones in the print are covered.
By conversion errors, I meant things like
ŚKDR. to be ŚKDr. [Ś(abda)K(alpa)Dr(uma).]
[In this ŚKDR., the R. is very clearly in small cap.s in print, as compared to other three letters in large cap.s] Only the first letter of the word should be in cap (Title case as it is called), even in the comp. abbr. like this.
By typo errors, I meant things like
CKDR. instead of ÇKDR. thus remaining unconverted to ŚKDR.
Just did some random checks; there are quite many still without the ls tagging.
Also every person of some fame did his own Selections (Chrestomathy) those days, and it would be a good project to compile them together.
By end of year I should have most Western ones scanned.
I started kind of liking this German font, though some glyphs are quite different than the regular (later) fonts, like दु
This is why in 2005 I made a remake of it. But what you call German is French. German is totally different.
But why is this display method discontinued at Cologne now?
Because Jim can't do it all on his own and all my coders fade before real amount of work is done.
This is all with just preliminary and random looks.
Also every person of some fame did his own Selections (Chrestomathy) those days, and it would be a good project to compile them together.
By end of year I should have most Western ones scanned.
I meant for Sanskrit. Are you also talking about the same?
Sanskrit only, right. One of 3 biggest Russian libraries is working on the project https://libfl.ru/
@funderburkjim
Here comes the first piece of work under the task that I took up.
I have just looked at the modified abbr.s, not the printed abbr.s with ALL CAPS, to start with.
PWG Bibiliography (listed).txt
Notes.
**[All items corrected or newly added looking into the print pages are marked with a `*`, for easy identification. The corrected items have other info from the biblio file, and the new entries need to be filled up fully (they just stand out as single lines)]**
If this way of my handling is acceptable, I can continue further.
And now I will require all the tooltip expansions by you (as initial Cap.s) along with the Titles (with all CAPS in this biblio file), as I've noticed some corrections in the tool tips while searching at the site. This will facilitate proofing/correcting both the original biblio entry and the tooltip entry together.
Will be awaiting your response, to do any more work on this.
BTW, during my lookout for missed `<ls>` taggings (in the pwg text), I noticed 2 f2 and ~50 M2 occurrences in the SLP1 strings that need attention.
They are being rendered as f and M followed by the numeral 2, whereas they should be F and candrabindu as per the book.
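A rough way to list these occurrences for review is sketched below. It assumes that SLP1 strings in pwg.txt are wrapped as `{#...#}`; that delimiter is an assumption of this sketch and should be adjusted if the file marks Sanskrit text differently. The function name `suspect_slp1` is hypothetical. The script only reports candidates and does not change anything.

```python
import re

# Assumed wrapper for SLP1 strings in the text file: {# ... #}
SLP1_SPAN = re.compile(r"\{#(.*?)#\}")
# The two suspect sequences reported above: "f2" and "M2".
SUSPECT = re.compile(r"[fM]2")

def suspect_slp1(lines):
    """Return (line number, SLP1 string) for spans containing f2 or M2."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for span in SLP1_SPAN.finditer(line):
            if SUSPECT.search(span.group(1)):
                hits.append((line_no, span.group(1)))
    return hits

if __name__ == "__main__":
    sample = ["{#pitf2Ram#} foo f2", "{#aM2Sa#} and {#rAma#}"]
    for line_no, text in suspect_slp1(sample):
        print(line_no, text)
```

Occurrences outside the assumed wrapper (like the bare `f2` in the first sample line) are deliberately ignored.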
Also in addition to `<ls>?` entries as I mentioned yesterday, even the regular `<ls>` entries need to be split/duplicated at many places, as noticed in my searching.
And many cases are there similar to "RV. Prāt." reported yesterday, which are unnecessarily separated.
Is it required to check and correct the biblio file abbr. expansions?
As I understand, they are NOT being used anywhere; only the modified text is used to show as tool tips.
Is that modified text available totally in some file?
Is it required to check and correct the biblio file abbr. expansions?
I guess so.
If they are not used anywhere directly, why to spend time correcting those?
If the tooltips are generated on-the-fly using those, then the logic is to change.
For, I see no simple method to correct those ALL CAP words, in a pure text file, to get words like MBh. and ŚKDr.
And I strongly feel such forms are intended by the compilers, which we have to follow.
This is not to demean or belittle the work done in generating the pwgbib_input file, but it has many missings and wrong reportings.
Just for e.g., a. a. O. has been listed as only one variant (c.1364: Pat. a. a. O.), whereas it occurs in over 80 variations.
Also see the various marking differences in the text, just in these 80+ listed. [The text file definitely needs good cleaning first for the markings.]
This entry (a. a. O.) luckily happened to be an abbr. (not `<ls>`) so not to worry much, otherwise we would have had so many missed `<ls>` entries; but I saw missing `<ls>` entries as well in my rough and random searching.
I do not think this is a work by @funderburkjim, as I've seen the list prepared by him for VCP abbr.s (with very few missings for another reason), which is quite close to my final list. [He is the most competent person in the Cologne team to prepare the list.]
As he is now too much overloaded, he cannot (and should not) be asked to prepare the list for PWG and I guess only I should do it, for making it complete.
But this file being in SLP1, I cannot work faster on it.
What can I do except occasional checks like this, till @funderburkjim finds some time to prepare the Devanagari files for me? [I do not want to do a partial work, by looking only at the list I was asked to fill-up; that is not my way of working.]
@funderburkjim
In the file I attached at https://github.com/sanskrit-lexicon/PWG/issues/37#issuecomment-846574085, I had marked thus-
*2.037 Ratnam. (This is a ref. other than 1.259 which is just a citation from SKD., and it is referring to the personal collection of Roth; how to distinguish both?)
Found that whenever Ratnam. is followed by a number or similar thing, it refers to this 2.037 (750 occurrences) and whenever it is mentioned to be from ŚKDr. it refers to 1.259 (660+ cases).
How do we distinctly mark the variants like this? This is not a singular case, there are few more like this having multiple expansions.
One simplest way coming to my mind is- Ratnam.[1] and Ratnam.[2]
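The rule found above could be tried out with a sketch like the following. It is only an approximation of that rule: "followed by a number" is read as "the next non-space character is a digit", and "mentioned to be from ŚKDr." is read as "ŚKDr. occurs within the next 40 characters", where the 40-character window is an arbitrary assumption. The function name `classify_ratnam` is hypothetical, and the result uses the bibliography ids rather than the `[1]`/`[2]` suffixes.

```python
import re

RATNAM = re.compile(r"Ratnam\.")
WINDOW = 40  # assumed look-ahead distance for a nearby "ŚKDr." mention

def classify_ratnam(text):
    """Return (offset, bibliography id or None) for each 'Ratnam.' in text."""
    results = []
    for match in RATNAM.finditer(text):
        tail = text[match.end():].lstrip()
        if tail[:1].isdigit():
            results.append((match.start(), "2.037"))  # cited with a number
        elif "ŚKDr." in tail[:WINDOW]:
            results.append((match.start(), "1.259"))  # cited via ŚKDr.
        else:
            results.append((match.start(), None))     # left for manual review
    return results

if __name__ == "__main__":
    for offset, bib_id in classify_ratnam("Ratnam. 12 und Ratnam. im ŚKDr."):
        print(offset, bib_id)
```

Anything the two tests do not settle is returned as `None`, so that doubtful cases are reviewed by hand instead of being guessed.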
@funderburkjim A very good finding and then a bad indication!
Good info is that-
And the bad indication is-
[I was wondering why would Thomas leave these pages, when all other pages are done faithfully.]
Now it is Jim's turn to bring this data into pwg.txt and pwg.xml in the current format.
Happy to say that I have done a sizeable work in cleaning up the PWG text for ls content (names and internal citations).
Presently the identification phase is going on for the 'unlisted items' and would be over shortly.
Then to promote myself from happy to a GLAD position, will split the data into lexical (gender-wise etc.) entries and comp. words as done in my version of MW and AP57. There are also many "GROUP" entries in the book (just as in other works), which are to be marked thus in the data.
This would then be SOME work accomplished on the data.
Only remaining work would be to mark the abbreviations and expand them as a list. Germans are known for shortening literally almost any word in practice, even 2 or 3 lettered words [say, z. for zu & f. for für]. This would be the biggest work of all, if at all taken up by someone.
[I just wonder if @funderburkjim is coming any sooner to respond on this thread!!]
Another more important work, before the abbreviations, is to integrate the VN text (spanning over all the 7 volumes) into the Main text (add, insert, delete, modify operations) which is really essential.
[Many a times, one would just look at the first listed entry, without bothering to see other volumes' text-- as that portion is separately shown; thus missing the intended corrections/additions to the earlier listed Main text.]
There are 2217 Cologne Additions to PWG and 2174 out of them have a ? mark. So we are not sure what all the abbreviations actually stand for.
So 'Z. d. m. G. G. = ? [Cologne Addition]' from @funderburkjim
https://github.com/sanskrit-lexicon/csl-pywork/blob/master/v02/distinctfiles/pwg/pywork/pwgauth/pwgbib_input.txt is, without doubt,
Zeitschrift der Deutschen Morgenländischen Gesellschaft
And is listed at https://nws.uzi.uni-halle.de/books?lang=de as
But plenty of cases where I'm totally lost and help needs to be asked. @Andhrabharati any idea what I'm speaking about?
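The counts of Cologne Additions with and without a `?` could be re-checked with a sketch like this one. It assumes, going only by the two entries quoted in this thread (`c.2116` and `c.2117`), that Cologne Additions are the lines whose id starts with `c.` and that an unresolved one contains `= ?`. The function name `count_unresolved` is hypothetical.

```python
def count_unresolved(lines):
    """Return (number of 'c.' entries, number of those containing '= ?')."""
    additions = [line for line in lines if line.startswith("c.")]
    unresolved = [line for line in additions if "= ?" in line]
    return len(additions), len(unresolved)

if __name__ == "__main__":
    sample = [
        "c.2116 Антарабида Антарабида Антарабида = ? [Cologne Addition]",
        "c.2117 Spr. (II) SPR. (II) Indische Sprüche.",
        "4.004 Halāy.",
    ]
    print(count_unresolved(sample))
```

Writing the unresolved lines to a separate file instead of just counting them would give the list of abbreviations that still need an expansion.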