sanskrit-lexicon / COLOGNE

Development of http://www.sanskrit-lexicon.uni-koeln.de/

Indische Sprüche preparation #360

Open funderburkjim opened 3 years ago

funderburkjim commented 3 years ago

Have begun work with @thomasincambodia on his digitization of Boehtlingk's Indische Sprüche text.

The preparation work is being done in the repository: https://github.com/funderburkjim/boesp-prep.

The current work is focused on proofreading. Since the digitization was not done with the 'double-typing' of some of the later dictionaries, this proofreading is a big task. Thomas is focusing on the German text.

I will focus on developing a malleable light-weight xml form of the digitization, constructing a linkable version (for use with PWK), and proofreading the Sanskrit text.

Currently work is being done on volume 1 (of 3), and there are 2000+ 'verses' to proofread for Devanagari.

@sanskritisampada (perhaps along with me) will likely work on proofing the Devanagari. If others are interested in helping here, let me know.

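The current text version of volume 1 is boesp-1_utf8.txt (https://raw.githubusercontent.com/funderburkjim/boesp-prep/main/step0/boesp-1_utf8.txt) and there is an xml version boesp-1.xml (https://raw.githubusercontent.com/funderburkjim/boesp-prep/main/step0/boesp-1.xml).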

Both of these use HK coding of Devanagari. I plan to make slp1, Devanagari, and IAST versions of the xml file in the near future.
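For anyone curious what the HK to slp1 step could look like, here is a minimal sketch. It assumes the files use the standard Harvard-Kyoto scheme; the function name `hk_to_slp1` and the mapping table are illustrative and not taken from the boesp-prep repository. The conversion is done in a single pass, because a chain of separate replacements would turn HK `G` into slp1 `N` and then wrongly treat that `N` as HK `N`.

```python
import re

# Harvard-Kyoto (HK) to SLP1 correspondences, limited to the symbols that
# differ between the two schemes. Assumption: the input is standard HK.
HK_TO_SLP1 = {
    "lRR": "X", "lR": "x", "RR": "F", "R": "f",
    "ai": "E", "au": "O",
    "kh": "K", "gh": "G", "G": "N",
    "ch": "C", "jh": "J", "J": "Y",
    "Th": "W", "T": "w", "Dh": "Q", "D": "q", "N": "R",
    "th": "T", "dh": "D",
    "ph": "P", "bh": "B",
    "z": "S", "S": "z",
}

# Longest keys first, so that "Th" is tried before "T" and "lRR" before "lR".
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(HK_TO_SLP1, key=len, reverse=True))
)


def hk_to_slp1(text: str) -> str:
    """Convert HK-coded Sanskrit to SLP1 in a single left-to-right pass."""
    return _PATTERN.sub(lambda m: HK_TO_SLP1[m.group(0)], text)


print(hk_to_slp1("kRSNa"))    # kfzRa
print(hk_to_slp1("zarIram"))  # SarIram
```

One known limitation of this sketch: a genuine hiatus such as a + i is indistinguishable from the diphthong `ai` in HK, so rare cases of that kind would need manual review.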

If you would like to help with Devanagari proof-reading/corrections, let me know.

funderburkjim commented 3 years ago

A pdf of volume 1 (2nd edition) was provided by @gasyoun at https://github.com/sanskrit-lexicon/PWG/issues/37#issuecomment-898956536.

The download is about 120 MB.

This pdf could be used in proofreading the Devanagari.

funderburkjim commented 3 years ago

This issue is primarily just to make others aware of this work, and to request help from interested parties. Specific questions about how to help with proofing the Volume 1 Devanagari should be raised in the boesp-prep issues.

Andhrabharati commented 3 years ago

I can be of some help if the text is fully in Unicode, no encodings for any portion.

Andhrabharati commented 3 years ago

How can I pull this repo: https://github.com/funderburkjim/boesp-prep?

Andhrabharati commented 3 years ago

Just opened the utf8 file.

The line breaks are far too many compared with the printed text (running text without columns); even the footnote portions do not follow the printed lines (2 columns).

At least the main Sanskrit verse lines could be "redone" as per the book!

sanskritisampada commented 3 years ago

Yes ...I will work on this!

Sampada

On Wed, 1 Sep 2021, 03:26 funderburkjim wrote:

Have begun work with @thomasincambodia on his digitization of Boehtlingk's Indische Sprüche text. [...]

funderburkjim commented 3 years ago

At least the main Sanskrit verse lines could be "redone" as per the book!

Look at the xml version. It has the verses appropriately formatted in terms of lines.

Our task will initially be restricted to proofreading the Sanskrit verses.

Thomas will focus on the German text in the other sections: the translation (D) sections, the footnote (F) sections, and the corrections (V) sections.

We will not be looking to proof the Sanskrit within the D, F, and V sections; that would be too many cooks in the kitchen (i.e., we would be getting in the way of Thomas).

I'll begin preparing an extract with just the verses (in different transliterations) today; look for an issue message from the boesp-prep repository.

gasyoun commented 3 years ago

This is a huge work indeed. I bow to your willingness to give it a new life.

Boehtlingk's Indische Sprüche text.

1st or 2nd edition? It's the 2nd as I understand. Wonder how many links to 1st we have and if all of them can be found in 2nd? Hope yes.

proofreading the Sanskrit text

Should we compare the shlokas with https://www.wisdomlib.org/sanskrit/quote/mss/subhashita-9332 ?

It's the biggest work on subhashitas done after Indische Sprüche.


maltenth commented 3 years ago

@gasyoun

1st or 2nd edition? It's the 2nd as I understand. Wonder how many links to 1st we have and if all of them can be found in 2nd? Hope yes.

pwk quotes the second edition (about 3000 times), which was published several years before pwk.


It would be fairly easy to copy Sternbach's Sprueche and convert them to slp1:

अंशवस्तव निशाकर नूनं कल्पितास्तरुणकेतकखण्डैः । येन पाण्डुरतरद्युतयो नः कण्टकैरिव तुदन्ति शरीरम् ॥

aṃśavastava niśākara nūnaṃ kalpitāstaruṇaketakakhaṇḍaiḥ | yena pāṇḍurataradyutayo naḥ kaṇṭakairiva tudanti śarīram ||

but I would still say it is better to do a traditional proofreading first.

Andhrabharati commented 3 years ago

@thomasincambodia

Now you may take up Boehtlingk's two supplements' typing work as well, as I have posted good scans for you at https://github.com/sanskrit-lexicon/PWG/issues/37#issuecomment-911443035

Andhrabharati commented 3 years ago

@funderburkjim

Our task will initially be restricted to proofreading the Sanskrit verses.

Thomas will focus on the German text in the other sections: the translation (D) sections, the footnote (F) sections, and the corrections (V) sections.

We will not be looking to proof the Sanskrit within the D, F, and V sections; that would be too many cooks in the kitchen (i.e., we would be getting in the way of Thomas).

Here is what Thomas was saying about his work on proofing the German text- https://github.com/sanskrit-lexicon/PWG/issues/37#issuecomment-906097961

[... the proofreading of the German part of the first volume of SPR is finished ... The Sanskrit (Devanagari) typing has to be proofread, and there are also 88 translations into Greek (22 in the first volume) that have to be added. ...]

funderburkjim commented 3 years ago

@sanskritisampada and @Andhrabharati : In case you don't get notifications for the boesp-prep repository, here is a link to an issue with a sample format for your proofreading.

funderburkjim commented 3 years ago

The link is https://github.com/funderburkjim/boesp-prep/issues/8

gasyoun commented 2 years ago

proofreading.

Is it finalized, dear @funderburkjim?

https://github.com/funderburkjim/boesp-prep/blob/main/step0/boesp.xml is the source, but not the final 4th step file, right? I want to get all the Indische Sprüche from a single file to compare them with all the quotes from the Ramayana and Mahabharata we were able to find.

funderburkjim commented 2 years ago

Part 3 proofreading by Sampada not yet finished.

gasyoun commented 2 years ago

Part 3 proofreading by Sampada not yet finished.

Can I ask for what is there as of now? I want to show you what we've done with parallel shlokas, but I need your file for that, thanks!

funderburkjim commented 2 years ago

If what you want is the latest revision of Indische Sprüche, you can clone https://github.com/funderburkjim/boesp-prep/ . The file is 'step0/boesp.xml'. There are also Devanagari and IAST versions.

gasyoun commented 2 years ago

I asked above - so https://github.com/funderburkjim/boesp-prep/blob/main/step0/boesp.xml is the latest, right? All the steps after are already implemented in this XML, right?

funderburkjim commented 2 years ago

Yes, the proofreading of volumes 1 and 2 is included in boesp.xml. The file has all the shlokas, as well as the German translation and footnotes. It is the shlokas for volumes 1 and 2 that have been proofread. Sampada is working on proofreading the shlokas of volume 3; when she is finished, her corrections will be compared to Andhrabharati's corrections and boesp.xml will be revised.

The Devanagari within the footnotes has not been proofread.
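How the two sets of corrections get compared is not spelled out here, so the following is only a rough sketch of one possible approach using Python's standard `difflib`. It assumes each proofreader's version has already been loaded into a dictionary keyed by verse id; the function name `differing_verses` and the sample data are hypothetical.

```python
import difflib


def differing_verses(version_a: dict[str, str], version_b: dict[str, str]) -> dict[str, list[str]]:
    """Return, per verse id, the word-level differences between two versions."""
    report = {}
    # Only verses present in both versions are compared.
    for verse_id in sorted(version_a.keys() & version_b.keys()):
        words_a = version_a[verse_id].split()
        words_b = version_b[verse_id].split()
        if words_a == words_b:
            continue
        # ndiff marks words only in A with "- " and words only in B with "+ ";
        # unchanged words and "? " hint lines are dropped.
        report[verse_id] = [
            d for d in difflib.ndiff(words_a, words_b) if d[:2] in ("- ", "+ ")
        ]
    return report


# Hypothetical sample data: two readings of the same verse fragment.
first_proof = {"1": "aMzavas tava nizAkara", "2": "nUnaM kalpitAs"}
second_proof = {"1": "aMzavas tava nizakara", "2": "nUnaM kalpitAs"}
for verse_id, changes in differing_verses(first_proof, second_proof).items():
    print(verse_id, changes)
```

Only the verses where the two readings disagree are reported, which keeps the list that needs a human decision short.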

gasyoun commented 2 years ago

Sampada is working on proofreading the shlokas of volume 3; when she is finished, her corrections will be compared to Andhrabharati's corrections and boesp.xml will be revised.

Understood.

The Devanagari within the footnotes has not been proofread.

And we will not ask @sanskritisampada even at a later stage?

funderburkjim commented 2 years ago

Definitely want to proofread the Devanagari in footnotes. Sampada may be involved; will discuss with her when the volume 3 work is done. Not sure whether @Andhrabharati has also worked on this.

gasyoun commented 2 years ago

@funderburkjim can we run the python German pre-1900 language library we have used in the past on:

„Weshalb blickst du, Kokila, den Mangobaum unermüdlich an und lässt deinen lieblichen Gesang ertönen? Sieh, der wilde Bergbewohner schweift in der Nähe umher, und hat seinen Köcher voll von Pfeilen und den Bogen in der Hand.“ and similar?
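To make the idea concrete, here is a minimal sketch of a word-list check on such a passage. It assumes the old-spelling list is available as a plain set of words; the function name `unknown_words` is hypothetical, and this is not the tool used in the past.

```python
import re


def unknown_words(text: str, known_words: set[str]) -> list[str]:
    """Return words from text that are absent from known_words (case-insensitive)."""
    known = {w.lower() for w in known_words}
    # Letters only, so quotation marks and punctuation are ignored.
    tokens = re.findall(r"[A-Za-zÄÖÜäöüß]+", text)
    seen = set()
    result = []
    for token in tokens:
        key = token.lower()
        if key not in known and key not in seen:
            seen.add(key)
            result.append(token)
    return result


# Tiny made-up word list, for illustration only.
sample_list = {"weshalb", "blickst", "du", "den", "an"}
print(unknown_words("„Weshalb blickst du, Kokila, den Mangobaum an?“", sample_list))
```

Words flagged this way are only candidates for review: proper names such as Kokila will show up alongside genuine typing errors.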

Not sure whether @Andhrabharati has also worked on this.

He usually keeps silent until you ask. He is very productive indeed, yet a bit wild. But I must say I'm his fan.

funderburkjim commented 2 years ago

I don't remember a 'pre-1900' German word list. Maybe @drdhaval2785 or @thomasincambodia or you can provide a reference?

Andhrabharati commented 2 years ago

@funderburkjim does this https://github.com/sanskrit-lexicon/CORRECTIONS/tree/master/dict-de_de-1901_oldspell_2014-02-21 remind you of the work done using the list some 8 years back, as @gasyoun mentions above?

Andhrabharati commented 2 years ago

Not sure whether @Andhrabharati has also worked on this.

He usually keeps silent until you ask. He is very productive indeed, yet a bit wild. But I must say I'm his fan.

Andhrabharati has not touched this portion yet, though he has covered all other areas in BOESP.

In fact, in the initial days I asked for the whole text to read at once, but Jim put my request on hold; I recently happened to see that the whole xml file is present in the GitHub repo, and wonder when that was done.

As I am busy with another task, I may not be in a position to look at this footnotes (FNs) part. Probably I might take it up (as a second proof) if and when Sampada does it as a first proof and Jim makes an update of the devanagari version of the full file again.

Andhrabharati commented 2 years ago

[Could be incidental-- but I always get surprised when @funderburkjim responds to others' posts almost immediately, while many of my posts do not get any response from him for months together.]

drdhaval2785 commented 2 years ago

I feel the issue is that others provide a chewable bite size, and you provide a different dish altogether. Takes some time to digest.

From my own experience, your changing of line numbers or markup makes processing extremely difficult. Programming takes a lot of time. Sometimes it reaches the point where reading your file and making manual corrections is better than writing code to integrate your changes.

Note that I have not recently worked with your files, so my opinion may be a bit dated.

So, it seems that the delay in response has something to do with processing effort, and not the value of your contribution.

Andhrabharati commented 2 years ago

From my own experience, your changing of line numbers or markup makes processing extremely difficult. Programming takes a lot of time.

I maintain an almost consistent format as far as line breaks are concerned; and as I suggested in another post lately, a simple (re-)work in Cologne files by the team (one-time effort) would make my files fully usable much more easily, if the team is willing to do so.

Andhrabharati commented 2 years ago

Note that I have not recently worked with your files, so my opinion may be a bit dated.

No, I haven't changed my working style; your observation is quite valid still.

maltenth commented 2 years ago

@funderburkjim

I don't remember a 'pre-1900' German word list. Maybe @drdhaval2785 or @thomasincambodia or you can provide a reference?

I compiled a list of German words in old spelling from pw around 2004 and remember discussing such a list briefly with Jim. It contains 2679 entries: old spelling on the left, followed by a double hyphen, and modern spelling on the right. Some of the entries could be removed as the spelling didn't change. 28 entries are marked with a § sign; those need further investigation.

There are words, like "Buhlerverhältniss -- Buhlerverhältnis", where the spelling is corrected but the word is totally out of use in modern German (I googled it).

The reason for the compilation was to enable searching for a German word in pw using contemporary spelling.

Here is the file:

pwdeu_alt-neu_utf8.txt
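As a rough illustration of how the list could support searching with contemporary spelling, here is a minimal sketch that reads lines in the "old -- modern" form described above. The function name `parse_spelling_list` is hypothetical, and where exactly the § sign sits on a line is an assumption (the sketch accepts it anywhere on the line).

```python
def parse_spelling_list(lines):
    """Parse 'old -- modern' lines into (modern -> old forms, flagged entries)."""
    modern_to_old = {}
    flagged = []
    for raw in lines:
        line = raw.strip()
        if not line or "--" not in line:
            continue  # skip blank or malformed lines
        # Assumption: a § anywhere on the line marks an entry needing review.
        needs_review = "§" in line
        old, modern = (part.replace("§", "").strip() for part in line.split("--", 1))
        if needs_review:
            flagged.append((old, modern))
        modern_to_old.setdefault(modern, []).append(old)
    return modern_to_old, flagged


sample = [
    "Buhlerverhältniss -- Buhlerverhältnis",  # example quoted in this thread
    "§ Thür -- Tür",  # invented sample line, not taken from the file
]
lookup, to_check = parse_spelling_list(sample)
print(lookup.get("Buhlerverhältnis"))
print(to_check)
```

A search box could then expand a modern query word to its old spellings via `lookup` before searching the pw text.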

Andhrabharati commented 2 years ago

While going through the PWG and pwk texts, I've felt that the spellings of many words had a kind of "English influence"; and SCH has changed all such pwk wordings sincerely to "germanisch spellings", as I understood.

Probably this is a subset of what Thomas describes as old German spelling versus modern German spelling.

sanskritisampada commented 2 years ago

Hi all. Yes, I can definitely do the Devanagari corrections after I complete boesp3. I was held up for the last few weeks with some personal work, hence the delay.

Sampada

On Wed, May 4, 2022 at 11:40 PM funderburkjim wrote:

Definitely want to proofread the Devanagari in footnotes. Sampada may be involved; will discuss with her when the volume 3 work is done. Not sure whether @Andhrabharati has also worked on this.

gasyoun commented 2 years ago

does this https://github.com/sanskrit-lexicon/CORRECTIONS/tree/master/dict-de_de-1901_oldspell_2014-02-21 remind you of the work done using the list some 8 years back, as @gasyoun mentions above?

That's the one

Probably I might take it up (as a second proof) if and when Sampada does it as a first proof and Jim makes an update of the devanagari version of the full file again.

That would be amazing

[Could be incidental-- but I always get surprised when @funderburkjim responds to others' posts almost immediately, while many of my posts do not get any response from him for months together.]

There are always dozens of points together, so the first task is to understand what you have done. Sometimes it takes months to do so, because you do a lot but do not always add documentation.

I feel the issue is that others provide a chewable bite size, and you provide a different dish altogether. Takes some time to digest.

Fully agree.

a simple (re-)work in Cologne files by the team (one-time effort) would make my files fully usable much more easily

If it's as simple as you describe, maybe we should discuss that on a call in, let's say, August?

The reason for the compilation was to enable searching for a German word in pw using contemporary spelling.

@thomasincambodia thanks, would love to think about integrating something of this kind into the website one day, so that the modern German spelling could be used as well.

While going through the PWG and pwk texts, I've felt that the spellings of many words had a kind of "English influence"; and SCH has changed all such pwk wordings sincerely to "germanisch spellings", as I understood.

Can you give 3 samples, please?