sanskrit-lexicon / mw-dev

Development version of MW dictionary, to collaborate with Andhrabharati

Discussion with Andhrabharati to smoothen out processes for major revisions #5

Open drdhaval2785 opened 1 year ago

drdhaval2785 commented 1 year ago

This issue is dedicated to finding out whether I am able to use the data which has been handed over by @Andhrabharati in https://github.com/sanskrit-lexicon/mw-dev/issues/2, and generate some useful output at the end.

drdhaval2785 commented 1 year ago

As Andhrabharati has decided to move the info tags into a separate file, I would expect him to maintain the line numbers, without any exception. Otherwise, I would not be able to use the info data kept in a different file.
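
To illustrate why the line numbers matter, here is a minimal sketch of re-attaching the info data by position. It assumes (hypothetically; the actual file layout is not described in this thread) that the info file holds exactly one line per line of the main file, left empty where the main line carried no info tag. The function name `reattach_info` is made up for this example.

```python
def reattach_info(body_lines, info_lines):
    """Pair each body line with the info text found at the same line number.

    Assumption: info_lines has one entry per body line, and an empty
    string means that body line had no info tag.
    """
    if len(body_lines) != len(info_lines):
        # Any inserted or deleted line shifts every later pairing,
        # so a count mismatch is treated as a hard error.
        raise ValueError(
            f"line counts differ: {len(body_lines)} body vs {len(info_lines)} info"
        )
    merged = []
    for body, info in zip(body_lines, info_lines):
        merged.append(f"{body} {info}" if info else body)
    return merged
```

The pairing is purely positional, so a single extra line in either file would attach every later info entry to the wrong line. That is the reason new insertions have to be kept apart from the line-aligned file.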

Andhrabharati commented 1 year ago

Please look at the sample file of page 1 that I had posted earlier, @drdhaval2785; I did maintain the line numbers and have put the new insertions (which would cause the line numbering in the file to change) in a separate file.

drdhaval2785 commented 1 year ago

I am already looking at it. That is why the question.

drdhaval2785 commented 1 year ago

I checked the usability of the data sent by @Andhrabharati. It seems that I will be able to handle it. A small program has been written: https://github.com/sanskrit-lexicon/MWS/blob/master/mwsissues/issue148/integrate_changes_ab.py Its output is: https://github.com/sanskrit-lexicon/MWS/blob/master/mwsissues/issue148/MW_new.txt

This output was compared with the IAST version of the mw.txt file. The corrections by @Andhrabharati are incorporated, and the info tag details and the slp1 version of the s1 tag were restored (regenerated).

This suggests that @Andhrabharati may go ahead with his suggested work.

drdhaval2785 commented 1 year ago

I do not have any idea why the commas and semicolons were inserted in the data by @Andhrabharati ? I presume that it was done to make the data look more like the printed book, right?

drdhaval2785 commented 1 year ago

Only request to Andhrabharati is to intimate me if he changes / deletes some data blocks.

Andhrabharati commented 1 year ago

I do not have any idea why the commas and semicolons were inserted in the data by @Andhrabharati ?

There is something called 'consistency' that needs to be followed in one's work, and I presume that it is a universally accepted norm.

And one may look at my Obs.1-c under https://github.com/sanskrit-lexicon/MWS/issues/145#issuecomment-1364689233

If, for some reason, one needs to join two split L-entries of MW, the ending semi-colon would come in handy. (It was removed for an unknown reason in the course of the MW mark-ups; I have checked that it is very much present in the earlier versions.) And, of course, it is one of the grammar rules one learns in primary school that any sentence should end in a full-stop, question-mark or exclamation-mark, and a sub-sentence should end in a semi-colon.

I presume that it was done to make the data look more like the printed book, right?

Unless one wants to remove all the commas and semi-colons at these places in the name of consistency, it is the plain and best way to follow the print version.

Andhrabharati commented 1 year ago

This suggests that @Andhrabharati may go ahead with his suggested work.

Only request to Andhrabharati is to intimate me if he changes / deletes some data blocks.

I was just pondering whether I should take the ver. 3 path suggested by @funderburkjim, as he indicated that my present work may not be integrated into the 'present' CDSL text; also, it gives me ample freedom to make all possible modifications, obviously for good, and not fanciful or 'chaotic', reasons!!

And I am awaiting more thoughts from him, to decide and freeze the modification 'points'. [Of course, my current work could fit well in the present CDSL format itself, as that was the approach I had chosen at the beginning of this exercise.]

funderburkjim commented 1 year ago

You may want to remake the displays for the revised MW frequently, and also rerun the xml validity check. This will help distinguish 'major' from 'minor' changes in the new version (here 'major' and 'minor' are in reference to current assumptions built into the construction of displays).

This remake should be done in such a way that it does not modify any of the v02 parts of the production repositories (namely, csl-orig, csl-pywork, csl-websanlexicon).

One convenient way to accomplish an independent environment might be to have a separate mw-dev repository that would initially be the current 'cologne/mw' folder of a local installation.

Andhrabharati could post versions to mw-dev/orig/mw.txt. Then, a 'redo' script in mw-dev/pywork could remake mw.xml, check validity of mw.xml, and do other steps to update mw-dev/web. And the resulting displays within mw-dev/web would reflect the changes.
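
As a rough illustration of the checking step, the sketch below tests only whether an XML string is well-formed, using Python's standard library. It is not the actual 'redo' script, and the function name `check_well_formed` is hypothetical. Note also that `xml.etree.ElementTree` does not validate against a DTD, so a full validity check would need a separate validating tool.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message).

    This checks well-formedness only (balanced tags, legal syntax);
    it does not validate the document against a DTD.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The message includes the line and column of the first problem.
        return False, str(err)
    return True, None
```

Running a check like this after every remake would flag a mark-up change that breaks the XML before it reaches the displays.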

@drdhaval2785 This kind of setup should simplify management of this project. It would also permit me to kibitz (add my two-cents worth) when needed. What do you think?

drdhaval2785 commented 1 year ago

https://github.com/sanskrit-lexicon/mw-dev

This is the development repository. We can work in that repository on the corrections as envisaged by @Andhrabharati.

gasyoun commented 1 year ago

it is the plain and best way to follow the print version.

For sure. MW should get back to the roots.