funderburkjim / kosha-dev

Develop xml and html for anekArthaka and samAnArthaka Sanskrit dictionaries

Time to put ABCH in the Cologne main list? #20

Closed drdhaval2785 closed 7 months ago

drdhaval2785 commented 9 months ago

Is it time yet, @funderburkjim? I have updated the abch.txt file to the latest content (after some major corrections).

funderburkjim commented 9 months ago

I'll look at abch in this repository soon, probably this week, when my mind needs to take a break from PW corrections.

funderburkjim commented 9 months ago

applicable dtd prepared

one.dtd has been prepared so that validation works properly. Incidentally, the form is almost the same for abch and anhk.

funderburkjim commented 9 months ago

display questions (abch)

funderburkjim commented 9 months ago

Another display comment

The display does not link to the scan in cases where the scan page is the same as for the previous entry. See the 2nd item in the search for 'fzaBa' for example. For this dictionary display, I think the 'printed book page' message should appear for each entry. As of this note, I'm not sure why the [Printed book page X] message is skipped here. My hunch is that it comes from some choice made in the websanlexicon code.

drdhaval2785 commented 9 months ago
  • <eid> is not displayed -- This seems to be an identifier for <syns>. The value goes from 1 to 4598.

Fine. Nothing important as of now. In case we link synonym sets across dictionaries at a later stage, it will be the unique identifier.

  • The <L> (within abch.txt) also is not displayed. Value goes from 1 to 1965. What does it mean?

It is the L number, as in other CDSL dictionaries: an identifier for a text chunk.

  • Sometimes a partial verse is displayed with no clue as to the verse. Example L=3.

    • this could be remedied by showing, e.g. (29) in abch.txt:

      fzaBo vfzaBaH SreyAYSreyAMsaH syAdanantajidanantaH . (29)

      (if changed, the change would be to abch1.txt -- the devanagari version.)

I would prefer to keep abch1.txt as it is. The number can be created programmatically from the previous verse number.

I would prefer the citation to be something like (KANDa.varga.upavarga.verse)

KANDa can be deciphered from ;k{XYZ}

varga (if at all present) can be deciphered from ;v{XYZ}

upavarga (if at all present) can be deciphered from ;vv{XYZ}

verse can be calculated from previous verse.

The reason for this choice is that the verse numbers are not necessarily continuous throughout the text. Some koshas will restart from 1 when the kANDa changes. Some will restart at a varga change, some will restart at an upavarga change, and some will not reset at all. To take care of this idiosyncrasy from the start, we can design our system to show citations in the form given above.

Over and above the numbering issues, the proposed method will also show the user some ontological data, e.g. that the word is related to the earth and to plants.
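A rough sketch of how such a citation could be derived, offered only as an illustration and not as the code in prep/addinfo.py. It assumes that each ;k{...}, ;v{...} or ;vv{...} marker stands alone on its own line and that every other non-blank line is one verse; the real layout of abch1.txt may differ. The names `cite_verses` and `reset_on` are hypothetical.

```python
import re

# Assumed marker layout: ;k{...}, ;v{...} or ;vv{...} alone on a line.
MARKER = re.compile(r'^;(k|vv|v)\{([^}]*)\}\s*$')

def cite_verses(lines, reset_on=('k',)):
    """Yield (citation, verse_text) pairs.

    reset_on lists the marker levels at which verse numbering restarts;
    an empty tuple means the numbering never resets.
    """
    labels = {'k': None, 'v': None, 'vv': None}
    verse = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        m = MARKER.match(line)
        if m:
            level, name = m.groups()
            labels[level] = name
            # A new higher-level section clears the lower-level labels.
            if level == 'k':
                labels['v'] = labels['vv'] = None
            elif level == 'v':
                labels['vv'] = None
            if level in reset_on:
                verse = 0
            continue
        # Assumption: every non-marker line is exactly one verse.
        verse += 1
        parts = [labels[x] for x in ('k', 'v', 'vv') if labels[x]]
        parts.append(str(verse))
        yield '.'.join(parts), line

# Placeholder section names, not taken from the actual text.
sample = [';k{kanda1}', 'verse a', 'verse b',
          ';k{kanda2}', ';v{varga1}', 'verse c']
restarting = list(cite_verses(sample, reset_on=('k',)))
continuous = list(cite_verses(sample, reset_on=()))
```

Passing `reset_on=('k',)` restarts the count at each kANDa, `('k', 'v')` also restarts at each varga, and `()` keeps one continuous numbering, so the same routine can cover koshas with different conventions.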

funderburkjim commented 9 months ago

v5

Here is a sample of v5 output:

[image: sample of v5 output]

funderburkjim commented 9 months ago

v5 points of interest

@drdhaval2785 I look forward to your comments.

drdhaval2785 commented 9 months ago
  • eid is shown as the 'syngroup' value. This term seems descriptive, but some other term may be preferable

Synset is the universally accepted term. It has been in use ever since WordNet came into existence. Synset = synonym set.

  • printed book page is printed for each entry, even if the same page as previous entry.

Good.

  • when needed, the verse number is shown in parentheses (e.g. (26) for id=3). This is part of abch.txt. Code is info_1 function in prep/addinfo.py.

Very nice.

  • k,v,vv label. You should check that my derivation is correct. The code is info_3 function of prep/addinfo.py. These labels are part of abch.txt, as derived from your abch1.txt.

I will check, but the derivation of the given example seems fine to me at first look.

  • the gender is shown only when different from the gender of the preceding item of the syngroup.

I understand that showing gender along with every headword may look clumsy for the user in the frontend.
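For reference, the display rule in the bullet above (gender shown only when it differs from the preceding item of the synset) might look roughly like this. It is a minimal sketch under the assumption that a synset is available as a list of (headword, gender) pairs in verse order; `display_genders` and the sample data are hypothetical.

```python
def display_genders(items):
    """Blank out the gender when it repeats the preceding item's gender.

    items: list of (headword, gender) pairs for one synset, in verse order.
    """
    shown = []
    previous = None
    for word, gender in items:
        shown.append((word, gender if gender != previous else ''))
        previous = gender
    return shown

# Placeholder headwords and genders, for illustration only.
synset = [('w1', 'm'), ('w2', 'm'), ('w3', 'f'), ('w4', 'm')]
shown = display_genders(synset)
```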

I have another suggestion. We can group items according to gender. The downside is that the words may not be in the same order as in the verse, but the upside is that the user will be able to see synonyms in different gender sets. Quite helpful. And to see the headwords in the order given in the verse, the user can always refer to the verse.
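A minimal sketch of that grouping, again assuming a synset is a list of (headword, gender) pairs in verse order. The function name `group_by_gender` and the sample data are hypothetical.

```python
def group_by_gender(items):
    """Collect the headwords of one synset under their gender.

    Genders appear in order of first occurrence, and within each gender
    the headwords keep their verse order.
    """
    groups = {}
    for word, gender in items:
        groups.setdefault(gender, []).append(word)
    return groups

# Placeholder headwords and genders, for illustration only.
synset = [('w1', 'm'), ('w2', 'm'), ('w3', 'f'), ('w4', 'm')]
grouped = group_by_gender(synset)
```

Here `w4` moves up next to `w1` and `w2`, which is exactly the loss of verse order mentioned above.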

funderburkjim commented 9 months ago

see synonyms in different gender sets. Quite helpful

Why helpful? What 'end-user' application(s) would take as input some collection of synset-gender groups?

drdhaval2785 commented 9 months ago
funderburkjim commented 9 months ago

minor revision

Use word 'synset' instead of 'syngroup' in display.

drdhaval2785 commented 7 months ago

Issue resolved. Great.