Open funderburkjim opened 4 years ago
Jim, thanks for documenting in detail the issue with 443k spellings. Generally I do not understand where this UI will fit, as we now have so many different paths to go. But the issues at the end of the 2nd video, like pitA and pitf: have we not solved them already in the past in a different place?
nṛsiṃhaācārya does look in your video anti-sandhi.
(nṛsiṃhaācārya is an alternate of narasiṃha.) narasiṃha or nṛsiṃha ācārya
In the 1st video you give an alternate headword for MW based on the fact that ACC gives narasiṃha and nṛsiṃha ācārya as synonyms.
In the 2nd video MW gives guru and gurvi, but you do not use this connection for other dictionaries. Or do they just not have a gurvi entry or subentry that can be used?
where this UI will fit
This UI is currently just for research purposes. The research questions:
Assume that a document D in dictionary X is determined by headwords with spellings H1,H2,.. in X. Then the local search terms L1,L2,... for D currently include:
The global search terms G1,G2,... for a document D in dictionary X take into account other dictionaries.
nṛsiṃhaācārya does look in your video anti-sandhi.
I agree. This looks like a bug in acc. In fact all the following instances look to be similar errors:
13 matches for "aa" in buffer: acc_hwextra.txt
68:<L>1783.1<k1>kOSikAditya<k2>kOSikAditya<type>alt<LP>1783<k1P>AdityaAcArya
104:<L>2568.1<k1>udayakaraAcArya<k2>udayakara AcArya<type>alt<LP>2568<k1P>udayana
179:<L>4657.1<k1>kfzRamBawwa<k2>kfzRamBawwa,<type>alt<LP>4657<k1P>kfzRaBawwaArqe
210:<L>5684.1<k1>gaReSvaraAcArya<k2>gaReSvara AcArya<type>alt<LP>5684<k1P>gaReSadEvajYa
401:<L>11100.1<k1>nfsiMhaAcArya<k2>nfsiMha AcArya<type>alt<LP>11100<k1P>narasiMha
496:<L>13957.1<k1>SuBaMkara<k2>SuBaMkara<type>alt<LP>13957<k1P>pragalBaAcArya
778:<L>22017.1<k1>dIkzita<k2>dIkzita<type>alt<LP>22017<k1P>vAsudevaaDvarin
814:<L>23353.1<k1>veNkawanATa<k2>veNkawanATa<type>alt<LP>23353<k1P>veNkawaAcArya
815:<L>23359.1<k1>veNkaweSa<k2>veNkaweSa<type>alt<LP>23359<k1P>veNkawaAcArya
903:<L>26044.1<k1>SrInivAsatIrTa<k2>SrInivAsatIrTa<type>alt<LP>26044<k1P>SrInivAsaAcArya
951:<L>28306.1<k1>darSanAcArya<k2>darSanAcArya<type>alt<LP>28306<k1P>sudarSanaAcArya
952:<L>28306.2<k1>darSanArya<k2>darSanArya<type>alt<LP>28306<k1P>sudarSanaAcArya
956:<L>28551.1<k1>viSvarUpa<k2>viSvarUpa<type>alt<LP>28551<k1P>sureSvaraAcArya
I think all the 'aA' in 'k1' or 'k1P' should be changed to 'A'.
@drdhaval2785 agree?
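A minimal sketch of the proposed change, assuming each record in acc_hwextra.txt is one line of `<tag>value` pairs as in the matches above. The function name `fix_aA` is hypothetical, not part of any existing script. It rewrites 'aA' to 'A' only inside the k1 and k1P values and leaves k2 untouched:

```python
import re

# Matches a <k1> or <k1P> tag plus its value (everything up to the next '<').
FIELD_RE = re.compile(r"(<(?:k1|k1P)>)([^<]*)")

def fix_aA(line):
    """Return the record with 'aA' changed to 'A' in the k1 and k1P values only."""
    return FIELD_RE.sub(
        lambda m: m.group(1) + m.group(2).replace("aA", "A"),
        line,
    )

record = "<L>11100.1<k1>nfsiMhaAcArya<k2>nfsiMha AcArya<type>alt<LP>11100<k1P>narasiMha"
fixed = fix_aA(record)
```

Note that the search above was case-insensitive, so the match at 22017.1 (k1P 'vAsudevaaDvarin') contains 'aa' rather than 'aA'. This sketch does not change it, and it would need a separate rule.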
After the global document search term step mentioned above, there is one more step (keydoc2.txt) which revises the local document definitions.
An abstract statement of this process might be: For a given dictionary X, merge all documents which have a common search term.
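The abstract statement can be sketched with a small union-find, purely as an illustration and not as the actual keydoc2 code. The function name `merge_documents`, the document ids, and the toy data are all assumptions; the only thing taken from the description is the rule "merge all documents which have a common search term".

```python
def merge_documents(docs):
    """docs maps a document id to its set of search terms.

    Returns a sorted list of (document ids, merged search terms), one entry
    per group of documents connected through shared search terms.
    """
    parent = {doc_id: doc_id for doc_id in docs}

    def find(doc_id):
        # Follow parent links to the group representative, compressing the path.
        while parent[doc_id] != doc_id:
            parent[doc_id] = parent[parent[doc_id]]
            doc_id = parent[doc_id]
        return doc_id

    first_doc_for_term = {}
    for doc_id, terms in docs.items():
        for term in terms:
            if term in first_doc_for_term:
                # Shared term: join this document's group with the earlier one.
                parent[find(doc_id)] = find(first_doc_for_term[term])
            else:
                first_doc_for_term[term] = doc_id

    groups = {}
    for doc_id, terms in docs.items():
        ids, merged = groups.setdefault(find(doc_id), ([], set()))
        ids.append(doc_id)
        merged.update(terms)
    return sorted((sorted(ids), sorted(merged)) for ids, merged in groups.values())

# Toy data (hypothetical ids): two documents that share terms, one that does not.
docs = {
    "burnouf:guru": {"guru", "gurvI"},
    "burnouf:gurvI": {"gurvI", "guru"},
    "burnouf:giri": {"giri"},
}
merged = merge_documents(docs)
```

Using union-find means the merge is transitive: if A shares a term with B and B shares a different term with C, all three end up in one document, which seems to be what "merge all documents which have a common search term" implies.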
The example of Burnouf with guru and gurvI might help. Before the global search term step, the relevant items (in keydoc_norm.txt) for Burnouf show two documents:
These documents are, at this stage, unrelated.
After the global merge step, the relevant items (in keydoc_merge.txt for burnouf) still show two documents, but with additional search terms.
The last step merges these two documents, so now there is only 1 combined document in burnouf (keydoc2.txt):
These are merged because the two documents have spellings in common: in this case, 'guru' and 'gurvI' are both search terms of each document.
So that is how the new, two-headword, document occurs in Burnouf.
The last step merges these two documents, so now there is only 1 combined document in burnouf (keydoc2.txt):
Now let's think how it can and should live together with simple
. And let's at least document what kinds of relations between words are given in each dictionary. There are antonyms in GRA, for example, and we have never even tried to mark them up.
Or another approach. giri is based on guru, that is based on root gir as per Kossowich, but Wilson gives E. gṝ.
@drdhaval2785 agree? I agree
https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/sample/dalglob1.php is left. But there was a more modern version of it anyway, no, @funderburkjim ?
This repository is an offshoot of the hwnorm1 repository.
The idea is to define a dictionary document by a collection of intrinsic dictionary headwords, then to allow access to such documents by the intrinsic headword spellings as well as alternate and normalized spellings. The term 'keydoc' (a document defined by headword keys) is one way to refer to this notion; and it is currently represented by a database with the beautiful name keydoc_glob1 (global keydoc database).
dalglob1 is a display that uses the new database. The database does not currently affect other displays.
There are two Youtube videos: