Open funderburkjim opened 4 years ago
Two entry points were added to the keydoc system of this hwnorm2 repository. There are two dictionary-specific files in the keydoc/distinctfiles directory.
Currently there are two pw files in distinctfiles:
vikar kar
occurs because in the pw root kar
there is <div n="p">— Mit {#vi#}
Thus, a search for 'vikar' should result in a listing of the entry 'kar'.
vikf kar
also occurs. This is because, in other dictionaries, 'kar' is spelled 'kf', so in such dictionaries the prefixed form would be 'vikf'.
There is also an mw file in distinctfiles: mw_norm_extra.txt.
The current records in this file also relate alternate prefixed verb spellings for prefixed verbs appearing in mw to the virtual prefixed verb spellings for pw. There are about 1500 such records. For example,
vikf vikar
relates the mw headword 'vikf' to the virtual (pw) headword 'vikar'.
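To make the record examples concrete, here is a minimal sketch of how such two-column records could be turned into a per-dictionary search index. It is not the keydoc code. The whitespace-separated format, the function names `parse_records` and `search_index`, and the `entry_column` parameter are all assumptions made for illustration. In particular, the sketch assumes that in the pw records the second field is the real headword, while in the mw record the first field is.

```python
def parse_records(text):
    """Split each non-empty line into a (first, second) pair of spellings."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            pairs.append((fields[0], fields[1]))
    return pairs


def search_index(pairs, entry_column):
    """Map each search key to the entry headword it should open.

    entry_column (an assumption of this sketch) says which field, 0 or 1,
    is the real headword in that dictionary; the other field is treated
    as an additional search key.
    """
    index = {}
    for pair in pairs:
        entry = pair[entry_column]
        key = pair[1 - entry_column]
        index[key] = entry
        index.setdefault(entry, entry)  # a real headword also finds itself
    return index


# Records quoted in this comment.
pw_index = search_index(parse_records("vikar kar\nvikf kar\n"), entry_column=1)
mw_index = search_index(parse_records("vikf vikar\n"), entry_column=0)

for key in ("vikar", "vikf"):
    print(key, "-> pw:", pw_index.get(key), "| mw:", mw_index.get(key))
```

With these three records, both 'vikar' and 'vikf' lead to 'kar' in pw and to 'vikf' in mw.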
The next comment shows some examples of these features using the current research display.
These examples are from the [dalglob1]() research display; this display also works in local installations.
We see that 'vikar' appears in 6 dictionaries; in 5 of them, it is spelled 'vikf'. In pw, the search leads to the entry for 'kar'.
Note 1: Why is the pwg dictionary absent for 'vikar'? The answer is that we haven't yet done the work for prefixed verbs for pwg to get pwg_norm.txt and pwg_keydoc_input.txt. There is similar markup in pwg (<div n="p">— {#vi#}), which should permit a prefixed verb enhancement for pwg similar to the one we are currently describing for pw.
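As a rough illustration of how that markup could drive the enhancement, the sketch below pulls the prefix out of the `<div n="p">` marker and pairs the resulting virtual headword with its root. The function name `virtual_headwords` is hypothetical, and the plain concatenation of prefix and root is an assumption; it ignores sandhi and any other adjustments the real preverb study may make.

```python
import re

# Matches the prefix marker in both forms quoted in this thread:
# with "Mit " (pw) and without it (pwg). \u2014 is the dash in the markup.
PREFIX_RE = re.compile(r'<div n="p">\u2014 (?:Mit )?\{#([^#]+)#\}')


def virtual_headwords(root, entry_text):
    """Return (virtual prefixed headword, root) pairs found in one root entry.

    Assumption: the virtual headword is simply prefix + root.
    """
    return [(prefix + root, root) for prefix in PREFIX_RE.findall(entry_text)]


pw_fragment = '<div n="p">\u2014 Mit {#vi#}'   # form quoted for pw
pwg_fragment = '<div n="p">\u2014 {#vi#}'      # form quoted for pwg

print(virtual_headwords("kar", pw_fragment))
print(virtual_headwords("kar", pwg_fragment))
```

Both fragments yield the pair ('vikar', 'kar'), which has the same shape as the 'vikar kar' record shown earlier for pw.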
Note 2: If we click the 'pw' button, the bottom panel of the display will show the entry for 'kar'. If we click the 'ap90' button, the bottom panel will show the entry for 'vikf'.
This gives exactly the same coverage as the previous search for 'vikar', as it should!
If you look closely at 'kar' in pw (or ccs), you will see that one entry has the verb form 'kirati'. In 'mw' this is associated with the spelling 'kF' (not 'kf').
One thing we could do is to consider 'kF' as another alternate spelling for 'kar' (in pw_norm_extra.txt). This would have the impact that, in mw, 'kf kF' would be in the same document.
I'm currently mildly in favor of this, and similar other, linkages. Does anyone have an opinion on this?
9000 such examples
We have 10500 MW sopasarga roots, can we use that data?
verb spellings for prefixed verbs appearing in mw to the virtual prefixed verb spellings for pw. There are about 1500
Seems like you've got the right one.
'kF' as another alternate spelling for 'kar'
Couldn't it get us in trouble? It might bring more false variants than benefits.
I'm currently mildly in favor of this, and similar other, linkages.
Let's link. Then we will see which wrong links come out of it. I'm all for it. I have been waiting 7 years for this post.
We have 10500 MW sopasarga roots, can we use that data?
Yes, in fact that data WAS used.
I've copied the working directory used in the PW preverb study into this repository as 01-pwverbs.
The readme file therein describes the steps.
A couple of files that might be of interest are:
The readme file therein describes the steps.
So detailed, as usual. My ideas.
Good for dhatu:
DHĀTUP
onomatop.
Sautra
Wurzel
caus.
v. l.
perf.
med.
mit
von
Can't be dhatu:
adj.
m.
n.
f.
interj.
indecl.
pron.
nom.
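A minimal sketch of how the two lists above could be applied as a filter. The three-way outcome ("verb candidate", "non-verb", "unclear") and the function name `classify` are my own assumptions, matching is case-sensitive exactly as the hooks are written above, and the sample strings at the bottom are made up for illustration rather than taken from PW.

```python
import re

# Hook lists copied from the two lists above.
DHATU_HOOKS = ["DHĀTUP", "onomatop.", "Sautra", "Wurzel", "caus.",
               "v. l.", "perf.", "med.", "mit", "von"]
NON_DHATU_HOOKS = ["adj.", "m.", "n.", "f.", "interj.", "indecl.",
                   "pron.", "nom."]


def compile_hooks(hooks):
    """Build one regex that matches any hook as a standalone token.

    The lookbehind keeps 'f.' from firing inside 'perf.'; the lookahead
    keeps 'mit' from firing inside a longer word.
    """
    alternatives = "|".join(
        re.escape(h) for h in sorted(hooks, key=len, reverse=True)
    )
    return re.compile(r"(?<![\w.])(?:" + alternatives + r")(?!\w)")


DHATU_RE = compile_hooks(DHATU_HOOKS)
NON_DHATU_RE = compile_hooks(NON_DHATU_HOOKS)


def classify(entry_text):
    """Label an entry by which kind of hook it contains (hypothetical policy)."""
    has_dhatu = bool(DHATU_RE.search(entry_text))
    has_other = bool(NON_DHATU_RE.search(entry_text))
    if has_dhatu and not has_other:
        return "verb candidate"
    if has_other and not has_dhatu:
        return "non-verb"
    return "unclear"


# Made-up sample strings, not real PW entries.
print(classify("Wurzel, caus. perf."))
print(classify("adj. m. n."))
print(classify("perf. adj."))
```

Entries that carry hooks from both lists, or from neither, are left as "unclear" so they can be checked by hand.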
@funderburkjim did any of my hooks help?
There probably is some overlap in your 'hooks' and in the hooks I used for filtering verbs.
Based on the part of your table above, there are some verbs that my filter missed, such as line 4 aqq (slp1 spelling).
Your table also has some pretty definite NON-VERBS (example line 22 andha)
Question: What is your column A? The numbers there don't correspond to the Cologne IDs for PW.
Suggestion: Provide a text file of headwords column B (devanagari) OR column E (HK transliteration). If you know a way to filter out definite non-verbs, please do exclude them.
I can develop diffs of your headword list with my PW verb headword list (such as in preverb1.txt).
This will provide a source of verbs that you found, but that I missed; and vice-versa.
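The kind of diff I have in mind can be sketched as follows. The function names and the sample headwords are placeholders for illustration, not the contents of preverb1.txt, and the sketch assumes both lists have already been converted to the same transliteration (for example slp1) with one headword per line.

```python
def read_headwords(lines):
    """Collect the non-empty, stripped lines of a headword list into a set."""
    return {line.strip() for line in lines if line.strip()}


def diff_headwords(mine, yours):
    """Split two headword sets into what each side has alone and what is shared."""
    return {
        "only_mine": sorted(mine - yours),
        "only_yours": sorted(yours - mine),
        "both": sorted(mine & yours),
    }


# Placeholder lists; in practice these would be read from the two text files.
my_verbs = read_headwords(["kar\n", "gam\n"])
your_words = read_headwords(["kar\n", "aqq\n", "andha\n"])

result = diff_headwords(my_verbs, your_words)
for label, words in result.items():
    print(label, words)
```

The "only_yours" part is where verbs missed by my filter would show up, along with any non-verbs that were not excluded beforehand.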
Changes were made so that prefixed verbs in the pw dictionary may be accessed via the keydoc database. This note describes the changes in some detail, so that similar work may be done for other dictionaries.