Open drdhaval2785 opened 8 years ago
Per Gasyoun
Not quite. a- is a prefix (= preverb), a real one. A prefixoid (linguistics) is a word-initial segment that does not have all the characteristics of a prefix. It is a pseudoprefix, never a real one. aMSa from the above list is a prefixoid, since many samasas start with it. The same goes for akzi, aDara, aneka and hundreds more.
Let me explain, @drdhaval2785, and let me know, @funderburkjim, if I'm clear enough. There are prefixes and prefixoids. I would love to know the stats: which prefixes are more popular. For example, that vi (2450) is used about 5 times less often than sam (12820).
vinAma vi nAma
vinAmaka vi nAmaka
vinAmikA vi nAmikA
vinAyaka vi nAyaka
vinAyakacaturTI vi nAyaka caturTI
vinAyakacaturTIvrata vi nAyaka caturTI vrata
vinAyakacarita vi nAyaka carita
samunnamana sam unnamana
samunnaya sam unnaya
samunnasa sam unnasa
samunnAha sam unnAha
samunnidra sam unnidra
samunmajj sam un majj
samunmiSra sam unmiSra
niryUza nir yUza
niryUha nir yUha
niryogakzema nir yoga
niryola nir yola
nirlakzaRa nir lakzaRa
nirlakzya nir lakzya
nirlajja nir lajja
nirlajjatA nir lajja
nirlayanI nir layanI
nirlavaRa nir lavaRa
upasfjya upa sfjya
upasfta upa sfta
upasftavat upa sfta vat
upasftya upa sftya
upasfpta upa sfpta
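The prefix statistics asked for above could be gathered with a short script. This is only a sketch: it assumes each line of the split file has the form `headword part1 part2 ...` as in the samples, and `UPASARGAS` and `count_prefixes` are hypothetical names, with the set holding just the four prefixes shown in the samples.

```python
from collections import Counter

# Assumption: only the prefixes seen in the samples above (SLP1 spelling).
# A real run would need the full upasarga list and its sandhi variants.
UPASARGAS = {"vi", "sam", "nir", "upa"}

def count_prefixes(lines):
    """Count how often each upasarga is the first element of a split."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # fields[0] is the headword, fields[1:] are its parts.
        if len(fields) >= 3 and fields[1] in UPASARGAS:
            counts[fields[1]] += 1
    return counts

sample = [
    "vinAma vi nAma",
    "vinAyaka vi nAyaka",
    "samunnaya sam unnaya",
    "niryUha nir yUha",
    "upasfta upa sfta",
    "cittaBU citta BU",
]

for prefix, n in count_prefixes(sample).most_common():
    print(prefix, n)
```

Run over the whole file, the same counter would yield figures comparable to the vi and sam counts quoted above.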
What we want is everything that is left after we filter out prefixed words (words that have an upasarga: प्र, परा, अप, सम्, अनु, अव, निस्, निर्, दुस्, दुर्, वि, आ (आङ्), नि, अधि, अपि, अति, सु, उत्/उद्, अभि, प्रति, परि and उप, or one of its variations) and that is longer than 1 element (not sure what a word like yuga is doing in this samasas file, but we do not need such words either).
So
cittaBU citta BU
cittaBUmi citta BUmi
cittaBeda citta Beda
cittaBrama citta Brama
cittaBramacikitsA citta Brama cikitsA
cittaBrAnti citta BrAnti
citrakuRqala citra kuRqala
citrakuzWa citra kuzWa
citrakUwa citra kUwa
citrakUwamAhAtmya citra kUwa mAhAtmya
citrakUwayAtrA citra kUwa yAtrA
have citta and citra as prefixoids: many words are built on this model.
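The filtering step described above could look roughly like this. It is a sketch under assumptions: the upasarga set is my own SLP1 transcription of the Devanagari list and does not cover every sandhi variation, the input format is `headword part1 part2 ...`, and `prefixoid_counts` is a hypothetical name.

```python
from collections import Counter

# Assumption: SLP1 transcription of the upasarga list; variants are not exhaustive.
UPASARGAS = {
    "pra", "parA", "apa", "sam", "anu", "ava", "nis", "nir", "dus", "dur",
    "vi", "A", "ni", "aDi", "api", "ati", "su", "ut", "ud", "aBi",
    "prati", "pari", "upa",
}

def prefixoid_counts(lines):
    """Count first elements of splits that are not upasargas."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            # Blank lines and single-element entries such as "yuga yuga".
            continue
        first = fields[1]
        if first in UPASARGAS:
            # A genuinely prefixed word, not a prefixoid candidate.
            continue
        counts[first] += 1
    return counts

sample = [
    "cittaBU citta BU",
    "cittaBUmi citta BUmi",
    "citrakUwa citra kUwa",
    "vinAma vi nAma",
    "yuga yuga",
]

print(prefixoid_counts(sample).most_common())
```

Any cut-off for how many words a first element must open before it counts as a prefixoid would still have to be chosen by hand.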
vrAtamaya vrAta maya
vrIhimaya vrIhi maya
Sakamaya Saka maya
Saktimaya Sakti maya
SaNkAmaya SaNkA maya
ekaSilA eka SilA
ekaSIla eka SIla
ekAntaSIla ekAnta SIla
evaMSIla evaM SIla
evaMSIlasamAcAra evaM SIla
kamalaSIla kamala SIla
karmaSIla karma SIla
kAkaSilA kAka SilA
have maya and SIla as suffixoids.
Such a list of building blocks (with up to 3 samples per case) would be an appendix to the Reverse dictionary that linguists could use.
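A list of building blocks with up to 3 samples each might be assembled as sketched below. The function name `building_blocks` and the `max_samples` parameter are hypothetical, and the input is again assumed to be lines of the form `headword part1 part2 ...`. The upasarga filter is left out here to keep the sketch short.

```python
from collections import defaultdict

def building_blocks(lines, max_samples=3):
    """Map each first and last element to at most max_samples headwords."""
    first = defaultdict(list)   # prefixoid candidates
    last = defaultdict(list)    # suffixoid candidates
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            # Skip blank lines and single-element entries.
            continue
        headword, parts = fields[0], fields[1:]
        for table, key in ((first, parts[0]), (last, parts[-1])):
            if len(table[key]) < max_samples:
                table[key].append(headword)
    return dict(first), dict(last)

sample = [
    "vrAtamaya vrAta maya",
    "vrIhimaya vrIhi maya",
    "Sakamaya Saka maya",
    "Saktimaya Sakti maya",
    "kamalaSIla kamala SIla",
    "karmaSIla karma SIla",
]

first, last = building_blocks(sample)
for block, words in sorted(last.items()):
    print(block, ", ".join(words))
```

Keeping the first few headwords in file order is the simplest choice; picking the shortest or most frequent samples instead would need extra data.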
Per https://github.com/drdhaval2785/samasasplitter/issues/2#issuecomment-166070828 @gasyoun wants a list of prefixes and suffixes in MW for his purposes. Try to make a small script for this. It may also come in useful for the splitter.
e.g. a or A would be prefixoids in most cases. Right now we are ignoring single-letter parts, but I guess we can allow them in prefixes. The code modification is not that easy.