gasyoun opened 3 years ago
Probably what I did was to go sequentially through the dictionary, and collect
all the headwords with samAsa children.
Look at the 'su' group with 'su-gaR'. Compare to page 1222.
Then see the 'su-cakzas' group on p. 1223.
The compounds are 'children' of different entries for 'su'.
Probably the other 'su-' groups are similarly explained.
For some purposes it would make sense to aggregate these 'su-' compounds.
By the way, I like the display above -- Is that one you developed?
> was to go sequentially through the dictionary, and collect all the headwords with samAsa children.
Right, that's how it seems. But it seems that these subgroups appear only with upasargas.
What would be required to have a united version of them?
> By the way, I like the display above -- Is that one you developed?
No, it's your file. https://github.com/funderburkjim/MWderivations/blob/master/compounds/compounds.html
> What would be required to have a united version ?
From compounds.txt, a program could create compounds-united.txt.
This would replace all the ':su:' lines with just one ':su:' line
And similarly for all the other 'prefixes'.
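A minimal sketch of what such a merge could look like. It assumes each line of compounds.txt has the layout `count:headword:elements`, with the elements separated by spaces, as in the `0010:ghrāṇa:...` line quoted in this thread. The function name `unite` and the sample records are made up for illustration; this is not the existing MWderivations code.

```python
def unite(lines):
    """Merge all records that share a headword into a single record.

    Assumed record layout (inferred from one sample line in this thread):
        NNNN:headword:element element ...
    where NNNN is the zero-padded number of elements.
    """
    groups = {}  # headword -> list of elements, in first-seen order
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split only on the first two colons; the rest is the element list.
        _count, headword, elements = line.split(":", 2)
        groups.setdefault(headword, []).extend(elements.split())
    # Recompute the count for each merged record.
    return ["%04d:%s:%s" % (len(elems), head, " ".join(elems))
            for head, elems in groups.items()]


if __name__ == "__main__":
    sample = ["0002:su:+a +b", "0001:ati:+c", "0001:su:+d"]  # made-up records
    for record in unite(sample):
        print(record)
```

The merged record keeps the position of the first group for that headword, and the elements stay in their original order. Reading compounds.txt and writing the united file would just wrap this function in two `open()` calls.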
> From compounds.txt, a program could create compounds-united.txt.
Any idea how to automate it? If you update the source (and after the AB changes are implemented), my naive gluing will come unglued again.
> The parent may be marked as a VERB -- 1247 of these. Clearly the children in such cases are not compounds, but the current (H3) markup of the children is the same as for samasas.
Are they not marked in any way now? Should I redo the formatting of dhātu entries so that they do not give false positives here?
> H3 headwords can also have H4 children, but this table ignores these
Why? Because they are not encoded in an easy-to-grasp manner?
No sandhi, easy: akṣara+kara = akṣarakara
Sandhi involved: akṣarā@kṣara = akṣarākṣara
But does entry:
akṣāra
+lavaṇa +lavaṇā@śin
+lavaṇā@śin
= akṣāralavaṇāśin
akzAralavaRa
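A small sketch of how the notation in these examples could be expanded mechanically. It rests on a reading inferred only from the examples in this thread: an element starting with `+` is appended to the parent headword, an element without `+` is already the full compound, and `@` only marks the sandhi junction and is dropped. The function name `expand` and the `keep_hyphens` option are hypothetical.

```python
def expand(parent, element, keep_hyphens=True):
    """Build the full compound from a parent headword and one element.

    Assumed conventions (inferred from the examples, not from documentation):
      '+x'   -> parent followed by x
      'a@b'  -> '@' marks a sandhi junction and is simply removed
    """
    if element.startswith("+"):
        compound = parent + element[1:]
    else:
        # Element already spells out the whole compound.
        compound = element
    compound = compound.replace("@", "")
    if not keep_hyphens:
        # Whether inner hyphens should be dropped is an open question.
        compound = compound.replace("-", "")
    return compound


if __name__ == "__main__":
    print(expand("akṣara", "+kara"))
    print(expand("akṣara", "akṣarā@kṣara"))
    print(expand("akṣāra", "+lavaṇā@śin"))
```

This only glues strings together. It does not apply sandhi rules itself; it relies on the sandhi already being written out in the element.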
> How to automate ?
A program is needed to create compounds_united.txt from compounds.txt; let's call that program 'compounds_united.py' (not yet written).
Then, if compounds.txt is revised, we can run the program to update compounds_united.txt.
This would be part of a larger redo script. It would come after the steps to update compounds.txt, as described in https://github.com/funderburkjim/MWderivations/blob/master/compounds/readme.txt
compounds.txt depends on step4/all.txt. And there is a redo script for step4.
And so on. That's the way that things can be automated. Based on a recent look at MWderivations, it looks fairly straightforward to write a 'master redo script' that would update everything that needs to be updated in MWderivations. Looks like MWderivations is fairly well organized to be updateable.
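One way a master redo script could decide which steps to rerun is a plain timestamp comparison, in the style of make. The sketch below is that swapped-in technique, not the existing redo scripts; the function name `needs_update` is hypothetical.

```python
import os


def needs_update(source, target):
    """Return True if target is missing or older than source."""
    if not os.path.exists(target):
        return True
    # Compare last-modification times of the two files.
    return os.path.getmtime(source) > os.path.getmtime(target)


if __name__ == "__main__":
    # File names as used in this thread; run from the directory holding them.
    if os.path.exists("compounds.txt"):
        print(needs_update("compounds.txt", "compounds_united.txt"))
```

A master script could walk the chain in dependency order (step4/all.txt, then compounds.txt, then the united file) and rerun a step only when this check returns True.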
> let's call that program 'compounds_united.py' (not yet written).
Let's bring it to life. After the magic I've seen, it should not be rocket science, thanks.
> This would be part of a larger redo script.
That is why I am asking you rather than just joining them on my end.
> Looks like MWderivations is fairly well organized to be updateable.
Please. In that case it could be reused in the Reverse Sanskrit Dictionary, as it would still be alive, especially after compounds_united.py.
> Looks like MWderivations is fairly well organized to be updateable.
Please give compounds another chance @funderburkjim ))
MW has ghrāṇa—cakṣus
But your file, Jim, has 0010:ghrāṇa:+cakṣuś +ja +tarpaṇa +duḥkha-dā +pāka +puṭaka +bila +śravas +skanda ghrāṇe@ndriya
@funderburkjim
1) Can the .txt use TAB instead of SPACE as a delimiter? Otherwise I run into trouble, because some entries contain spaces.
2) We have a normalised list of headwords. Can we have a normalised list of samasa elements and of samasa output as well? What would be the way to solve it?
3) The list contains around 111k samasas, but some, like a-prameya, are missing from the list. Any reason why?
@funderburkjim I would love to continue the research, but that is impossible without your help.
If I search for :ati: there is only one instance. Suppose I want to analyse atikopasamanvita.
If I search for :su: there are 12 different ones, with 1723 subentries with su-. Why are they split, @funderburkjim? Because of anusvāra? Even if they were split originally, does it make sense to keep them apart for the purpose of analysis?
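Counts like these can be reproduced with a short script instead of a manual search. A sketch, again assuming the `count:headword:elements` layout with space-separated elements; `group_stats` is a made-up name and the sample records are invented.

```python
from collections import defaultdict


def group_stats(lines):
    """Map each headword to (number of groups, total number of elements)."""
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumed layout: count, headword, space-separated elements.
        _count, headword, elements = line.split(":", 2)
        stats[headword][0] += 1                      # one more group
        stats[headword][1] += len(elements.split())  # its subentries
    return {head: tuple(pair) for head, pair in stats.items()}


if __name__ == "__main__":
    sample = ["0002:su:+a +b", "0001:su:+c", "0001:ati:+d"]  # made-up records
    for head, (n_groups, n_elements) in group_stats(sample).items():
        print(head, n_groups, n_elements)
```

Any headword with more than one group in this table is a candidate for merging.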