Open Kentoseth opened 3 years ago
OK, I think it's a great idea. I will work on it.
Salam, I added a new feature to the Alyahmor library to generate tags for a word form.
I added an option to get more details:
generator.generate_forms( word, word_type="noun", indexed=True, details=True)
for example:
>>> import alyahmor.genelex
>>> generator = alyahmor.genelex.genelex()
>>> word = u"كِتِاب"
>>> noun_forms = generator.generate_forms( word, word_type="noun", indexed=True, details=True)
>>> noun_forms
[{'vocolized': 'استعمل', 'semi-vocalized': 'استعمل', 'segmented': '-استعمل--', 'tags': '::'},
{'vocolized': 'استعملي', 'semi-vocalized': 'استعملي', 'segmented': '-استعمل--ي', 'tags': ':مضاف:'},
{'vocolized': 'استعملِي', 'semi-vocalized': 'استعملِي', 'segmented': '-استعمل--ي', 'tags': ':مضاف:'},
{'vocolized': 'استعملكِ', 'semi-vocalized': 'استعملكِ', 'segmented': '-استعمل--ك', 'tags': ':مضاف:'},
{'vocolized': 'استعملكَ', 'semi-vocalized': 'استعملكَ', 'segmented': '-استعمل--ك', 'tags': ':مضاف:'},
{'vocolized': 'استعملكِ', 'semi-vocalized': 'استعملكِ', 'segmented': '-استعمل--ك', 'tags': ':مضاف:'},
{'vocolized': 'استعملكُمُ', 'semi-vocalized': 'استعملكُمُ', 'segmented': '-استعمل--كم', 'tags': ':مضاف:'},
....]
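The detailed entries can also be narrowed down after generation. The following is only a minimal sketch, assuming the `tags` field is a colon-delimited string as the sample output above suggests. `split_tags` and `filter_by_tag` are hypothetical helpers, not part of Alyahmor, and the sample entries are copied from the output above.

```python
# Sample entries copied from the output above; a real call returns many more.
sample_forms = [
    {'vocolized': 'استعمل', 'semi-vocalized': 'استعمل', 'segmented': '-استعمل--', 'tags': '::'},
    {'vocolized': 'استعملي', 'semi-vocalized': 'استعملي', 'segmented': '-استعمل--ي', 'tags': ':مضاف:'},
    {'vocolized': 'استعملكَ', 'semi-vocalized': 'استعملكَ', 'segmented': '-استعمل--ك', 'tags': ':مضاف:'},
]


def split_tags(tag_string):
    """Split a colon-delimited tag string into its non-empty tags (assumed format)."""
    return [tag for tag in tag_string.split(':') if tag]


def filter_by_tag(forms, tag):
    """Keep only the entries whose 'tags' field contains the given tag."""
    return [form for form in forms if tag in split_tags(form.get('tags', ''))]


# Use the tag string exactly as it appears in the second sample entry.
wanted_tag = split_tags(sample_forms[1]['tags'])[0]
matching = filter_by_tag(sample_forms, wanted_tag)
for form in matching:
    print(form['segmented'], form['tags'])
```

The same helpers should work on the full list returned with `details=True`, provided the tag format matches the sample.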
wasalam,
This is a really great improvement.
Is there a way to slim down the results or is it meant to output so many different forms at once?
Salam, thank you. Do you mean reducing the number of generated word forms?
You can request a specific form by passing the affixes you want, like this:
>>> import alyahmor.genelex
>>> generator = alyahmor.genelex.genelex()
>>> word = u"كِتِاب"
>>> generator.generate_by_affixes( word, word_type="noun", affixes = [u"بال", u"", u"ين", u""])
['بِالْكِتَِابين']
>>> generator.generate_by_affixes( word, word_type="noun", affixes = [u"وك", u"", u"ِ", u""])
['وَكَكِتَِابِ']
>>> generator.generate_by_affixes( word, word_type="noun", affixes = [u"و", u"", u"", u""])
['وَكِتَِاب']
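When several affix combinations are needed, the calls above can be wrapped in a loop. This is a sketch only: `generate_selected` is a hypothetical helper, not part of Alyahmor, and it assumes `generate_by_affixes` takes the keyword arguments shown in the calls above and returns a list of strings.

```python
def generate_selected(generator, word, affix_sets, word_type="noun"):
    """Call generate_by_affixes once per affix list and collect the results.

    `generator` is expected to be an alyahmor.genelex.genelex() instance.
    Each item of `affix_sets` is a four-slot affix list like those shown above.
    Returns a dict mapping each affix tuple to the list of generated forms.
    """
    results = {}
    for affixes in affix_sets:
        key = tuple(affixes)  # lists are not hashable, so use a tuple as the key
        results[key] = generator.generate_by_affixes(
            word, word_type=word_type, affixes=list(affixes)
        )
    return results


# Usage (requires alyahmor to be installed):
# import alyahmor.genelex
# generator = alyahmor.genelex.genelex()
# wanted = [[u"بال", u"", u"ين", u""], [u"و", u"", u"", u""]]
# forms = generate_selected(generator, u"كِتِاب", wanted)
```

Only the requested combinations are generated, so the result stays small compared with the full `generate_forms` output.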
Or you can request only the reduced forms:
>>> import alyahmor.genelex
>>> generator = alyahmor.genelex.genelex()
>>> word = u"كِتِاب"
>>> noun_forms = generator.generate_forms( word, word_type="noun", indexed=True)
>>> noun_forms
{u'أككتابة': [u'أكَكِتَِابَةِ', u'أكَكِتَِابَةٍ'],
u'أوككتابة': [u'أَوَكَكِتَِابَةِ', u'أَوَكَكِتَِابَةٍ'],
u'وكتابياتهم': [u'وَكِتَِابياتهِمْ', u'وَكِتَِابِيَاتُهُمْ', u'وَكِتَِابِيَاتِهِمْ', u'وَكِتَِابِيَاتُهِمْ', u'وَكِتَِابياتهُمْ'],
u'وكتابياتهن': [u'وَكِتَِابياتهِنَّ', u'وَكِتَِابياتهُنَّ', u'وَكِتَِابِيَاتِهِنَّ', u'وَكِتَِابِيَاتُهِنَّ', u'وَكِتَِابِيَاتُهُنَّ'],
u'وللكتابات': [u'وَلِلْكِتَِابَاتِ', u'وَلِلْكِتَِابات'],
u'أبكتابتكن': [u'أَبِكِتَِابَتِكُنَّ'],
u'أبكتابتكم': [u'أَبِكِتَِابَتِكُمْ'],
u'أكتابياتهن': [u'أَكِتَِابياتهِنَّ', u'أَكِتَِابِيَاتِهِنَّ', u'أَكِتَِابياتهُنَّ', u'أَكِتَِابِيَاتُهُنَّ', u'أَكِتَِابِيَاتُهِنَّ'],
u'فكتاباتهم': [u'فَكِتَِاباتهِمْ', u'فَكِتَِابَاتُهُمْ', u'فَكِتَِابَاتُهِمْ', u'فَكِتَِاباتهُمْ', u'فَكِتَِابَاتِهِمْ'],
u'بكتابياتكن': [u'بِكِتَِابِيَاتِكُنَّ', u'بِكِتَِابياتكُنَّ'],
....
}
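The indexed result can be inspected with plain dictionary operations. A minimal sketch, assuming the structure shown above (each key is an unvocalized form mapping to a list of vocalized variants); `count_forms` and `lookup` are hypothetical helpers, and the sample entries are copied from the output above.

```python
# A few entries copied from the output above; a real call returns many more.
sample_indexed = {
    u'وللكتابات': [u'وَلِلْكِتَِابَاتِ', u'وَلِلْكِتَِابات'],
    u'أبكتابتكن': [u'أَبِكِتَِابَتِكُنَّ'],
    u'أبكتابتكم': [u'أَبِكِتَِابَتِكُمْ'],
}


def count_forms(indexed):
    """Return (number of unvocalized keys, total number of vocalized variants)."""
    return len(indexed), sum(len(variants) for variants in indexed.values())


def lookup(indexed, unvocalized):
    """Return the vocalized variants for one unvocalized form, or an empty list."""
    return indexed.get(unvocalized, [])


n_keys, n_variants = count_forms(sample_indexed)
print(n_keys, n_variants)

first_key = next(iter(sample_indexed))
print(first_key, lookup(sample_indexed, first_key))
```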
wasalam,
I will close this ticket now as the feature has been implemented.
Jazak Allahu khayran (may Allah reward you with good).
Salam, you gave me an idea. I think I will implement some variants of the generate_forms function:
wasalam,
If you are considering improving Alyahmor further, then my suggestions are:
Nouns
^ Something that confuses me in these inflected forms: the plural of كتاب is كُتُبٌ (book and books), but the feminine plural كِتابات means "writings; essays". Is كِتابات the feminine plural equivalent of كُتُبٌ, or is it the feminine plural of another word that isn't كتاب?
Source for noun lookups: http://www.aratools.com/
I think for verbs you already display everything (except translations), so verbs just need filters so that they can display results like this website (up to the participles):
https://cooljugator.com/ar/%D8%B9%D9%85%D9%84
So verbs would have the: tense/mood/participle, Arabic with harakat, English translation
This is a lot of work, so please only consider these improvements if you have the capacity. These ideas will be useful for Arabic learners as a dictionary reference, instead of using Hans Wehr.
wasalam,
Salam,
> If you are considering improving Alyahmor further, then my suggestions are:
> 1. Nouns
> * Showing singular, dual and plural (with translations) - I think you referred to them as inflected forms above
Translation, I think, will be another project.
> * Identifying the root
Alyahmor uses the Arramooz dictionary project, so we can add roots.
> * Identifying the Part of Speech
I propose exposing the tags as attributes of the word form. I use the Mysam project to generate the POS.
> ^ Something that confuses me in these inflected forms: the plural of كتاب is كُتُبٌ (book and books), but the feminine plural كِتابات means "writings; essays". Is كِتابات the feminine plural equivalent of كُتُبٌ, or is it the feminine plural of another word that isn't كتاب?
The word كتابات is just an example; it is not a plural form of كتاب.
> Source for noun lookups: http://www.aratools.com/
OK.
> 2. Verbs
> I think for verbs you already display everything (except translations), so verbs just need filters so that they can display results like this website (up to the participles):
We have another project to handle verb conjugation: the Qutrub project, in its GitHub repo.
> So verbs would have the: tense/mood/participle, Arabic with harakat, English translation
No translation.
> This is a lot of work, so please only consider these improvements if you have the capacity. These ideas will be useful for Arabic learners as a dictionary reference, instead of using Hans Wehr.
I hope to do this.
Salam,
In this lib, the output currently looks like:
Instead of showing just 0/1 (True/False), can you display the actual results? For example, for the verb 'ضَرَب', can you display the past/future/imperative/passive forms?
Another example: for the noun 'كتاب', show the singular/dual/plural/broken-plural forms.
Your other library, https://github.com/linuxscout/alyahmor, already generates the verb/noun forms, but it doesn't show the past/future/imperative for verbs, and it doesn't show some of the noun options either.
If this feature request is more suited to https://github.com/linuxscout/alyahmor , please add it there instead of this library.