CAMeL-Lab / camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
MIT License
413 stars, 73 forks

Rewrite rule broken #138

Open christios opened 10 months ago

christios commented 10 months ago

https://github.com/CAMeL-Lab/camel_tools/blob/b496501590ee0753eeb3686037fffeb12f4c80d2/camel_tools/morphology/utils.py#L76

Hello,

This CAPHI rewrite rule seems to be broken. I fixed it by changing the pattern to u'u\\_w-(\\+[^iau]+|$)', which moves the \\+ inside the alternation group. The + generally comes from a suffix, so it should be optional: with this change it is only required when a suffix follows, and the end-of-string branch matches without it. I tested the new regex on PATB and it gives the expected results.
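
For anyone who wants to check the behaviour quickly, here is a minimal sketch that runs the proposed pattern on a few strings. The sample inputs are made up for illustration (they are not taken from PATB or from camel_tools), and the helper name `match_proposed` is hypothetical; only the pattern itself comes from this issue.

```python
import re

# Pattern exactly as proposed above, in Python string-literal form.
PROPOSED = re.compile(u'u\\_w-(\\+[^iau]+|$)')


def match_proposed(caphi):
    """Return the substring matched by the proposed pattern, or None."""
    m = PROPOSED.search(caphi)
    return m.group(0) if m else None


# Hypothetical inputs, invented for illustration only.
samples = [
    'k_a_t_a_b_u_w-',  # no suffix: the end-of-string branch applies
    'u_w-+h_u',        # '+' followed by characters other than i/a/u
    'u_w-+a',          # '+' followed directly by a vowel
    'u_w-h',           # no '+' and not at end of string
]

results = {s: match_proposed(s) for s in samples}
for sample, matched in results.items():
    print(repr(sample), '->', repr(matched))
```

With the + inside the group, the first sample matches through the `$` branch and the second through the `\+[^iau]+` branch, while the last two do not match at all.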