dkpro / dkpro-core

Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core

Lancaster Stemmer custom rules configuration #1001

Closed by mjunsilo 7 years ago

mjunsilo commented 7 years ago

The SMILE Lancaster stemmer now supports specifying custom rules. This feature adds a custom rules configuration parameter to the DKPro LancasterStemmer UIMA wrapper.
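
To illustrate the idea, here is a minimal sketch of how an optional custom rules parameter might be resolved inside a wrapper: if no location is configured, the stemmer keeps its built-in rules; otherwise the configured file is opened as a stream. The class `StemmerRuleSource` and its method names are hypothetical and are not taken from DKPro Core or SMILE. The sketch also assumes the rules live in a plain file on disk, which may not match the actual commit.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical helper (not part of DKPro Core or SMILE). */
final class StemmerRuleSource {

    private StemmerRuleSource() {
        // static helper, no instances
    }

    /** Returns true when a non-blank custom rules location was configured. */
    static boolean isCustom(String location) {
        return location != null && !location.trim().isEmpty();
    }

    /**
     * Opens the custom rules file if one was configured.
     * An empty Optional means "use the stemmer's built-in rules".
     * The caller is responsible for closing the returned stream.
     */
    static Optional<InputStream> open(String location) throws IOException {
        if (!isCustom(location)) {
            return Optional.empty();
        }
        Path path = Paths.get(location.trim());
        if (!Files.isReadable(path)) {
            // Fail early with a clear message instead of stemming with the wrong rules.
            throw new IOException("Custom rules file is not readable: " + path);
        }
        return Optional.of(Files.newInputStream(path));
    }
}
```

A wrapper could call `StemmerRuleSource.open(...)` once during initialization and hand the stream to the stemmer only when it is present, so that leaving the parameter unset preserves the previous behavior.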

mjunsilo commented 7 years ago

Initial version can be previewed here:

https://github.com/mjunsilo/dkpro-core/commit/d9f8cf875b45b1eb306e3d1fa1b22b99cd7c7f81

Waiting for version 1.2.2 of the SMILE API to be publicly available before making the pull request. This should happen soon.

reckart commented 7 years ago

@mjunsilo looks like SMILE is already at version 1.3.1 (https://github.com/haifengl/smile). Are you still planning on doing a PR on this one?

mjunsilo commented 7 years ago

We merged the latest changes to the Lancaster stemmer with 1.2.2 a while ago, so this would only be an upgrade to the latest version of the library. I can have a look at it this week and do a PR if all tests pass. Just in case, I will check whether other modules have also added this dependency, so that the upgrade doesn't break anything.

mjunsilo commented 7 years ago

I guess it would be overkill to open a new issue for this, so I can just commit the version update directly to master if you prefer. All tests pass.

reckart commented 7 years ago

Please go ahead :)

mjunsilo commented 7 years ago

Closing the issue. Just committed a final update of smile-nlp to version 1.3.1.