biolab / orange3-text

🍊 📄 Text Mining add-on for Orange3

[ENH] Filter - Use ISO language in StopwordsFilter #1024

Closed PrimozGodec closed 1 year ago

PrimozGodec commented 1 year ago
Issue

This PR is part of #963, which I am splitting into smaller pieces for easier review. The main motivation is to make Preprocess work with the language stored in Corpus.

Description of changes

This PR prepares the stop word filter to exchange languages (both accepting and returning them) as ISO codes. This is necessary for using the language from Corpus, where languages are stored in ISO format.

After changing the stop word filter to work with ISO language codes, I also had to adapt the Preprocess widget to store its settings as ISO codes and to call StopwordsFilter with an ISO language code.
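
For readers unfamiliar with the idea, here is a minimal sketch of a stop word filter that is keyed by ISO codes, together with a helper that converts a language name stored in older settings to its ISO code. It is an illustration only, not the code in this PR: the class `IsoStopwordsFilter`, the helper `to_iso`, the mapping tables and the tiny word lists are all hypothetical, and whether the widget migrates old settings this way is an assumption.

```python
# Illustrative sketch only; names and data below are hypothetical,
# not the orange3-text implementation.

# Assumed mapping between ISO 639-1 codes and human-readable names.
ISO_TO_NAME = {"en": "English", "sl": "Slovenian", "de": "German"}
NAME_TO_ISO = {name: iso for iso, name in ISO_TO_NAME.items()}

# Tiny stand-in stop word lists, keyed by ISO code.
STOPWORDS = {
    "en": {"the", "a", "and", "of"},
    "sl": {"in", "je", "na"},
}


class IsoStopwordsFilter:
    """Stop word filter that accepts and reports languages as ISO codes."""

    def __init__(self, language="en"):
        if language not in STOPWORDS:
            raise ValueError(f"Unsupported ISO language code: {language!r}")
        self.language = language
        self._stopwords = STOPWORDS[language]

    @staticmethod
    def supported_languages():
        # Returned as ISO codes so callers can match them against
        # the language stored on a corpus.
        return set(STOPWORDS)

    def __call__(self, tokens):
        # Keep only tokens that are not stop words (case-insensitive).
        return [t for t in tokens if t.lower() not in self._stopwords]


def to_iso(value):
    """Convert a stored setting (ISO code or full name) to an ISO code."""
    if value in ISO_TO_NAME:
        return value
    return NAME_TO_ISO[value]  # raises KeyError for unknown names


filtered = IsoStopwordsFilter(to_iso("English"))(["The", "cat", "and", "a", "dog"])
```

Keying everything by ISO code means the widget can pass a corpus language straight to the filter without an intermediate name lookup.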

Includes
codecov-commenter commented 1 year ago

Codecov Report

Merging #1024 (3b5004f) into master (87a7580) will increase coverage by 0.08%. The diff coverage is 96.22%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master    #1024      +/-   ##
==========================================
+ Coverage   82.10%   82.19%   +0.08%
==========================================
  Files          93       93
  Lines       12257    12292      +35
  Branches     1660     1668       +8
==========================================
+ Hits        10064    10103      +39
+ Misses       1881     1879       -2
+ Partials      312      310       -2
```