snowballstem / snowball

Snowball compiler and stemming algorithms
https://snowballstem.org/
BSD 3-Clause "New" or "Revised" License

add Ukrainian lang #144

Closed tggo closed 1 year ago

ojwb commented 3 years ago

Looks like you pushed this branch before the corresponding snowball-data branch (the instructions do say to push snowball-data first) and the CI failed because the new testdata wasn't there yet.

I'll trigger it to rerun.

ojwb commented 3 years ago

@tggo The tests currently fail due to differences between the expected output in your snowball-data PR and what your algorithm actually produces. From a quick look, all of the differences involve apostrophes. Please can you resolve this?

All languages are failing except Pascal (which passes because it currently only supports stemmers which can work in ISO-8859-1) and Ada (which was passing due to a bug in the make rule to run its tests - I've pushed a fix for that in 8b49140e5cb6675c58bd908dbd4c7e12146a4f72).
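
For anyone wanting to triage this kind of failure locally, here is a rough sketch (not the project's actual test harness) of comparing a stemmer's output against an expected list and checking whether the mismatches involve apostrophes. `find_mismatches`, `involves_apostrophe` and `toy_stem` are hypothetical names, the set of apostrophe characters is an assumption, and the sample words are made up rather than taken from snowball-data.

```python
# Characters treated as apostrophes here (an assumption for illustration):
# ASCII apostrophe, right single quotation mark, modifier letter apostrophe.
APOSTROPHES = "'\u2019\u02bc"


def find_mismatches(words, expected, stem):
    """Return (word, expected_stem, actual_stem) for every disagreement."""
    mismatches = []
    for word, want in zip(words, expected):
        got = stem(word)
        if got != want:
            mismatches.append((word, want, got))
    return mismatches


def involves_apostrophe(mismatch):
    """True if any field of the mismatch contains an apostrophe character."""
    return any(ch in field for field in mismatch for ch in APOSTROPHES)


def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: it truncates the word at the
    # first ASCII apostrophe, which is the sort of bug being illustrated.
    return word.split("'")[0]


# Made-up sample data, not real snowball-data contents.
words = ["п'ять", "мова"]
expected = ["п'ят", "мова"]

mismatches = find_mismatches(words, expected, toy_stem)
for word, want, got in mismatches:
    flag = "apostrophe" if involves_apostrophe((word, want, got)) else "other"
    print(f"{word}: expected {want!r}, got {got!r} [{flag}]")
```

In practice the two lists would be read from the vocabulary and expected-output files for the language, and `stem` would call the generated stemmer.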

ojwb commented 3 years ago

With the apostrophe definition fix, CI passes for everything except csharp, which is currently broken in master too. We have other wide-character Unicode programming languages passing, so I don't expect this hides a problem, but I'll try to check it locally and merge the branch if it is indeed OK.

stefanvodita commented 2 years ago

What would it take to have this merged?

ojwb commented 1 year ago

@stefanvodita Thanks for your interest.

I've fixed CI on master and pushed a fix for a minor nit here (the new modules.txt entry wasn't in sorted order) which has demonstrated that the C# failure wasn't anything to do with these changes. This PR looks good now but needs the other two to be ready as well - I'll make some comments on those and @-mention you.
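
As an illustration of the sorted-order nit, a minimal sketch of a check that could catch it before pushing. It assumes each non-comment line of modules.txt starts with the algorithm name followed by whitespace, and that `#` starts a comment; `first_unsorted` is a hypothetical helper, not something in the repository, and the sample lines are only illustrative.

```python
def first_unsorted(lines):
    """Return the first entry name that sorts before its predecessor, or None.

    Blank lines and lines starting with '#' are skipped (assumed layout).
    """
    names = [
        line.split()[0]
        for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    ]
    for prev, cur in zip(names, names[1:]):
        if cur < prev:
            return cur
    return None


# Illustrative lines in the assumed "name encodings aliases" layout.
sample = [
    "# comment line",
    "tamil      UTF_8  tamil,ta,tam",
    "ukrainian  UTF_8  ukrainian,uk,ukr",
    "turkish    UTF_8  turkish,tr,tur",
]
print(first_unsorted(sample))
```

Here the check reports `turkish`, because it appears after `ukrainian`; a file whose entries are already in order gives `None`.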

ojwb commented 1 year ago

I'm going to close this in favour of #178 as that's been actively worked on.