Open meisyal opened 3 years ago
To fix unit test number 1, we will replace the word "apatah" with a word such as "manatah" or "siapatah". Neither of these words exists in the default dictionary. Then, we will move "apatah" to the unit test that uses a custom dictionary.
For unit test number 2, we will do the same as in the previous comment. "belikan" will be replaced with another word that doesn't exist in the default dictionary, for example "abaikan", "hijaukan", or "ramaikan".
"belikan" has two meanings. The first is to buy something and the second is a field in the forest. This Ruby gem can't distinguish homographs. This limitation of the gem should be documented later.
For unit tests number 4 and 6, we will also move the words "rerata", "lelembut", "idealis", and "idealisme" to the unit test that uses a custom dictionary.
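The homograph problem with "belikan" can be illustrated with a small sketch. This is not the gem's code: `ROOT_WORDS`, `DERIVATION_SUFFIXES`, and `possible_roots` are hypothetical names, and the word list is a tiny made-up stand-in for the default dictionary. It only shows that a dictionary alone allows two readings of the same string, while a stemmer has to return one.

```ruby
require "set"

# Hypothetical toy dictionary; the real default dictionary is far larger.
ROOT_WORDS = Set.new(%w[beli belikan abai hijau ramai]).freeze

# Suffixes from the "-i, -kan, -an" test group.
DERIVATION_SUFFIXES = %w[kan an i].freeze

# Returns every reading of `word` that the toy dictionary allows:
# the word itself if it is a root word, plus each root + suffix split.
def possible_roots(word)
  readings = []
  readings << word if ROOT_WORDS.include?(word)
  DERIVATION_SUFFIXES.each do |suffix|
    next unless word.end_with?(suffix)

    base = word.delete_suffix(suffix)
    readings << base if ROOT_WORDS.include?(base)
  end
  readings
end

p possible_roots("belikan") # two readings: the root word itself and "beli"
p possible_roots("abaikan") # only one reading: "abai"
```

Words such as "abaikan" have a single reading, which is why they are safer choices for a test that runs against the default dictionary.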
Progress checklist:
As of commit d0c6ae4, we still have six failing unit tests. These unit tests use the default dictionary (Kateglo). Let's break them down one by one:
1. Unit test failed to stem "-lah, -kah, -tah, -pun" suffixes
   This happened because the stemmer failed to stem the word "apatah". "apatah" exists in the default dictionary, so it is treated as a root word and is not stemmed.
2. Unit test failed to stem "-i, -kan, -an" suffixes
   This happened because the stemmer failed to stem the word "belikan". The cause is the same as in the previous point.
3. Unit test failed to stem loop last return of enhanced confix stripping
   This happened because the stemmer failed to stem the words "menerangi", "berimanlah", and "memuaskan". The cause needs further investigation.
4. Unit test failed to stem modified enhanced confix stripping with infix
   This happened because the stemmer failed to stem the words "rerata" and "lelembut". Both words exist in the default dictionary.
5. Unit test failed to remove prefix recursively
   This happened because the stemmer failed to stem the word "kesepersepuluhnya". The cause needs further investigation.
6. Unit test failed to stem adopted foreign suffixes
   This happened because the stemmer failed to stem the words "idealis" and "idealisme". Both words exist in the default dictionary.
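The dictionary-related failures share one mechanism: a word found in the dictionary is returned as is. Below is a minimal sketch of that behavior, assuming a lookup-first design. `ToyStemmer` and `PARTICLES` are hypothetical and only handle particle suffixes; this is not the gem's actual implementation, and the word lists are made up for illustration.

```ruby
require "set"

# Particles from the "-lah, -kah, -tah, -pun" test group.
PARTICLES = %w[lah kah tah pun].freeze

# Hypothetical stand-in for a stemmer; it only strips particle suffixes.
class ToyStemmer
  def initialize(root_words)
    @root_words = Set.new(root_words)
  end

  def stem(word)
    # A word found in the dictionary is treated as a root word and returned unchanged.
    return word if @root_words.include?(word)

    PARTICLES.each do |particle|
      next unless word.end_with?(particle)

      base = word.delete_suffix(particle)
      return base if @root_words.include?(base)
    end
    word
  end
end

# Made-up word lists: the first contains "apatah", the second does not.
default_like = ToyStemmer.new(%w[apa apatah siapa mana])
custom       = ToyStemmer.new(%w[apa siapa mana])

puts default_like.stem("apatah")   # stays "apatah" because it is in the dictionary
puts default_like.stem("siapatah") # stemmed to "siapa"
puts custom.stem("apatah")         # stemmed to "apa" with the custom dictionary
```

Under this assumption, a test word either has to be absent from the dictionary in use, or the test has to supply a custom dictionary that leaves the word out.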