andyjessen closed 2 years ago
Thanks for the contribution! Is this motivated by some real cases you've run into? I don't quite remember whether we had any particular reason for the current length constraint, or it was more of an arbitrary choice. I'm always a bit hesitant to make subtle changes to default behavior.
No, I don't have a real case. I noticed that the code doesn't match the comment above it. It also doesn't match the algorithm from the 2003 paper:
"Short forms are considered valid candidates only if they consist of at most two words, their length is between two to ten characters, at least one of these characters is a letter, and the first character is alphanumeric."
I believe it actually still doesn't match the 2003 paper (which is fine). The paper says: "Short forms are considered valid candidates only if they consist of at most two words, their length is between 2 to 10 characters, at least one of these characters is a letter, and the first character is alphanumeric."
We allow each individual word to be between length 2 and 10, whereas the original requires the whole sequence to be between length 2 and 10. In the absence of a compelling reason to change the default behavior, I'm inclined to just update the comment to reflect what the code does. Does that seem reasonable?
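For anyone reading along, here is a minimal sketch of the two readings contrasted above. The function names `paper_rule` and `per_word_rule` are hypothetical and are not taken from the library. The sketch assumes whitespace tokenization and that the paper's length counts the whole string, including the space between words. The real filter may differ in both respects.

```python
def paper_rule(short_form: str) -> bool:
    """Rule as quoted from the 2003 paper: the WHOLE short form is 2-10 characters.

    Assumption: length is measured on the full string, spaces included.
    """
    words = short_form.split()
    return (
        0 < len(words) <= 2                          # at most two words
        and 2 <= len(short_form) <= 10               # whole sequence length
        and any(ch.isalpha() for ch in short_form)   # at least one letter
        and short_form[0].isalnum()                  # first char alphanumeric
    )


def per_word_rule(short_form: str) -> bool:
    """Variant described in this thread: EACH word is 2-10 characters."""
    words = short_form.split()
    return (
        0 < len(words) <= 2
        and all(2 <= len(w) <= 10 for w in words)    # per-word length
        and any(ch.isalpha() for ch in short_form)
        and short_form[0].isalnum()
    )


if __name__ == "__main__":
    for candidate in ["HMG CoA", "abcdefgh ijklmnop", "a"]:
        print(candidate, paper_rule(candidate), per_word_rule(candidate))
```

Under these assumptions the two rules only disagree on two-word candidates whose combined length exceeds 10, or where one word is a single character, which is probably why the difference rarely shows up in practice.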
Yes. That sounds good.
The short form filter omits words of length 10. Consider making the upper bound inclusive of 10.
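The off-by-one described here can be illustrated with a small sketch. Both helper names are hypothetical and the comparisons are an assumption about the shape of the check, not a copy of the library's code.

```python
def length_ok_exclusive(word: str) -> bool:
    # Assumed shape of the original check: the upper bound excludes 10.
    return 2 <= len(word) < 10


def length_ok_inclusive(word: str) -> bool:
    # Proposed shape: the upper bound includes 10.
    return 2 <= len(word) <= 10


if __name__ == "__main__":
    ten_chars = "a" * 10
    print(length_ok_exclusive(ten_chars), length_ok_inclusive(ten_chars))
```

The two checks differ only for words of exactly 10 characters.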