ses4j closed this issue 3 years ago
I believe that's tripping the heuristic that a string of all consonants is most likely an acronym, and therefore should be capitalized.
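To make that concrete, here is a rough sketch of how a consonants-only heuristic could behave. This is an illustrative re-creation under my own assumptions, not the library's actual code; the names `CONSONANTS_ONLY`, `guess_case`, and `naive_titlecase` are hypothetical.

```python
import re

# Hypothetical pattern: a word made up only of consonants (no a, e, i, o, u, y).
CONSONANTS_ONLY = re.compile(r"^[b-df-hj-np-tv-xz]+$", re.IGNORECASE)


def guess_case(word: str) -> str:
    """Upper-case a word that looks like an acronym, else capitalize it."""
    if CONSONANTS_ONLY.match(word):
        # No vowels at all, so the heuristic assumes an acronym.
        return word.upper()
    return word.capitalize()


def naive_titlecase(text: str) -> str:
    """Apply guess_case to each whitespace-separated word."""
    return " ".join(guess_case(word) for word in text.split())


print(naive_titlecase("pcl bcl bct data"))
```

Under this sketch, any short vowel-free token is upper-cased whether or not it is really an acronym, which would explain the behavior reported here.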
Closing, as I don't think there's an actual issue here -- feel free to re-open if appropriate.
I see. That makes sense, but also that seems like a heuristic that will cause more harm than good. Can it be controlled or disabled? Maybe documented? The README says "The filter employs some heuristics to guess abbreviations that don't need conversion." but this is guessing acronyms that do, which is different and causes us quite a bit of trouble.
At the end of the day, a regex-plus-heuristics approach is always going to be imperfect. The wordlist feature should hopefully provide a reasonable escape hatch for domain-specific acronyms.
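As a rough illustration of the escape-hatch idea, the sketch below checks each word against a user-supplied list before any default casing is applied. It is self-contained and does not call the library; `make_wordlist_override` and `titlecase_with_overrides` are hypothetical names, and the default casing here is a plain `capitalize()` stand-in rather than the real filter.

```python
from typing import Callable, Iterable, Optional


def make_wordlist_override(words: Iterable[str]) -> Callable[[str], Optional[str]]:
    """Build a lookup that returns the preferred spelling of a listed word."""
    lookup = {word.upper(): word for word in words}

    def override(word: str) -> Optional[str]:
        # Case-insensitive match; None means "no opinion, use the default".
        return lookup.get(word.upper())

    return override


def titlecase_with_overrides(text: str, override: Callable[[str], Optional[str]]) -> str:
    """Capitalize each word unless the override supplies a spelling."""
    result = []
    for word in text.split():
        fixed = override(word)
        result.append(fixed if fixed is not None else word.capitalize())
    return " ".join(result)


override = make_wordlist_override(["Pcl", "TCP"])
print(titlecase_with_overrides("pcl over tcp", override))
```

The point of the design is that the list wins over any guess, so a domain term keeps exactly the spelling the user gave it.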
Why are PCL, BCL, and BCT made into all-caps when I run this:
I have no external wordlist file that I'm aware of.