elijah-potter / harper

The Grammar Checker for Developers
https://writewithharper.com
Apache License 2.0

bug: a/an choice determined by letter consonant not phonetic consonant in acronyms #72

Closed by grantlemons 1 month ago

grantlemons commented 5 months ago

Example: "an LLM" is valid, but Harper suggests "a LLM" because L is a consonant, even though phonetically its name begins with a vowel sound.

Probably not fixable.

elijah-potter commented 5 months ago

Actually, what's happening here is that the classifier doesn't include rules for capital letters. It is fixable--and I'm on it.

grantlemons commented 5 months ago

Just to summarize what I took from our discussion the other day: the issue you've pointed out with not considering capital letters is a different, though related, issue.

This one is specifically about how the names of letters in initialisms sometimes begin with a vowel sound even when the letter itself is a consonant. For example, the letter L is pronounced [el], so initialisms beginning with L begin phonetically with a vowel and take "an", not "a".

When we discussed this, I think you said you would just treat all capital letters this way (I don't really remember), but that isn't really the right approach either. The name of C, for example, does not begin phonetically with a vowel.

You could resolve this by just hard coding by the letter, but that feels a bit jank.

Agent-E11 commented 3 months ago

I recently came across this as well. ("a/an SSH connection")

I don't know of any way to fix it other than hard-coding "weird-named letters":

Consonants with vowel sounds: F, H, L, M, N, R, S, X
Vowels with consonant sounds: U, Y(?)
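The hard-coded table above could be sketched roughly like this. This is only an illustration, not Harper's actual code: the function names `letter_name_starts_with_vowel_sound` and `article_for_initialism` are made up for the example, and it assumes the initialism is read letter by letter using the standard English letter names (with H read as "aitch").

```rust
/// Returns true if the spoken name of `letter` starts with a vowel sound.
/// Hypothetical helper: the table is the vowels A, E, I, O plus the
/// consonants listed above (F, H, L, M, N, R, S, X). U ("you") and
/// Y ("wye") start with a consonant sound, so they are left out.
fn letter_name_starts_with_vowel_sound(letter: char) -> bool {
    matches!(
        letter.to_ascii_uppercase(),
        'A' | 'E' | 'F' | 'H' | 'I' | 'L' | 'M' | 'N' | 'O' | 'R' | 'S' | 'X'
    )
}

/// Picks "a" or "an" for an initialism that is spelled out letter by letter.
/// Returns None if the text does not start with an ASCII letter.
fn article_for_initialism(initialism: &str) -> Option<&'static str> {
    let first = initialism.chars().next()?;
    if !first.is_ascii_alphabetic() {
        return None;
    }
    if letter_name_starts_with_vowel_sound(first) {
        Some("an")
    } else {
        Some("a")
    }
}
```

With this table, `article_for_initialism("SSH")` and `article_for_initialism("LLM")` give "an", while "CPU" and "URL" give "a". It would still get acronyms that are read as words wrong (for example "NASA"), since those follow ordinary pronunciation rather than letter names.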

elijah-potter commented 3 months ago

I've committed a temporary fix--explicitly treating apparent acronyms/initialisms in a different way. If you compile from master you can give it a go.
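The commit itself isn't shown in this thread, so the following is only a guess at what "treating apparent acronyms/initialisms in a different way" might look like, not the actual change. It assumes a token counts as an apparent initialism when it has two or more characters, all ASCII uppercase letters; the `Reading` enum and `classify` function are hypothetical names.

```rust
/// How a checker might decide to read a token aloud (hypothetical type).
#[derive(Debug, PartialEq)]
enum Reading {
    /// Spell the token out letter by letter, e.g. "L-L-M".
    LetterByLetter,
    /// Pronounce the token as an ordinary word.
    AsWord,
}

/// Heuristic classification: a token of two or more characters made up
/// entirely of ASCII uppercase letters is treated as an apparent initialism.
fn classify(token: &str) -> Reading {
    let all_caps = token.chars().all(|c| c.is_ascii_uppercase());
    if token.chars().count() >= 2 && all_caps {
        Reading::LetterByLetter
    } else {
        Reading::AsWord
    }
}
```

A token classified as `LetterByLetter` would then have its article chosen from the name of its first letter, while everything else would go through the normal a/an rules. The heuristic is deliberately crude: it flags "SSH" and "LLM" but also all-caps words that are pronounced as words.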

elijah-potter commented 1 month ago

I think this one's good to go. Reopen if there are any further issues.