mholtzscher / syllapy

Calculate syllable count for English words.
MIT License

Best way to give you new syllable count exceptions? #92

Open harrisj opened 3 years ago

harrisj commented 3 years ago

I have been doing some spot checks and have about 445 additional syllable-count exceptions that I could add to your existing exceptions file. I realize, though, that a PR that size could be really frustrating to review, especially if you don't want to accept all of them. Is there a preferred way for me to contribute these additions?

  1. One big PR with all the changes
  2. Staggered across several PRs, perhaps alphabetically?
  3. Let you figure out what exceptions you want to add and not add (you can always check my list of exceptions)
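For option 2, one way to prepare the staggered PRs is sketched below. It is only an illustration: it assumes both the existing exceptions and the proposed ones can be loaded as plain word-to-count dictionaries, and the helper names `new_exceptions` and `alphabetical_batches` are hypothetical, not part of syllapy.

```python
from itertools import groupby


def new_exceptions(existing, proposed):
    """Return proposed entries that are missing from, or disagree with, existing."""
    return {
        word.lower(): count
        for word, count in proposed.items()
        if existing.get(word.lower()) != count
    }


def alphabetical_batches(exceptions):
    """Group exceptions by first letter, giving one candidate PR per letter."""
    ordered = sorted(exceptions.items())
    return {
        letter: dict(group)
        for letter, group in groupby(ordered, key=lambda item: item[0][0])
    }


# Hypothetical sample data; the real lists would be read from files.
existing = {"marked": 1}
proposed = {"marked": 1, "poised": 1, "graves": 1, "gives": 1, "journalism": 4}

batches = alphabetical_batches(new_exceptions(existing, proposed))
for letter, batch in batches.items():
    print(letter, batch)
```

Entries that already match the existing file are dropped first, so each batch contains only words the maintainer actually has to review.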

I also wanted to share that a few patterns recur among the miscounts, in case that's useful for your algorithm (many of them look like special cases):

  1. Past-tense words that end in -sed or -ked like poised or marked are often coded as 2 syllables
  2. Words that end in e and take an -s, like graves or gives
  3. Words that end in -ism like journalism or socialism seem to undercount the last syllable
  4. Words that end in -ly seem to not count the adverb syllable
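The four patterns above can be used to sort a list of miscounted words into groups, which may make a long exception list easier to review. The sketch below is a rough illustration: the regular expressions are my own approximation of the patterns (the "e + s" rule in particular will also catch words like "buses"), and `classify` and `group_by_pattern` are hypothetical helpers, not syllapy functions.

```python
import re

# Approximate suffix tests for the four recurring patterns; checked in order.
PATTERNS = {
    "-sed/-ked past tense": re.compile(r"(sed|ked)$"),
    "e + s": re.compile(r"[^aeiou]es$"),
    "-ism": re.compile(r"ism$"),
    "-ly": re.compile(r"ly$"),
}


def classify(word):
    """Return the label of the first pattern the word matches, else 'other'."""
    for label, pattern in PATTERNS.items():
        if pattern.search(word):
            return label
    return "other"


def group_by_pattern(words):
    """Bucket words by pattern label, preserving input order within a bucket."""
    groups = {}
    for word in words:
        groups.setdefault(classify(word.lower()), []).append(word)
    return groups


words = ["poised", "marked", "graves", "gives", "journalism", "quickly", "hour"]
for label, members in group_by_pattern(words).items():
    print(f"{label}: {', '.join(members)}")
```

Words that land in "other" are the genuinely one-off exceptions, while the labeled groups are candidates for a rule in the counting algorithm rather than individual entries.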

I realize this is controversial, but I count hour as 2 syllables, for instance. I don't know if everybody does.

aeonsablaze commented 3 years ago

I also have a few exceptions I have identified while using this module (although nowhere near as many as the OP) and would like to know how to go about submitting them.

On the hour issue, it's largely a regional distinction, so the rather unhelpful answer is that there is no right answer. Hooray for English.