jamestomasino / stutter

RSVP for browsers
https://addons.mozilla.org/en-US/firefox/addon/stutter/
GNU General Public License v3.0
136 stars 11 forks

thought on splitting words #45

Open thiswillbeyourgithub opened 4 years ago

thiswillbeyourgithub commented 4 years ago

Hi,

Currently the idea is that words longer than X characters should be split / hyphenated.

I was wondering what you would think about this instead: if a word is more than 13 characters long*, add 0.2 of the normal display time for each additional character. This way a 15-character word would be shown for 1.4 times the normal duration, and 20-character words like "counterrevolutionary", "electroencephalogram" and "uncharacteristically" would stay on screen for 2.4 times the normal duration.

I say .2 but it could just as well be .1. Or better yet: .1 per character between 13 and 20 characters and .2 per character above that, as very long words take more time to understand.
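The arithmetic of this proposal can be sketched in a few lines of Python. This is only an illustration and not code from stutter: the function names, the base multiplier of 1.0, and the reading of "between 13 and 20" as characters 14 through 20 are all assumptions.

```python
def flat_multiplier(word, threshold=13, step=0.2):
    """Display-time multiplier: 1.0 plus `step` per character beyond `threshold`."""
    extra = max(0, len(word) - threshold)
    return 1.0 + step * extra


def tiered_multiplier(word, threshold=13, tier_end=20, low_step=0.1, high_step=0.2):
    """Tiered variant: `low_step` per character from threshold+1 up to tier_end,
    then `high_step` per character beyond tier_end (assumed interpretation)."""
    length = len(word)
    low_chars = max(0, min(length, tier_end) - threshold)
    high_chars = max(0, length - tier_end)
    return 1.0 + low_step * low_chars + high_step * high_chars


if __name__ == "__main__":
    for word in ("accommodation", "counterrevolutionary"):
        # Formatted to one decimal place to hide floating-point noise.
        print(word, f"{flat_multiplier(word):.1f}", f"{tiered_multiplier(word):.1f}")
```

Under these assumptions a 13-character word keeps a multiplier of 1.0 in both variants, while a 20-character word gets 2.4 with the flat rule and 1.7 with the tiered one.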

This way, super long words would stay on screen a long time (useful for German, for example), and more so than "regular long words".

Sure, it would "break the flow" of speed reading, but hyphenating "counterrevolutionary" can be really disorienting.

In effect, this would remove the idea of hyphenating long words and instead keep them on display for a longer period.

*like astonishingly, accommodation, cartographers

jamestomasino commented 4 years ago

I'm working on alternate parsing algorithm support for other languages, and that might be a good fit for Germanic ones that concatenate into larger words. For English, the performance metrics I read about suggested that the cognitive delay in understanding a word longer than 13 characters was greater than when the word is broken up.

thiswillbeyourgithub commented 4 years ago

But where does your program add the hyphen?

Example:

counterrevolutionary <- long to read, but a known word
coun-terrevolutionary <- confusing
coun-terrevolut-ionary <- confusing
counterre-voluti-onary <- confusing
etc.

Does your program usually add the hyphen between doubled letters? counter-revolutionary <- perfect
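The doubled-letter idea raised here can be illustrated with a short Python sketch. It is not how stutter splits words; `split_at_doubled_letters` is a hypothetical helper that breaks a word between every pair of identical adjacent letters.

```python
def split_at_doubled_letters(word):
    """Split `word` between each pair of identical adjacent letters."""
    parts = []
    start = 0
    for i in range(1, len(word)):
        # A boundary falls between two identical neighbouring letters.
        if word[i].isalpha() and word[i].lower() == word[i - 1].lower():
            parts.append(word[start:i])
            start = i
    parts.append(word[start:])
    return parts


if __name__ == "__main__":
    print("-".join(split_at_doubled_letters("counterrevolutionary")))  # counter-revolutionary
    print("-".join(split_at_doubled_letters("accommodation")))  # ac-com-modation
```

The second example shows a weakness of the rule used alone: it can leave very short fragments such as "ac", so it would probably work better as one heuristic among several.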

jamestomasino commented 4 years ago

The splitting strategy right now tries a whole series of things, starting with known prefixes and suffixes, then common vowel/consonant patterns. I can look into double consonants as well and see how that performs.
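A cascade like the one described above might look roughly like the following Python sketch. It is not stutter's actual code: the affix lists, the 13-character limit as a default, and the choice of a vowel-consonant-consonant-vowel pattern as the fallback are assumptions made for illustration.

```python
# Illustrative affix lists only; a real splitter would need far larger ones.
PREFIXES = ("counter", "electro", "inter", "under", "over")
SUFFIXES = ("ically", "ation", "ingly")
VOWELS = set("aeiouy")


def split_long_word(word, max_len=13):
    """Try known prefixes, then known suffixes, then a vowel/consonant pattern."""
    if len(word) <= max_len:
        return [word]
    lower = word.lower()
    for prefix in PREFIXES:
        if lower.startswith(prefix) and len(word) > len(prefix):
            return [word[:len(prefix)], word[len(prefix):]]
    for suffix in SUFFIXES:
        if lower.endswith(suffix) and len(word) > len(suffix):
            return [word[:-len(suffix)], word[-len(suffix):]]
    # Fallback: cut between two consonants that sit between vowels (V C | C V),
    # choosing the candidate closest to the middle of the word.
    mid = len(word) // 2
    candidates = [
        i for i in range(2, len(word) - 1)
        if lower[i - 2] in VOWELS and lower[i - 1] not in VOWELS
        and lower[i] not in VOWELS and lower[i + 1] in VOWELS
    ]
    if candidates:
        cut = min(candidates, key=lambda i: abs(i - mid))
        return [word[:cut], word[cut:]]
    return [word]  # no rule applied; leave the word whole


if __name__ == "__main__":
    for w in ("counterrevolutionary", "electroencephalogram", "uncharacteristically"):
        print("-".join(split_long_word(w)))
```

Ordering the rules from most to least specific means a recognizable affix wins over a purely mechanical cut. The sketch splits only once, so a remaining fragment can still exceed the length limit.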

thiswillbeyourgithub commented 4 years ago

I also speak French. If I can be of use, don't hesitate to ask.