cameron / squirt

Speed read the web.
http://www.squirt.io
Apache License 2.0
1.22k stars, 205 forks

i.e., a.m., p.m., e.g., et cetera #43

Open ohubaut opened 10 years ago

ohubaut commented 10 years ago

Initialisms separated by "." are treated as separate words, which makes them harder to understand (e.g.: e.g. becomes e. g. ;-) ). Another case is initialisms that are not separated by dots but written in all uppercase (e.g.: BTW). Those are also harder to understand, as they represent multiple words and you have to decode them first.

Proposal: identify both patterns, display them as a single word, but increase the delay before the next word shows up.
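A minimal sketch of this proposal, written in Python for illustration rather than taken from Squirt's code. It assumes words are split on whitespace only, so "e.g." stays in one piece, and that an initialism gets a longer display time through a multiplier. The names `schedule`, `is_initialism`, `base_ms` and `initialism_factor` are hypothetical, and the value 1.5 is an arbitrary placeholder.

```python
import re

# Dotted initialisms such as "e.g.", "a.m.", "U.S.A." (the final dot is optional).
DOTTED = re.compile(r"^(?:[A-Za-z]\.)+[A-Za-z]\.?$")
# Undotted initialisms written in capitals, such as "BTW" or "YMMV".
ALL_CAPS = re.compile(r"^[A-Z]{2,}$")
# Punctuation that may surround a word but is not part of the initialism itself.
EDGE_PUNCT = "()[]\"',;:!?"


def is_initialism(token):
    """Return True if the token matches either initialism pattern."""
    core = token.strip(EDGE_PUNCT)
    return bool(DOTTED.match(core) or ALL_CAPS.match(core))


def schedule(text, base_ms=200, initialism_factor=1.5):
    """Return (token, delay_ms) pairs, one per whitespace-separated token.

    Initialisms are kept as a single token and shown for longer.
    """
    pairs = []
    for token in text.split():
        factor = initialism_factor if is_initialism(token) else 1.0
        pairs.append((token, base_ms * factor))
    return pairs


for token, delay in schedule("See e.g. the USA, BTW."):
    print(token, delay)
```

Note that the all-caps pattern would also slow down ordinary words typed in capitals, so a real implementation would likely need a length limit or a word list.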

j6k4m8 commented 10 years ago

Also consider a.m. / p.m., i.e., U.S.A., etc.

malcolmocean commented 10 years ago

Honestly I think BTW is more like a compound word, and doesn't need a delay. Nor do USA, pm, C.E.O or M.I.T. When I say "more like a compound word." I mean that while it etymologically is composed of several words, anyone who knows "BTW" is probably able to parse its meaning without actually deconstructing it, just like you don't mentally split birthday into birth + day.

j6k4m8 commented 10 years ago

Indeed — my point was that p.m. (and the like) should not get a p[pause]m[pause] behavior, as it does currently.

ohubaut commented 10 years ago

@malcolmmcc For native English speakers, that might be the case. But for others, like me, who read articles in English on a regular basis, it still isn't obvious how to deconstruct things such as YMMV, AFAIK, OTOH, etc.

malcolmocean commented 10 years ago

Fair point @gizmogwai. Maybe have that as a setting? Even better, a setting that lets those be turned into their component parts, so it would read AFAIK and output "As Far As I Know."

j6k4m8 commented 10 years ago

+1 to the expansion-option. Though we'd need to be careful that abbreviations for one thing don't get expanded to another (mediocre example: 'WTF' is an obscure family of genes). I suppose that'd be the use-case for the option in Settings.
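A rough sketch of what such an opt-in expansion could look like, in Python for illustration only. Everything here is hypothetical: the `expand` function, the `enabled` and `skip` settings, and the small lookup table. An abbreviation that is missing from the table, or listed in `skip`, is left untouched, which is one way to avoid expanding 'WTF' in a genetics article.

```python
# Hypothetical lookup table; a real one would be user-editable.
DEFAULT_EXPANSIONS = {
    "AFAIK": "As Far As I Know",
    "YMMV": "Your Mileage May Vary",
    "OTOH": "On The Other Hand",
    "BTW": "By The Way",
}


def expand(tokens, enabled=False, expansions=DEFAULT_EXPANSIONS, skip=frozenset()):
    """Replace known initialisms with their component words.

    Does nothing unless the reader has switched the setting on.
    Tokens in `skip` are never expanded, even if the table knows them.
    """
    if not enabled:
        return list(tokens)
    out = []
    for token in tokens:
        phrase = expansions.get(token)
        if phrase is None or token in skip:
            out.append(token)  # unknown or excluded: show as written
        else:
            out.extend(phrase.split())  # one displayed word per component
    return out


print(expand(["AFAIK", "it", "works"], enabled=True))
```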

malcolmocean commented 10 years ago

The other issue with this is that sometimes the abbreviations take on meanings of their own, which is related to what I was trying to get at earlier. A great example is lol. It now has a meaning that is extremely distinct from laugh out loud. Other initialisms/acronyms have this too, just less of it.

gadenbuie commented 10 years ago

Also related: numbers and money. $1.4 million currently shows up as $1. [pause] 4 [pause] million. I agree with @gizmogwai's proposal. I think a slightly increased delay (comma length) after $1.4 would work well, and similarly for e.g..
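One way to picture the behaviour for numbers, as a Python sketch that is not based on Squirt's actual code. It assumes whitespace-separated tokens and classifies the pause that follows each one. The function name `pause_kind` and the labels "sentence", "comma" and "none" are made up for the example. A period inside a decimal number never ends a sentence, and the number as a whole gets a comma-length pause.

```python
import re

# A decimal number with an optional leading "$" and optional trailing "%",
# for example "$1.4", "9.3" or "12,000.50".
DECIMAL = re.compile(r"^\$?\d[\d,]*\.\d+%?$")


def pause_kind(token):
    """Classify the pause that should follow a token."""
    if token.endswith((".", "!", "?")):
        return "sentence"  # punctuation at the very end of the token
    if token.endswith((",", ";", ":")):
        return "comma"
    if DECIMAL.match(token):
        return "comma"  # kept whole, with a slightly longer delay
    return "none"


for token in "$1.4 million in 9.3 minutes.".split():
    print(token, pause_kind(token))
```

This sketch still treats "e.g." as a sentence end, so it would need to be combined with an initialism check.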

adamchainz commented 10 years ago

Just came on here because I got the same thing with 9.[pause]3[pause] minutes. +1 for fixing initialisms.

porcoesphino commented 10 years ago

@gizmogwai

"it still isn't obvious to deconstruct things such as YMMV"

Splitting the letters up one by one doesn't help you understand the meaning. A dictionary does, just as with a word you haven't seen before. In my opinion, your issue is more similar to that of any user who hits a word they don't know, or odd grammar, and needs time to work out what it means.

xdumaine commented 10 years ago

Agree with @porcoesphino. You can't understand acronyms or initialisms any better when presented letter by letter. If anything, it's worse, because it requires additional cognitive processing to recall the entire string and assign it context. When placed together, they eventually become as recognizable as regular words, once the reader knows them.

kristoffernolgren commented 10 years ago

I think there should be a general solution to this: for something to be considered a new sentence, it should contain more than four or five characters; if it doesn't, it should be considered a single word. This would take care of a lot of abbreviations and other things, in all languages.
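One possible reading of this heuristic, sketched in Python and applied per whitespace-separated token. This is an interpretation, not Squirt's implementation. A period inside a token only splits it when every fragment around the periods has at least `min_chars` characters. Otherwise the token is shown as a single word. The name `split_token` and the default of 5 ("more than four") are assumptions.

```python
def split_token(token, min_chars=5):
    """Split a token at internal periods only when all fragments are long.

    Short fragments ("e", "g", "9", "3") suggest an abbreviation or a number,
    so the token is returned unchanged as a single word.
    """
    parts = [p for p in token.split(".") if p]
    if len(parts) > 1 and all(len(p) >= min_chars for p in parts):
        # Long text on both sides: probably a missing space after a full stop.
        pieces = [p + "." for p in parts[:-1]]
        last = parts[-1] + ("." if token.endswith(".") else "")
        return pieces + [last]
    return [token]


for token in ["e.g.", "9.3", "U.S.A.", "finished.Another"]:
    print(token, split_token(token))
```

Because the rule looks only at fragment length, it needs no word list and does not depend on the language of the text. The trade-off is that it says nothing about undotted initialisms such as BTW.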

patcon commented 8 years ago

FYI: possible partial resolution in the thread auto-referenced above.