kemitchell / unicode-ascii-equivalents.json

ASCII replacements for Unicode characters
https://npmjs.com/packages/unicode-ascii-equivalents

Add 'word break' property? #1

Open cscott opened 9 years ago

cscott commented 9 years ago

So, if you were to convert the string "2½" to ASCII using this mapping, you'd get "21/2", which reads as twenty-one halves rather than two and a half. But you wouldn't want to add extraneous spaces to "Give me ½ a second".

It seems like you'd need to add a boolean property, something like "word break", which would indicate that the given expansion should be surrounded by non-word characters to be sensible.
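To make the proposal concrete, here is a minimal sketch of how a replacement routine might honor such a flag. Everything in it is an assumption for illustration: the `wordBreak` property name, the shape of the `EQUIVALENTS` table (the real data file may be structured differently), and the choice to treat only ASCII letters and digits as word characters.

```javascript
// Hypothetical table: the entry shape and the `wordBreak` name are assumptions,
// not the layout of the actual data file.
const EQUIVALENTS = new Map([
  ["½", { ascii: "1/2", wordBreak: true }],
  ["…", { ascii: "...", wordBreak: false }],
]);

// For this sketch, a "word character" is an ASCII letter or digit.
function isWordChar(ch) {
  return ch !== undefined && /[A-Za-z0-9]/.test(ch);
}

// Replace each mapped character; pad with a space only where a flagged
// expansion would otherwise touch a word character.
function toAscii(input) {
  const chars = Array.from(input); // iterate by code point
  let out = "";
  chars.forEach((ch, i) => {
    const entry = EQUIVALENTS.get(ch);
    if (!entry) {
      out += ch;
      return;
    }
    if (entry.wordBreak && isWordChar(out[out.length - 1])) out += " ";
    out += entry.ascii;
    if (entry.wordBreak && isWordChar(chars[i + 1])) out += " ";
  });
  return out;
}

console.log(toAscii("2½"));                 // 2 1/2
console.log(toAscii("Give me ½ a second")); // Give me 1/2 a second
```

With this approach the space is added only when needed, so "Give me ½ a second" gains no extra spaces.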

kemitchell commented 9 years ago

Oh no, he's going through my back catalog! :smile:

I thought about adding parentheses around the fractions. That would yield 2(1/2) in your example.

cscott commented 9 years ago

That would be unambiguous at least. But it does read a bit strange. I think spaces would be less objectionable:

I ate (1/2) an apple.

versus

I ate  1/2  an apple.

Note the "extra" spaces in the latter.

kemitchell commented 9 years ago

Since this is a static data file, I think unambiguous is good enough. If downstream authors of replacement routines want to implement flags, they can certainly do so. I'd imagine downstream users of those replacement routines could also pre- or post-process to achieve any customization they want, especially since it's strings through and through.
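As a rough illustration of that kind of downstream post-processing, the sketch below assumes the data file maps "½" to the parenthesized "(1/2)" and that a caller would rather have spaces. The function name `relaxFractions` and the regular expression are hypothetical, not part of this package.

```javascript
// Matches a parenthesized fraction such as "(1/2)", capturing an optional
// word character immediately before and after it.
const PAREN_FRACTION = /(\w?)\((\d+\/\d+)\)(\w?)/g;

// Drop the parentheses, inserting a space only on a side where the
// fraction would otherwise touch a word character.
function relaxFractions(ascii) {
  return ascii.replace(PAREN_FRACTION, (match, before, frac, after) =>
    before + (before ? " " : "") + frac + (after ? " " : "") + after
  );
}

console.log(relaxFractions("2(1/2)"));               // 2 1/2
console.log(relaxFractions("I ate (1/2) an apple.")); // I ate 1/2 an apple.
```

One caveat with this sketch: it cannot tell a fraction produced by the mapping from a "(1/2)" that was already in the input, so a caller who cares about that distinction would need to handle it before replacement instead.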