bitwalker / stringex

A string extensions library for node.js
https://www.npmjs.com/package/stringex

U+0080 used as Euro symbol #6

Closed by bjwyse 10 years ago

bjwyse commented 10 years ago

I know this is a direct port of the Ruby library (and an awesome port it is, too), but I have found content in the wild where U+0080 is used for €.

See http://www.westmeathexaminer.ie/news/roundup/articles/2014/07/23/4031635-tesco-have-1m-for-good-causes-and-want-westmeath-to-be-involved/ for example. If I download that page and decode it as ISO-8859-1, I find that the Euro symbol in the h1 is U+0080. (Weirdly, other Euro symbols in the same content use U+20AC...)
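For anyone who wants to reproduce the observation, here is a minimal sketch of how a lone byte 0x80 turns into U+0080 under a strict ISO-8859-1 decode. It assumes the euro in the h1 arrives as the single byte 0x80, which I have not verified against the raw response, and it uses Node's built-in `latin1` Buffer encoding rather than anything from stringex.

```javascript
// Assumption: the euro sign in the page's h1 is transmitted as the single byte 0x80.
// Node's 'latin1' Buffer encoding maps each byte to the code point of the same value,
// so 0x80 becomes U+0080 (a C1 control character), not U+20AC.
const decoded = Buffer.from([0x80]).toString('latin1');

// Format the resulting code point in the usual U+XXXX notation.
const codePoint = decoded.codePointAt(0);
const label = 'U+' + codePoint.toString(16).toUpperCase().padStart(4, '0');

console.log(label); // U+0080
```

If that assumption holds, the U+0080 is a side effect of the decoding step rather than something the page author typed deliberately.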

IMO stringex should do this conversion too. I have forked and can submit a PR if anyone agrees?

Anyways, great job with this. Was exactly what I needed.

bitwalker commented 10 years ago

@bjwyse U+0080 is not the correct character; it's actually a control character, as you can see here. The correct character code for the euro symbol is U+20AC, as you saw (confirmed here).

I'd rather not make this change unless we know for sure it's correct, and based on what I'm seeing, it's not. I'll leave this open in case you want to loop anyone else in, or if anyone watching the project wants to chime in.

Thank you very much for the kind words, though. If there is anything I can help with, let me know!

bjwyse commented 10 years ago

Yep, I saw that on fileformat.info and this caught my eye:

[image: capture]

You are right that it is meant to be used as a control character, and I am not sure why it is used as the Euro symbol. I guess I needed stringex to provide the Euro symbol for U+0080 and thought that others might too. You can close the issue if you like, as we now have some discussion on the topic for future users of stringex who hit the same problem and might want to apply the same fix.

bitwalker commented 10 years ago

I actually tracked down the reference for this here. You'll notice in that first table under Number Sign, this bit:

0x80 | U+20AC | EURO SIGN (€)

So it appears this is something that stringex should indeed be handling. If you want to open a PR with your changes, I'll merge them if everything looks good. Thanks!
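For future readers, the mapping in that table row (0x80 to U+20AC) could be applied as a pre-processing step along these lines. This is only a sketch: `normalizeEuro` is a hypothetical helper name, not part of stringex's actual API, and the sample headline is made up for illustration.

```javascript
// Hypothetical helper, not part of stringex: treat the C1 control character
// U+0080 as a mis-decoded euro sign and replace it with U+20AC.
const C1_EURO = '\u0080';
const EURO_SIGN = '\u20AC';

function normalizeEuro(input) {
  // split/join replaces every occurrence without needing a regular expression.
  return input.split(C1_EURO).join(EURO_SIGN);
}

// Made-up sample text containing the stray control character.
const headline = 'Tesco have \u00801m for good causes';
console.log(normalizeEuro(headline)); // Tesco have €1m for good causes
```

Running this before any other conversion means strings that already use U+20AC pass through unchanged, so the two spellings of the euro end up being handled the same way.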