catamphetamine / libphonenumber-js

A simpler (and smaller) rewrite of Google Android's libphonenumber library in javascript
https://catamphetamine.gitlab.io/libphonenumber-js/
MIT License

Formatting phone numbers in different numbering systems #221

Closed laszbalo closed 6 years ago

laszbalo commented 6 years ago

Hi,

my issue is slightly related to this, but I am interested in formatting phone numbers in different formatting systems rather than parsing them.

I am playing with the Intl package, specifically with the Intl.NumberFormat. Basically it formats numbers according to different locales (e.g. en, en-GB, ar). When I create an instance of it, it can tell the numbering system which was resolved for the passed in locale.

E.g. for en, en-GB, ... -> 'latn' but for ar, ar-AE, ... -> 'arab'

Please note that 'latn' and 'arab' are used here as defined in the Intl spec.
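For illustration, a minimal sketch of how the resolved numbering system can be read from an `Intl.NumberFormat` instance. The helper name `resolveNumberingSystem` is made up for this example, and the value returned for a given locale depends on the locale data shipped with the runtime, so only the 'en' case is annotated.

```javascript
// Hypothetical helper: asks Intl.NumberFormat which numbering system
// it resolves for a locale. The answer comes from the runtime's locale
// data, so it may differ between engines and versions.
function resolveNumberingSystem(locale) {
  return new Intl.NumberFormat(locale).resolvedOptions().numberingSystem;
}

console.log(resolveNumberingSystem('en')); // 'latn'
console.log(resolveNumberingSystem('ar')); // depends on the runtime's locale data
```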

Consider the following use case:

There are two pages. Page one contains a Russian phone number in both Russian and international style; these are, respectively, 8 (965) 531-23-45 and +7 965 531-23-45.

Page two contains a UAE phone number in both UAE and international style: 050 327 1234 and +971 50 327 1234.

For a visitor with the ar, ar-AE, etc. locale, I'd like to show 'arab' numerals everywhere, including in the phone numbers.

Can your library be used in this use case?

P.S. I used 'arab' as an example alternative to 'latn' because it is the second most used numeral system after 'latn', but according to the Unicode CLDR data, there are two locales which resolve to 'beng' (Bengali numerals), three which resolve to 'arabext', another three which resolve to 'deva' (used by Nepalese people), and one which resolves to 'mymr' (used in Myanmar).

catamphetamine commented 6 years ago

Thanks for the elaborate feature request. I too was wondering whether libphonenumber-js supports formatting phone numbers in Arabic scripts. That would certainly be more user-friendly for those people. It seems that Google's libphonenumber currently only supports parsing Arabic scripts (which I copy-pasted into this library) but doesn't support formatting in Arabic scripts.

But I think it's quite trivial to implement. E.g. we receive the output 050 327 1234 and then we simply .replace() all Latin script digits with Arabic script ones. I guess that would work. Perhaps it could even be part of this library, as a separate function, e.g. localizeDigits(number: string, script: string): string, which would simply replace the digits. There seem to be a lot of glyphs, about 20 of them: https://en.wikipedia.org/wiki/Hindu%E2%80%93Arabic_numeral_system#Glyph_comparison

Anyway, if you make it into a separate function localizeDigits.js then I'll accept such a pull request, I guess. If you think this should be implemented differently then share your thoughts.

Here's an example of how libphonenumber-js handles Eastern-Arabic digits: https://github.com/catamphetamine/libphonenumber-js/blob/ac7b73de39c68a0bd8342525f67f54b5f95ed72e/source/common.js#L38-L80
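A rough sketch of what the proposed localizeDigits function might look like. It is not part of libphonenumber-js. The second argument is assumed to be an Intl numbering system code such as 'arab', and the table covers only the systems mentioned in this thread. It relies on each of these systems storing its ten digits at consecutive Unicode code points, so only the code point of zero is needed.

```javascript
// Hypothetical sketch, not the library's API.
// Code point of the digit zero in each numbering system.
const ZERO_CODE_POINTS = {
  latn: 0x0030,    // 0-9
  arab: 0x0660,    // Arabic-Indic digits
  arabext: 0x06f0, // Extended Arabic-Indic digits
  beng: 0x09e6,    // Bengali digits
  deva: 0x0966,    // Devanagari digits
  mymr: 0x1040,    // Myanmar digits
};

function localizeDigits(number, numberingSystem) {
  const zero = ZERO_CODE_POINTS[numberingSystem];
  if (zero === undefined) {
    throw new Error(`Unsupported numbering system: ${numberingSystem}`);
  }
  // Replace every ASCII digit; spaces, dashes, parentheses and '+'
  // are left untouched.
  return number.replace(/[0-9]/g, (digit) =>
    String.fromCharCode(zero + Number(digit))
  );
}

console.log(localizeDigits('050 327 1234', 'arab'));
```

Whether the untouched formatting characters still read correctly in the target script is a separate question that this sketch does not address.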

laszbalo commented 6 years ago

Off the top of my head, these are the characters used to style 'latn'-digit phone numbers: space (breaking, \u0020), dash, '(' and ')'. The '+' is a control character, not a formatting one. I am not sure that simply replacing the digits and leaving the grouping/formatting characters in place will work with every numbering system. E.g. let's just assume the following phone number is valid and formatted as follows: 123 456

If I just replace the 'latn' digits with 'arab' ones and leave the formatting intact (breaking space, \u0020), this is what I get: ١٢٣ ٤٥٦ // the '123' is on the right, the '456' is on the left

Alternatively, substituting the breaking space with a non-breaking one (nbsp in HTML): ١٢٣ ٤٥٦ // now the '123' is on the left, the '456' is on the right

Which format makes sense for an Arabic speaker? Or is there a different character set to format/group the 'arab' numerals? Does a phone number in 'arab' numerals intertwined with '(', ')', and '-' still make sense? If not, the libphonenumber regexps do not fit that numbering system, imo.

Also, is there any value in showing phone numbers in 'arab' digits in the first place? (I checked the website of UAE's Etisalat and one Saudi Arabian phone company, and in the Arabic locale the phone numbers were written in 'latn' digits.)

@FakhruddinAbdi might chime in and help us out with the 'arab' numerals.

I think a sensible first step would be to ignore the formatting characters entirely and just replace the digits for each numbering system. E.g. the Russian numbers 8 (965) 531-23-45 and +7 965 531-23-45

will first be stripped of the country-specific formatting characters, and then the digits will get replaced: 89655312345 +79655312345 // the plus character will probably remain, but it should be checked whether there is a more suitable alternative for the given numbering system/language

So the Russian numbers in 'arab' numerals would be: ٨٩٦٥٥٣١٢٣٤٥ // local format +٧٩٦٥٥٣١٢٣٤٥ // international format
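A minimal sketch of this first step, assuming only the 'arab' numbering system and keeping a leading '+' as is. The function name `stripAndLocalize` is hypothetical and not part of libphonenumber-js.

```javascript
// Hypothetical sketch of the "strip formatting, then replace digits" step.
const ARABIC_INDIC_ZERO = 0x0660; // code point of the 'arab' digit zero

function stripAndLocalize(formattedNumber) {
  const trimmed = formattedNumber.trim();
  // Keep the leading '+' of international numbers (an assumption:
  // no localized alternative for '+' is attempted here).
  const plus = trimmed.startsWith('+') ? '+' : '';
  // Drop spaces, dashes, parentheses and anything else that is not a digit.
  const digits = trimmed.replace(/[^0-9]/g, '');
  // Map each ASCII digit to the corresponding Arabic-Indic digit.
  const localized = digits.replace(/[0-9]/g, (digit) =>
    String.fromCharCode(ARABIC_INDIC_ZERO + Number(digit))
  );
  return plus + localized;
}

console.log(stripAndLocalize('8 (965) 531-23-45'));
console.log(stripAndLocalize('+7 965 531-23-45'));
```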

Later on, if someone comes up with a country-specific style for formatting a number in a given numbering system, we can just incorporate that into the code. On top of this, we could generate a long list for each 'country' x 'numbering system' combo, based on the existing country-level formatting rules, and start collecting feedback from people who use those numeral systems.

e.g.: Russian phone numbers 8 (965) 531-23-45 +7 965 531-23-45

with 'arab' numerals: ٨ (٩٦٥) ٥٣١-٢٣-٤٥ +٧ ٩٦٥ ٥٣١-٢٣-٤٥

Dear Arabic speaker, does the above look right to you? If not, please let us know. ... and so on and so forth for each country and numbering system combo.

I will create the PR for the first step soon. Hopefully people who know the other systems will find this thread and share their thoughts on this issue.

catamphetamine commented 6 years ago

@laszbalo Thanks for the detailed comment.

Which format makes sense for an Arabic speaker?

Yeah, I have no idea, man. Ideally some skilled developer from an Arabic country should come here and tell us what is the right format for an equivalent European phone number and how Arabic people think phone numbers should be formatted for their script. Otherwise it's just a guessing game.

FakhruddinAbdi might chime in and help us out with the 'arab' numerals.

Good idea.

ahmedHusseinF commented 6 years ago

I am a native Arabic speaker and kind of a JS developer 😄. Numbers in Arabic are written from left to right, although the language itself is written from right to left, so

77542 === ٧٧٥٤٢

so replacing the space with a non-breaking space is the right way.

if there is any other question, just feel free to ask.
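A small sketch of the non-breaking space suggestion, with a made-up helper name. One plausible explanation for the difference seen earlier in the thread is the Unicode bidirectional algorithm: an ordinary space is a neutral character, while U+00A0 belongs to the "common separator" class, and a single common separator between two numbers of the same type is treated as part of the number. Actual rendering still depends on the surrounding text and the renderer, so this should be checked visually.

```javascript
// Hypothetical helper: swaps ordinary spaces (U+0020) for non-breaking
// spaces (U+00A0) so that digit groups are kept together as one run.
const NBSP = '\u00A0';

function useNonBreakingSpaces(text) {
  return text.replace(/ /g, NBSP);
}

// '\u0661\u0662\u0663 \u0664\u0665\u0666' is "123 456" in 'arab' digits.
const grouped = useNonBreakingSpaces('\u0661\u0662\u0663 \u0664\u0665\u0666');
console.log(grouped.includes(NBSP)); // true
```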

catamphetamine commented 6 years ago

@ahmedHusseinF Thx. I'll close this issue for now because it's technically not a manifestation of a bug but rather a place of discussion. In case of any internationalization bugs this issue could be re-opened later.