intuit / fuzzy-matcher

A Java library to determine probability of objects being similar.
Apache License 2.0

Phone number assumed to be a US number #56

Closed mayurmadnani closed 1 year ago

mayurmadnani commented 2 years ago

Hi @manishobhatia, in the pre-processing steps for a phone number, the country code '1' gets added to the number, which makes it specific to the US. I would like to work on the fix for this. Here's what I have in mind: as part of normalization, we would not add a country code but rather remove it if present, so that the phone number match happens without the country code.
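To make the reported problem concrete, here is a minimal sketch of a US-assuming normalization step. The class and method names are hypothetical, and the rule "prepend 1 to any 10-digit number" is an assumption based on the description in this issue, not a copy of the library's code.

```java
public class UsAssumingNormalizer {

    // Hypothetical stand-in for the behaviour described in this issue:
    // strip everything that is not a digit, then assume a bare 10-digit
    // number is a US number and prepend the country code "1".
    public static String normalizeAssumingUs(String raw) {
        String digits = raw.replaceAll("\\D", "");
        return digits.length() == 10 ? "1" + digits : digits;
    }

    public static void main(String[] args) {
        // A US number written without its country code.
        System.out.println(normalizeAssumingUs("(415) 555-0132"));   // 14155550132
        // The same Indian number, with and without its country code.
        System.out.println(normalizeAssumingUs("+91 98765 43210"));  // 919876543210
        System.out.println(normalizeAssumingUs("98765 43210"));      // 19876543210
    }
}
```

Under this assumption the two spellings of the Indian number normalize to different strings, so they would not be treated as the same phone number.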

manishobhatia commented 2 years ago

Hi @mayurmadnani

Thanks for reporting this issue. The library can be made agnostic of country and this is a good issue to fix that.

But no matter whether we add or remove the country code, we will have to do it for all possible country codes.

Here are some pointers to think about:

hope this helps.

AdityaSoni19031997 commented 1 year ago

If we can think of having a generic solution without maintaining a repository of all country codes, that will be good.

I doubt one can do this without having the country-code info pre-cached. We could accept it from the user, but the user would probably just google it anyway. Prefixes won't be consistent either: they are not limited to the 4 variations shared above, there can be many more, and not all of them can be covered. So one cannot slice off a fixed-length string to split the number into country code and contact info.
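A minimal sketch of why a lookup table is needed: calling codes vary in length (1 to 3 digits), so the split point has to come from a set of known codes rather than a fixed offset. Everything here is hypothetical: the class name, the method name, and the four-entry `KNOWN_CODES` set, which is an illustrative subset and not a complete list. The sketch also assumes that only inputs starting with `+` carry a country code.

```java
import java.util.Set;

public class CallingCodeSplitter {

    // Illustrative subset only; a real table would need every assigned calling code.
    private static final Set<String> KNOWN_CODES = Set.of("1", "44", "91", "971");

    /** Returns the digits without the calling code, or all digits if no code is recognised. */
    public static String stripCallingCode(String raw) {
        String trimmed = raw.trim();
        String digits = trimmed.replaceAll("\\D", "");

        // Assumption: only a leading '+' marks the number as international.
        if (!trimmed.startsWith("+")) {
            return digits;
        }

        // Calling codes are 1 to 3 digits long, so try each candidate length.
        for (int len = 1; len <= 3 && len < digits.length(); len++) {
            if (KNOWN_CODES.contains(digits.substring(0, len))) {
                return digits.substring(len);
            }
        }
        return digits;
    }

    public static void main(String[] args) {
        System.out.println(stripCallingCode("+1 415 555 0132"));   // 4155550132
        System.out.println(stripCallingCode("+91 98765 43210"));   // 9876543210
        System.out.println(stripCallingCode("+971 50 123 4567"));  // 501234567
        System.out.println(stripCallingCode("0091 98765 43210"));  // 00919876543210
    }
}
```

The last call shows the inconsistency raised above: an international dialing prefix such as `00` is not recognised by this sketch, so each additional prefix convention would need its own handling.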

manishobhatia commented 1 year ago

Going to close this for now; will reopen if a new request for this change comes in.