This pull request makes parsing about 10x faster when the country for a phone number is unknown.
Ran tests locally, and they all pass.
Background
Currently, when parsing international phone numbers, this library allocates a large number of regexes (one per country, 256 in total) and matches the number against all 256, even though at most one of them can match.
Country codes have 1, 2, or 3 digits, and have the interesting property that shorter codes are not prefixes of longer codes.
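The global_phone library takes advantage of this to optimize country code detection. By taking the first three digit prefixes of a number, it's possible to do a hash-based lookup instead of cycling through all countries.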
The Fix
This pull request applies the technique described above to optimize detect_and_parse.
As a result, instead of creating 256 regexes and matching against all of them every time a phone number with an unknown country code is parsed, it now performs at most 3 hash lookups.
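The prefix lookup can be illustrated with a minimal sketch. This is not the code in this pull request: the COUNTRY_CODES table is a small hypothetical subset, and detect_country_code is an illustrative name, not a phonelib method. It assumes only the property stated above, that no shorter country code is a prefix of a longer one.

```ruby
# Hypothetical subset of calling codes, for illustration only.
COUNTRY_CODES = {
  "1"   => ["US", "CA"],
  "44"  => ["GB"],
  "49"  => ["DE"],
  "351" => ["PT"],
  "380" => ["UA"]
}.freeze

MAX_CODE_LENGTH = 3

# Returns [country_code, rest_of_number], or nil when no known code matches.
# Performs at most MAX_CODE_LENGTH hash lookups and builds no regexes.
def detect_country_code(number)
  digits = number.gsub(/\D/, "") # keep digits only
  (1..MAX_CODE_LENGTH).each do |len|
    prefix = digits[0, len]
    next if prefix.length < len # number is shorter than this prefix length
    return [prefix, digits[len..-1]] if COUNTRY_CODES.key?(prefix)
  end
  nil
end
```

Because the codes are prefix-free, the first prefix found in the hash is the only one that can match, so the order of the lookups does not affect the result.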
Benchmarks
This optimization yields a 10x speedup when the country code is unknown!
Added a new benchmark in spec/phonelib_ips_bench.rb, which can be run with rspec.
Before
Calculating -------------------------------------
known country 27.029 (± 0.0%) i/s - 136.000 in 5.032140s
unknown country 2.253 (± 0.0%) i/s - 12.000 in 5.325396s
Comparison:
known country: 27.0 i/s
unknown country: 2.3 i/s - 11.99x slower
After
Calculating -------------------------------------
known country 26.913 (± 0.0%) i/s - 136.000 in 5.053798s
unknown country 23.172 (± 0.0%) i/s - 116.000 in 5.006049s
Comparison:
known country: 26.9 i/s
unknown country: 23.2 i/s - 1.16x slower
Now the library performs about as well when the country code needs to be detected as when it is provided.
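As a quick check of the headline figure, the ratios can be recomputed from the iterations-per-second values reported in the benchmark output. The helper name ratio is only illustrative.

```ruby
# Ratio between two iterations-per-second figures.
def ratio(faster_ips, slower_ips)
  faster_ips / slower_ips
end

# Unknown-country case, after vs. before.
puts format("unknown country speedup: %.1fx", ratio(23.172, 2.253))
# Known vs. unknown country, after the change.
puts format("remaining gap: %.2fx", ratio(26.913, 23.172))
```

The first ratio is where the "10x" claim comes from. The second matches the 1.16x reported by the benchmark comparison.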
If we combine this with the work in:
it should make both cases even faster and bring them to comparable performance (only 1.03x slower).
Memory Usage
After this pull request, this use case allocates 5x less memory, which should reduce GC pressure as well.
Before
After