Right now, the cities are stored as one big JSON object. A sample city looks like this (37 bytes):
"New York City":[40.71427,-74.00597],
If we switched to a line-based format, encoded the coordinates as a Plus Code, and used spaces or tabs as field delimiters and newlines as record delimiters, it could look like this (29 bytes with newline):
87G7PX7V+PJ2QX New York City
Eliminating the field separator and the plus would reduce this to 27 bytes with the newline (26 without it).
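The byte counts above can be checked with a few lines of JavaScript. This is only a sketch: the variable names are made up for illustration, and dropping the separator assumes every code has the same length.

```javascript
// Count the UTF-8 bytes of one record in each candidate format.
const byteLength = (text) => new TextEncoder().encode(text).length;

const jsonEntry = '"New York City":[40.71427,-74.00597],';
const plusCodeLine = '87G7PX7V+PJ2QX New York City\n';
// Assumption: every code has the same length, so the "+" and the space can go.
const compactLine = '87G7PX7VPJ2QXNew York City\n';

console.log(byteLength(jsonEntry));    // 37
console.log(byteLength(plusCodeLine)); // 29
console.log(byteLength(compactLine));  // 27 with the newline, 26 without
```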
Using a custom coding scheme with 64 printable characters for the coordinates, followed by the city name, it could get the same resolution. The scheme works like this:
012...789abc...xyzABC...XYZ-_ = values from 0 to 63
Force longitude and latitude to both be non-negative numbers.
Convert the range into [0,1) by dividing the longitude or latitude by 360° (using the same divisor for both keeps the formula simpler).
Multiply by 64. Take Math.floor(result) to get the index into the string, then subtract that from the result so only the fractional part remains.
Repeat the last step until the accuracy is high enough.
The coordinates in the city database are given to 0.00001°. Five digits per coordinate cover that with plenty of accuracy left over (360°/64^5 is about 0.0000003°). Four digits lose a little accuracy, with a step of about 0.0000214° (360°/64^4), or 8 feet in the worst case. With four digits per coordinate, the lines would look like this (22 bytes with newline):
nffeOR-vNew York City
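The encoding steps can be sketched in JavaScript as follows. The post does not say how the coordinates are made positive. This sketch assumes latitude + 90 and longitude wrapped into [0, 360), because those offsets reproduce the sample line above. The function names are illustrative only.

```javascript
// Alphabet from the scheme: 0-9, a-z, A-Z, "-", "_" map to the values 0..63.
const ALPHABET =
  '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_';

// Encode a non-negative angle in [0, 360) as `digits` base-64 characters.
function encodeAngle(degrees, digits) {
  let fraction = degrees / 360; // scale into [0, 1)
  let out = '';
  for (let i = 0; i < digits; i++) {
    fraction *= 64;
    const index = Math.floor(fraction);
    out += ALPHABET[index];
    fraction -= index; // keep only the fractional part
  }
  return out;
}

// Assumed offsets (not stated in the post): latitude + 90,
// longitude wrapped into [0, 360).
function encodeCity(name, latitude, longitude, digits = 4) {
  const lat = latitude + 90;
  const lon = (longitude + 360) % 360;
  return encodeAngle(lat, digits) + encodeAngle(lon, digits) + name;
}

// Size of one encoding step, in degrees, for a given number of digits.
const stepDegrees = (digits) => 360 / 64 ** digits;

console.log(encodeCity('New York City', 40.71427, -74.00597));
// nffeOR-vNew York City
console.log(stepDegrees(4)); // roughly 0.00002146
```

Multiplying by 64 and subtracting the integer part are both exact in binary floating point, so the loop does not accumulate rounding error beyond the initial division.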
With this arrangement, the math is simple and the encoding is fast. Using base96 would provide accuracy slightly beyond 0.00001° with the same number of digits. The code would have to fetch the text file and parse it instead of receiving JSON, but that is fairly easy to handle. With this in place, the file should shrink to 178,711 bytes, which is 56% of the original size and saves 139,401 bytes. Additional bytes could be saved by merging the duplicate entries that exist when a city name contains accented or special characters, such as "Landstraße" and "Landstrasse", into a single line like "abcd1234Landstraße|Landstrasse" (abcd1234 is a fictitious placeholder for the coordinates).
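A parser for the proposed text format might look like the sketch below. It assumes four digits per coordinate, the same offsets as the sample line (latitude + 90, longitude wrapped into [0, 360)), and "|" as the separator between alternate names. The function names and the shape of the result object are illustrative, not taken from the existing code.

```javascript
const ALPHABET =
  '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_';

// Decode base-64 digits back into an angle in [0, 360).
function decodeAngle(text) {
  let value = 0;
  for (const ch of text) {
    value = value * 64 + ALPHABET.indexOf(ch);
  }
  return (value / 64 ** text.length) * 360;
}

// Parse lines of the form "<lat digits><lon digits>Name|Alternate name".
function parseCities(text, digits = 4) {
  const cities = {};
  for (const line of text.split('\n')) {
    if (line.length <= digits * 2) continue; // skip blank or truncated lines
    const latitude = decodeAngle(line.slice(0, digits)) - 90;
    let longitude = decodeAngle(line.slice(digits, digits * 2));
    if (longitude >= 180) longitude -= 360; // undo the wrap into [0, 360)
    // Every alternate name shares the same coordinates.
    for (const name of line.slice(digits * 2).split('|')) {
      cities[name] = [latitude, longitude];
    }
  }
  return cities;
}

const cities = parseCities('nffeOR-vNew York City\n');
console.log(cities['New York City']); // close to [40.71427, -74.00597]
```

In a browser, the input text could come from `fetch(...)` followed by `response.text()`. The decoded values are the lower edge of each encoding step, so they can be up to one step (about 0.0000214°) below the original coordinates.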