elixir-geolix / geolix

IP information lookup provider
Apache License 2.0

lookup geoname_id #17

Closed adrienmo closed 7 years ago

adrienmo commented 7 years ago

Hi @mneudert,

I am using your library to map my users' IP addresses to a location geoname_id, which I then save in a database. I save the geoname_id instead of the country/city name because my platform is multi-language and, for size reasons, I don't want to store duplicated data.

In my web interface, I currently look up the geoname_id in a MySQL database that contains the MaxMind data. It works fine, but it feels somewhat dirty to rely on two different databases for the same information. I think this library could be improved by providing functions that look up the actual location information from a geoname_id. It would also be useful to get the list of subdivisions or cities for a given country or subdivision (I am thinking of populating a drop-down used to filter my user data).

The new functions added to the geolix library would be:

get_countries(opts \\ [language: :en])
get_country(geoname_id, opts \\ [language: :en])
get_subdivisions(geoname_id, opts \\ [language: :en])
get_subdivision(geoname_id, opts \\ [language: :en])
get_cities(geoname_id, opts \\ [language: :en])
get_city(geoname_id, opts \\ [language: :en])

Once converted from the MaxMind format, this data would be stored in an ETS table for fast access.
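
To make the proposal more concrete, here is a minimal sketch of how such an ETS-backed lookup could look. It is not the code from the gist and not part of geolix: the module name `GeonameCache`, the table layout, and the placeholder IDs are all assumptions made for illustration.

```elixir
defmodule GeonameCache do
  # Hypothetical module, not part of geolix or the linked gist.
  @table :geoname_cache_sketch

  # Create a named ETS table keyed by {kind, geoname_id}.
  def init do
    :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    :ok
  end

  # `names` maps language atoms to localized names,
  # e.g. %{en: "Germany", de: "Deutschland"}.
  # `parent_id` is the geoname_id of the enclosing country or subdivision.
  def put(kind, geoname_id, names, parent_id \\ nil) do
    :ets.insert(@table, {{kind, geoname_id}, parent_id, names})
    :ok
  end

  # Return the localized name for one record, or nil if unknown.
  def get(kind, geoname_id, opts \\ [language: :en]) do
    language = Keyword.get(opts, :language, :en)

    case :ets.lookup(@table, {kind, geoname_id}) do
      [{_key, _parent_id, names}] -> Map.get(names, language)
      [] -> nil
    end
  end

  # Return sorted {geoname_id, name} pairs for all records of `kind`
  # that belong to the given parent (e.g. all cities of a country).
  def children(kind, parent_id, opts \\ [language: :en]) do
    language = Keyword.get(opts, :language, :en)

    @table
    |> :ets.match_object({{kind, :_}, parent_id, :_})
    |> Enum.map(fn {{_kind, id}, _parent, names} -> {id, Map.get(names, language)} end)
    |> Enum.sort()
  end
end

# Usage with placeholder IDs (not real geoname_ids):
GeonameCache.init()
GeonameCache.put(:country, 1, %{en: "Germany", de: "Deutschland"})
GeonameCache.put(:city, 10, %{en: "Munich", de: "München"}, 1)
GeonameCache.get(:city, 10, language: :de)
GeonameCache.children(:city, 1)
```

Keying on `{kind, geoname_id}` keeps single lookups a direct `:ets.lookup/2`, while listing children falls back to a table scan via `:ets.match_object/2`; a real implementation might want a separate index for that.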

I wrote a gist to implement this: https://gist.github.com/adrienmo/eaca761f8d1539810bce722521797ef7

At the moment I am generating these ETS tables from the city database. It takes around 30 seconds on my computer to process the GeoLite2 City data because the operation is not parallelized; this could be improved. Once the tables are created, we could also provide a function to store them in DETS, so they do not need to be regenerated at each startup.
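
The DETS idea could be sketched roughly as follows, using the standard `:ets.to_dets/2` and `:ets.from_dets/2` functions. The module name `GeonameStore`, the DETS table name, and the file path are hypothetical; this is only an illustration of the approach, not an existing geolix feature.

```elixir
defmodule GeonameStore do
  # Hypothetical helper; module name and DETS table name are assumptions.
  @dets :geoname_store_sketch

  # Dump an existing ETS table into a DETS file on disk.
  def save(ets_table, path) do
    {:ok, dets} = :dets.open_file(@dets, file: String.to_charlist(path), type: :set)
    # to_dets/2 empties the DETS table, then copies every ETS object into it.
    ^dets = :ets.to_dets(ets_table, dets)
    :dets.close(dets)
  end

  # Fill an existing ETS table from a DETS file written by save/2.
  def load(ets_table, path) do
    {:ok, dets} = :dets.open_file(@dets, file: String.to_charlist(path), type: :set)
    true = :ets.from_dets(ets_table, dets)
    :dets.close(dets)
  end
end

# Usage: persist a freshly built table, then restore it into an empty one.
path = Path.join(System.tmp_dir!(), "geoname_store_sketch.dets")
source = :ets.new(:geoname_source, [:set])
:ets.insert(source, {{:country, 1}, nil, %{en: "Germany"}})
GeonameStore.save(source, path)

target = :ets.new(:geoname_target, [:set])
GeonameStore.load(target, path)
```

At startup the application would then only call `load/2` when the file exists and rebuild from the MaxMind database otherwise.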

I would like to know your thoughts on these improvements. If you think it can be useful I will implement this properly and make a merge request, otherwise I might do a separate library or a fork.

Cheers!

mneudert commented 7 years ago

I like the idea but I feel it is somewhat out of scope here unless I am missing something.

The problem is the adapter configuration you are relying on. With the current MMDB2 adapter being the default one, it works, but that might change or only work in your special use case. An adapter using the MaxMind API or a different database might not provide the lookups you are doing.

What I am thinking of is extracting the MMDB2 logic (mainly the Geolix.Adapter.MMDB2.Decoder module) to a standalone library. That should make it really easy for you to implement your logic (well, you have already done that...) without tying yourself to a specific adapter.

Would that separate decoder library still help you?


Side note: I do not think you can improve your parsing much in terms of parallel processing. The MaxMind DB File Format does not contain any hints on where the next "top level structure" begins. So you can only walk from one structure to the next if you want to reliably parse the data.

adrienmo commented 7 years ago

Thanks for your reply. Indeed, separating the decoder into a library would be helpful to keep the code clean for my small project. Besides MMDB2, which formats or providers do you plan to integrate?

mneudert commented 7 years ago

Right now there are 3 providers on my bucket list:

There are some more providers like IPligence or Neustar IP Intelligence but those are currently not my focus. Having adapters makes it easy for anyone to implement those themselves after all :D

mneudert commented 7 years ago

Took some time but by now I have extracted the core MMDB2 logic to a separate module (available on hex).

If you encounter any problems while adapting your code please open an issue there so we can sort this out.