getGeoDataSync so slow with 3k ip list

PaddeK / node-maxmind-db

This is the pure Node API for reading MaxMind DB files. MaxMind DB is a binary file format that stores data indexed by IP address subnets (IPv4 or IPv6).

GNU Lesser General Public License v2.1

getGeoDataSync so slow with 3k ip list #9

Open nvcken opened 10 years ago

nvcken commented 10 years ago

I tried getGeoDataSync against geoip.lookup from the geoip-lite module with the same list of 3k IPs, and getGeoDataSync runs very slowly. What is wrong?

the-eater commented 10 years ago

I think this is because of how MaxMind provides its data. geoip-lite uses the legacy database, while we use the new MaxMind DB (mmdb) format. The legacy database lets you predict where a record may be, whereas node-maxmind-db has to walk the tree until the record is found.

Am I correct, @oschwald / @paddek?

oschwald commented 10 years ago

With the new format, cache misses are more likely due to how the data is stored, but I think the primary issue is that the new format provides significantly more data. One possible way to alleviate that is to allow looking up a subset of the data. For instance, the C library allows looking up data by a path, and the Go reader will only look up the data specified in the struct passed in.

That said, I think the easiest speed improvement that I am not seeing in the code currently is to cache the IPv4 start node. This will save you 96 node reads when doing a lookup and is simple to implement. There may also be other simple improvements that profiling would detect.
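A minimal sketch of the IPv4 start-node caching idea, assuming a toy in-memory tree rather than a real mmdb file. All names here (`buildToyTree`, `walk`, `findIpv4Start`, `lookupUncached`, `lookupCached`) are made up for illustration and are not part of node-maxmind-db. In an IPv6 tree, every IPv4 lookup first follows the same 96 zero bits, so the node reached after those steps can be computed once and reused:

```javascript
// Toy model of a search tree: nodes[i] = [leftRecord, rightRecord].
// A record < nodes.length points at another node; any other value ends the walk.
function buildToyTree() {
  const nodes = [];
  const nodeCount = 97;
  // 96 nodes chained through their left (0-bit) record, standing in for ::/96.
  for (let i = 0; i < 96; i++) nodes.push([i + 1, nodeCount]);
  // Node 96 plays the role of the IPv4 root; its records are fake data markers.
  nodes.push([nodeCount + 1, nodeCount + 2]);
  return nodes;
}

// Follow the given bits from startNode and count how many nodes were read.
function walk(nodes, startNode, bits) {
  let record = startNode;
  let reads = 0;
  for (const bit of bits) {
    if (record >= nodes.length) break; // reached a non-node record
    record = nodes[record][bit];
    reads++;
  }
  return { record, reads };
}

// Turn a dotted-quad string into an array of 32 bits, most significant first.
function ipv4Bits(ip) {
  const bits = [];
  for (const octet of ip.split('.').map(Number)) {
    for (let shift = 7; shift >= 0; shift--) bits.push((octet >> shift) & 1);
  }
  return bits;
}

// Computed once per database: the node reached after 96 zero bits.
function findIpv4Start(nodes) {
  return walk(nodes, 0, new Array(96).fill(0)).record;
}

// Without caching: every lookup repeats the 96 leading steps.
function lookupUncached(nodes, ip) {
  return walk(nodes, 0, new Array(96).fill(0).concat(ipv4Bits(ip)));
}

// With caching: start directly at the precomputed IPv4 root.
function lookupCached(nodes, ipv4Start, ip) {
  return walk(nodes, ipv4Start, ipv4Bits(ip));
}

const tree = buildToyTree();
const ipv4Start = findIpv4Start(tree);
console.log(lookupUncached(tree, '8.8.8.8'), lookupCached(tree, ipv4Start, '8.8.8.8'));
```

Both lookups end on the same record; the cached variant just skips the shared prefix. A real reader would decode node records from the file buffer instead of using an array of pairs.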

mahnunchik commented 9 years ago

vbauer commented 9 years ago

:+1:

vivekkrbajpai commented 9 years ago

So is there any option or config to get better throughput from the mmdb format?

the-eater commented 9 years ago

Sorry for the late reply :( I'm currently working on a version that does IPv4 caching and allows a path, as suggested by @oschwald. You can see the progress here: https://github.com/EaterOfCode/node-maxmind-db/tree/moreSpeed

You can currently use paths by passing a path argument that points to the value you want. For example, if you only want the English country name for an IP, you can retrieve it with:

mmdbreader.getGeoDataSync('8.8.8.8', ['country', 'names', 'en']);

Any feedback would be more than welcome!
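To make the path argument concrete, here is a rough sketch of how a path such as ['country', 'names', 'en'] selects a single value from a nested record. The `getPath` helper and the sample record are hypothetical and only illustrate the semantics; this is not the code from the moreSpeed branch. A real reader would presumably gain its speed by skipping the decoding of everything outside the path, which this sketch does not attempt:

```javascript
// Resolve a path (array of keys) against a nested record.
// Returns undefined as soon as a key along the path is missing.
function getPath(record, path) {
  let value = record;
  for (const key of path) {
    if (value === null || typeof value !== 'object' || !(key in value)) {
      return undefined;
    }
    value = value[key];
  }
  return value;
}

// Made-up sample record, for illustration only (not real database output).
const sampleRecord = {
  country: { iso_code: 'XX', names: { en: 'Exampleland', de: 'Beispielland' } },
  location: { latitude: 0, longitude: 0 },
};

console.log(getPath(sampleRecord, ['country', 'names', 'en']));
```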

knoxcard commented 5 years ago

https://github.com/runk/node-maxmind