jonathan-kosgei closed this 5 years ago
The writer deduplicates the data that is inserted. Although you don't say so explicitly, it sounds like the data you are inserting is random, which the writer won't be able to deduplicate. If the data is random, it would take 64 MB just to store your attributes (1M values of 64 bytes each).
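To make the arithmetic concrete, here is a rough sketch of the lower bound on storing the attribute strings alone. It assumes one byte per character and decimal megabytes, and the figure of 1,000 distinct values is purely hypothetical, chosen only to show the effect of deduplication. It does not model the search tree or any other part of the file format.

```python
from typing import Optional

MB = 1_000_000  # decimal megabytes (assumption)


def raw_value_bytes(num_records: int, value_len: int,
                    distinct_values: Optional[int] = None) -> int:
    """Lower bound on the bytes needed to hold the attribute strings.

    If distinct_values is given, assume identical values are stored once.
    """
    stored = num_records if distinct_values is None else min(distinct_values, num_records)
    return stored * value_len


# Every record has its own random 64-character value: nothing to deduplicate.
unique = raw_value_bytes(1_000_000, 64)

# Hypothetical case: the same records share only 1,000 distinct values.
shared = raw_value_bytes(1_000_000, 64, distinct_values=1_000)

print(unique / MB)  # 64.0
print(shared / MB)  # 0.064
```

If the real values repeat across networks, the stored size should land much closer to the second figure than the first.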
I'm testing creating internal mmdbs according to the getting started tutorial.
I'm able to successfully create an mmdb with 1M records and read it from Python.
The only problem is that the mmdb file is 75 MB. Each IP range has a very simple data field attached to it, e.g.
The attribute value for every network is a 64-character string. This is test data, but the actual data will average the same length.
The problem is that I need to add 14M more records, and if 1M records take 75 MB, then 15M will possibly be greater than 1 GB.
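As a back-of-the-envelope check of that estimate, the sketch below scales the observed size linearly. Linear growth is only an assumption here (the tree and any deduplication could change the per-record cost), and megabytes are taken as decimal.

```python
MB = 1_000_000  # decimal megabytes (assumption)

observed_bytes = 75 * MB       # file size reported for the test database
observed_records = 1_000_000
target_records = 15_000_000    # 1M existing + 14M more

# Average cost per record in the test file, including the 64-byte value.
bytes_per_record = observed_bytes / observed_records

# Naive linear projection to the full record count.
projected_bytes = bytes_per_record * target_records

print(bytes_per_record)      # 75.0
print(projected_bytes / MB)  # 1125.0
```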
How come the GeoLite and GeoIP City databases have a lot more data but are more compact in size?