maxmind / mmdb-from-go-blogpost

Enriching MMDB files with your own data using Go.
https://blog.maxmind.com/2020/09/01/enriching-mmdb-files-with-your-own-data-using-go/
Apache License 2.0

One node disappeared after inserting data #22

Closed: DVRusak closed this issue 10 months ago

DVRusak commented 10 months ago

After running this code, the new file is about 250KB smaller and has one node fewer (node_count went from 4960213 to 4960212, according to the metadata). Is this normal, or did I do something wrong? The code is taken exactly from the example in your article. It still does its job: the enrichment works.

metadata before updating: maxminddb.reader.Metadata(node_count=4960213, record_size=28, ip_version=6, database_type='GeoLite2-City', languages=['de', 'en', 'es', 'fr', 'ja', 'pt-BR', 'ru', 'zh-CN'], binary_format_major_version=2, binary_format_minor_version=0, build_epoch=1664466441, description={'en': 'GeoLite2 City database'})

metadata after updating: maxminddb.reader.Metadata(node_count=4960212, record_size=28, ip_version=6, database_type='GeoLite2-City', languages=['de', 'en', 'es', 'fr', 'ja', 'pt-BR', 'ru', 'zh-CN'], binary_format_major_version=2, binary_format_minor_version=0, build_epoch=1698236150, description={'en': 'GeoLite2 City database'})

package main

import (
    "log"
    "net"
    "os"

    "github.com/maxmind/mmdbwriter"
    "github.com/maxmind/mmdbwriter/inserter"
    "github.com/maxmind/mmdbwriter/mmdbtype"
)

func main() {
    // Load the database we wish to enrich.
    pathToDB := "GeoLite2-City.mmdb"

    writer, err := mmdbwriter.Load(pathToDB, mmdbwriter.Options{})
    if err != nil {
        log.Fatal(err)
    }

    // Define and insert the new data.
    _, sreNet, err := net.ParseCIDR("56.1.0.0/16")
    if err != nil {
        log.Fatal(err)
    }

    sreData := mmdbtype.Map{
        "AcmeCorp.DeptName": mmdbtype.String("SRE"),
        "AcmeCorp.Environments": mmdbtype.Slice{
            mmdbtype.String("development"),
            mmdbtype.String("staging"),
            mmdbtype.String("production"),
        },
    }

    // Merge the new top-level keys into whatever data already exists for the network.
    if err := writer.InsertFunc(sreNet, inserter.TopLevelMergeWith(sreData)); err != nil {
        log.Fatal(err)
    }

    // Write the newly enriched DB to the filesystem.
    fh, err := os.Create("GeoLite2-City-test.mmdb")
    if err != nil {
        log.Fatal(err)
    }
    if _, err := writer.WriteTo(fh); err != nil {
        log.Fatal(err)
    }
    // Close explicitly so that a failed flush is reported.
    if err := fh.Close(); err != nil {
        log.Fatal(err)
    }
}

oschwald commented 10 months ago

When I run this on the latest GeoLite2 City, I get the same node count before and after. If you are using an older database, it is likely due to node merging changes. I suspect you would see the same difference if you read in the database and immediately wrote it out, without inserting any additional data.

DVRusak commented 10 months ago

Maybe this is a stupid question, but does merging nodes affect anything? If so, what is it for? After simply rewriting the mmdb without inserting anything, one node did indeed disappear in the same way, as you said.

oschwald commented 10 months ago

The merging or pruning of nodes is primarily a performance and space optimization. If a node contains two records that point to the same thing, then the node itself can be replaced with a record pointing to that thing. In terms of output, everything is the same except the depth or prefix length of the record.
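The merging rule described above can be illustrated with a toy binary tree over 8-bit addresses. This is only a sketch of the idea, not the mmdbwriter implementation: the `node` type and the `insert`, `merge`, `lookup`, and `countNodes` functions are hypothetical names made up for this example.

```go
package main

import "fmt"

// node is either a leaf carrying a value ("" means no data) or an
// internal node with exactly two children, one per address bit.
type node struct {
	child [2]*node
	value string
}

func (n *node) isLeaf() bool { return n.child[0] == nil }

// insert stores value for the network addr/prefixLen, splitting leaves
// on the way down so that both halves inherit the old value.
func insert(root *node, addr uint8, prefixLen int, value string) {
	n := root
	for depth := 0; depth < prefixLen; depth++ {
		if n.isLeaf() {
			n.child[0] = &node{value: n.value}
			n.child[1] = &node{value: n.value}
			n.value = ""
		}
		n = n.child[(addr>>(7-depth))&1]
	}
	n.child = [2]*node{}
	n.value = value
}

// merge replaces any node whose two records point to the same value
// with a single record pointing to that value, working bottom-up.
func merge(n *node) {
	if n.isLeaf() {
		return
	}
	merge(n.child[0])
	merge(n.child[1])
	left, right := n.child[0], n.child[1]
	if left.isLeaf() && right.isLeaf() && left.value == right.value {
		n.value = left.value
		n.child = [2]*node{}
	}
}

// lookup returns the value for addr and the prefix length it was found at.
func lookup(root *node, addr uint8) (string, int) {
	n, depth := root, 0
	for !n.isLeaf() {
		n = n.child[(addr>>(7-depth))&1]
		depth++
	}
	return n.value, depth
}

// countNodes counts internal nodes; leaves are records, not nodes.
func countNodes(n *node) int {
	if n.isLeaf() {
		return 0
	}
	return 1 + countNodes(n.child[0]) + countNodes(n.child[1])
}

func main() {
	root := &node{}
	// Two adjacent /2 networks carrying the same data.
	insert(root, 0b1000_0000, 2, "A")
	insert(root, 0b1100_0000, 2, "A")

	v, p := lookup(root, 0b1010_0000)
	fmt.Printf("before merge: nodes=%d value=%q prefix=/%d\n", countNodes(root), v, p)
	// before merge: nodes=2 value="A" prefix=/2

	merge(root)

	v, p = lookup(root, 0b1010_0000)
	fmt.Printf("after merge:  nodes=%d value=%q prefix=/%d\n", countNodes(root), v, p)
	// after merge:  nodes=1 value="A" prefix=/1
}
```

The lookup returns the same value before and after; only the node count and the reported prefix length change.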

Both older and newer versions of the writer do this. I believe the one-node difference is due to a special case: two adjacent reserved networks were not being merged because they have a special record type within the tree before writing.
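For a sense of scale, the MMDB format stores two records per node, so the search tree occupies node_count × record_size × 2 / 8 bytes. The sketch below applies that formula to the metadata values reported in this issue; the function name `searchTreeBytes` is hypothetical. With 28-bit records a single node takes 7 bytes, so the missing node on its own accounts for only 7 bytes of the roughly 250KB size difference. The thread does not say where the rest comes from.

```go
package main

import "fmt"

// searchTreeBytes returns the size of an MMDB search tree in bytes.
// Each node holds two records of recordSize bits each.
func searchTreeBytes(nodeCount, recordSize uint64) uint64 {
	return nodeCount * recordSize * 2 / 8
}

func main() {
	before := searchTreeBytes(4960213, 28) // node_count before rewriting
	after := searchTreeBytes(4960212, 28)  // node_count after rewriting

	fmt.Println("tree size before:", before) // tree size before: 34721491
	fmt.Println("tree size after: ", after)  // tree size after:  34721484
	fmt.Println("difference:      ", before-after)
}
```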