oschwald / maxminddb-golang

MaxMind DB Reader for Go
ISC License
615 stars 101 forks source link

Data inconsistency on decoding #115

Closed jtolio closed 1 year ago

jtolio commented 1 year ago

Hello!

With a relatively recent maxmind country-level DB (let me know if you want my exact copy), we have found at least one IP address that decodes to a different country code based on which de-serialization type is used. This is a bit concerning (we expect it has something to do with registered vs represented countries but we haven't nailed it down).

Here's the difference:

var data map[string]map[string]any
reader.Lookup("95.140.157.134", &data)
fmt.Println(data["country"]["iso_code"])

This says Russia.

var data struct {
    Country struct {
        IsoCode string `maxminddb:"iso_code"`
    } `maxminddb:"country"`
}
reader.Lookup("95.140.157.134", &data)
fmt.Println(data.Country.IsoCode)

This says Poland.

What's going on here?

jtolio commented 1 year ago

Additional information:

If I do the first style (map-based decoding), both "country" and "registered_country" say Russia.

However, if I do the second style (struct-based decoding), "country" is Poland, and "registered_country" is Russia.

So I think map-style is broken.

oschwald commented 1 year ago

I can't reproduce this:

package main

import (
    "fmt"
    "net"

    "github.com/oschwald/maxminddb-golang"
)

func main() {
    reader, err := maxminddb.Open("/usr/local/share/GeoIP/GeoLite2-City.mmdb")
    if err != nil {
        panic(err)
    }
    var data map[string]any
    err = reader.Lookup(net.ParseIP("95.140.157.134"), &data)
    if err != nil {
        panic(err)
    }
    fmt.Println(data["country"].(map[string]any)["iso_code"])

    var data2 struct {
        Country struct {
            IsoCode string `maxminddb:"iso_code"`
        } `maxminddb:"country"`
    }
    err = reader.Lookup(net.ParseIP("95.140.157.134"), &data2)
    if err != nil {
        panic(err)
    }
    fmt.Println(data2.Country.IsoCode)
}

Output:

PL
PL

I am guessing you are ignoring the error from Lookup, which might be related. I had to change the type to something that it could decode to.

jtolio commented 1 year ago

I only removed the error handling for the purposes of this issue submission. Here is a full test for me:

    type countryCode struct {
        Country struct {
            IsoCode string `maxminddb:"iso_code"`
        } `maxminddb:"country"`
    }

    var data countryCode
    err = mapper.reader.Lookup(net.ParseIP("95.140.157.134"), &data)
    if err != nil {
        panic(err)
    }
    fmt.Println(data.Country.IsoCode)

    var data2 map[string]map[string]any
    err = mapper.reader.Lookup(net.ParseIP("95.140.157.134"), &data2)
    if err != nil {
        panic(err)
    }
    fmt.Println(data2["country"]["iso_code"])

    var data3 map[string]any
    err = mapper.reader.Lookup(net.ParseIP("95.140.157.134"), &data3)
    if err != nil {
        panic(err)
    }
    fmt.Println(data3["country"].(map[string]any)["iso_code"])

This outputs:

PL
RU
PL

Notably, your formulation (map[string]any instead of map[string]map[string]any) works as expected.

oschwald commented 1 year ago

As suggested above, your version fails for me. I receive:

panic: maxminddb: cannot unmarshal array into type map[string]interface {}

goroutine 1 [running]:
main.main()
    /home/greg/MaxMind/maxminddb-golang/t/main.go:26 +0x2ba
exit status 2

This is because the subdivisions are an array, not map.

If I ignore the error, I see the behavior you describe. However, I wouldn't expect there to be valid data set by a method that returns an error and I wouldn't consider that a bug.

jtolio commented 1 year ago

This fails, even with the latest tag, v1.11.0? I didn't think to try the latest dev branch, but I will do that tomorrow. Perhaps there is some difference between your branch and the tagged release I am using. All of the returned errors are nil in my code--none of the panics in my example code above are triggered. Could this difference be in the format of the mmdb file I have?

oschwald commented 1 year ago

I have found an issue that I think is the cause of what you are seeing. I'll do a release shortly.

oschwald commented 1 year ago

1.12.0 has been released.

That said, I wouldn't recommend decoding to a map[string]map[string]any. It is much slower than decoding to a struct that contains just the data you want and it is more fragile than decoding to map[string]any as it makes possibly unfounded assumptions about what is in the database.

jtolio commented 1 year ago

Great, thank you! 1.12.0 is fixed. We do use the struct approach, but upon discovering this discrepancy we were worried which method was correct.