oschwald / maxminddb-golang

MaxMind DB Reader for Go
ISC License
576 stars 99 forks source link

When an IPv4 is used inside an IPv6 16 bytes array the prefix is zeroed instead of V4 prefixed #56

Closed elico closed 4 years ago

elico commented 4 years ago

Related wrong code: https://github.com/oschwald/maxminddb-golang/blob/master/traverse.go#L26

The GoLang net.IP.To4() function detects if an IPv4 exists inside the IPv6 Byte array. Since this library uses a zeroed instead of the predefined one: var v4InV6Prefix = []byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff}

The net.IPNet.IP.To4() always returns a nil instead of 4 Byte array. To reproduce the Fix I am currently using I created the next example: https://play.golang.org/p/y-A-SC8fs9u

To understand better, this is the To4() function source

// To4 converts the IPv4 address ip to a 4-byte representation.
// If ip is not an IPv4 address, To4 returns nil.
func (ip IP) To4() IP {
    if len(ip) == IPv4len {
        return ip
    }
    if len(ip) == IPv6len &&
        isZeros(ip[0:10]) &&
        ip[10] == 0xff &&
        ip[11] == 0xff {
        return ip[12:16]
    }
    return nil
}

To fix this issue I propose to make a small adjustment to the sources at the relevant function: https://play.golang.org/p/nj68eVsVbRv

oschwald commented 4 years ago

I am not clear on what problem you are trying to solve. Would you be able to provide a test case that shows the issue? The only place To4() is used is in reader.go. If the issue is the IPv4 addresses at ::/96, they also are at ::ffff:0:0/96 in MaxMind databases. The fact that all locations are traversed is documented.

elico commented 4 years ago

@oschwald The issue is not about To4() exactly but rather the way the library implements IPv4 inside IPv6 ie IPv4 inside a 16 Bytes array.

The test case is very simple: I am running the ExampleReader_Networks from: https://github.com/oschwald/maxminddb-golang/blob/master/example_test.go#L56

This function is expecting the output:

    // Output:
    // ::100:0/120: Dialup
...
    // ::50d6:0/116: Cellular
...
    // ::cfb3:3000/116: Cellular
    // 1.0.0.0/24: Dialup

The first IPv6 CIDRs ::100:0/120 are wrong since they have zeroed Bytes from the 0 to 11. There are no such IPv6 addresses however it resembles the IPv4 inside IPv6 format with just one difference: the IPv4 inside IPv6 Bytes array is expected to bytes 10 and 11 with the value 0xff.

I ran a simple test on the db file: GeoLite2-Country_20190806/GeoLite2-Country.mmdb


// Is p all zeros?
func isZeros(p net.IP) bool {
    for i := 0; i < len(p); i++ {
        if p[i] != 0 {
            return false
        }
    }
    return true
}

// SanitizeIPv6 converts the IPv6 address ip to a 4-byte representation.
// If ip is not an IPv4 address, SanitizeIPv6 returns the original IP object.
func SanitizeIPv6(ip net.IP) net.IP {
    var v4LeadingPrefix bool
    var twealveLeadingZeroes bool

    if isZeros(ip[0:10]) && ip[10] == 0xff && ip[11] == 0xff {
        v4LeadingPrefix = true
    }
    if isZeros(ip[0:11]) {
        twealveLeadingZeroes = true
    }
    if len(ip) == net.IPv6len && !v4LeadingPrefix && twealveLeadingZeroes {
        ip[10] = 0xff
        ip[11] = 0xff
    }
    return ip
}

// SanitizeIPv6IPNet converts the IPv6 address ip to a 4-byte representation.
// If ip is not an IPv4 address, SanitizeIPv6IPNet converts returns the original IP object.
func SanitizeIPv6IPNet(ip *net.IPNet) *net.IPNet {
    var v4LeadingPrefix bool
    var twealveLeadingZeroes bool

    if isZeros(ip.IP[0:10]) && ip.IP[10] == 0xff && ip.IP[11] == 0xff {
        v4LeadingPrefix = true
    }
    if isZeros(ip.IP[0:11]) {
        twealveLeadingZeroes = true
    }
    if len(ip.IP) == net.IPv6len && !v4LeadingPrefix && twealveLeadingZeroes {
        ip.IP[10] = 0xff
        ip.IP[11] = 0xff
    }
    return ip
}

func TestMaxMinddb(t *testing.T) {
    geoliteCountryFile := "test-files/GeoLite2-Country_20190806/GeoLite2-Country.mmdb"

    db, err := maxminddb.Open(geoliteCountryFile)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()
    fmt.Println(db.Metadata.IPVersion)

    var country Country

    networks := db.Networks()
    i := 0
    for networks.Next() {
        if i == 10 {
            break
        }
        subnet, err := networks.Network(&country)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println("------------------")
        fmt.Printf("%+v => %v\n", subnet, country.Country.IsoCode)
        fmt.Printf("%+v => %v\n", []byte(subnet.IP), country.Country.IsoCode)

        fmt.Println("----")
        SanitizeIPv6IPNet(subnet)
        fmt.Printf("%+v => %v\n", subnet, country.Country.IsoCode)
        fmt.Printf("%+v => %v\n", []byte(subnet.IP), country.Country.IsoCode)

        i = i + 1
    }
}

The output is showing that for some reason a valid IPv4 CIDR block which is registered under AU and many others are displayed wrongly ie: ::100:0/120 is actually a 1.0.0.0/24 and was supposed to be represented by the IPv6 Address bytes

Running tool: /usr/local/go/bin/go test -timeout 30s github.com/elico/ipfilter -run ^(TestMaxMinddb)$ -v

=== RUN   TestMaxMinddb
6
------------------
::100:0/120 => AU
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0] => AU
----
1.0.0.0/24 => AU
[0 0 0 0 0 0 0 0 0 0 255 255 1 0 0 0] => AU
------------------
::100:100/120 => CN
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0] => CN
----
1.0.1.0/24 => CN
[0 0 0 0 0 0 0 0 0 0 255 255 1 0 1 0] => CN
------------------
::100:200/119 => CN
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 2 0] => CN
----
1.0.2.0/23 => CN
[0 0 0 0 0 0 0 0 0 0 255 255 1 0 2 0] => CN
------------------
::100:400/118 => AU
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 4 0] => AU
----
1.0.4.0/22 => AU
[0 0 0 0 0 0 0 0 0 0 255 255 1 0 4 0] => AU
------------------
::100:800/117 => CN
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 8 0] => CN
----
1.0.8.0/21 => CN
[0 0 0 0 0 0 0 0 0 0 255 255 1 0 8 0] => CN
------------------
::100:1000/116 => JP
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 16 0] => JP
----
1.0.16.0/20 => JP
[0 0 0 0 0 0 0 0 0 0 255 255 1 0 16 0] => JP
------------------
::100:2000/115 => CN
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 32 0] => CN
----
1.0.32.0/19 => CN
[0 0 0 0 0 0 0 0 0 0 255 255 1 0 32 0] => CN
------------------
::100:4000/114 => JP
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 64 0] => JP
----
1.0.64.0/18 => JP
[0 0 0 0 0 0 0 0 0 0 255 255 1 0 64 0] => JP
------------------
::100:8000/113 => TH
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 128 0] => TH
----
1.0.128.0/17 => TH
[0 0 0 0 0 0 0 0 0 0 255 255 1 0 128 0] => TH
------------------
::101:0/120 => CN
[0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0] => CN
----
1.1.0.0/24 => CN
[0 0 0 0 0 0 0 0 0 0 255 255 1 1 0 0] => CN
--- PASS: TestMaxMinddb (0.01s)
PASS
ok      github.com/elico/ipfilter   0.009s
Success: Tests passed.

My solution is a very basic one which do not fix the original issue which is a bad DB uint128 value decoding/reading. It would be much smarted to create the uint128 which represents a IPv4 as expected by iana documentation and with a ip[10], ip[11]= 0xff , 0xff instead of 0 value.

oschwald commented 4 years ago

@elico, Networks() is currently behaving as documented. It traverses all records in the database, including the multiple locations of the IPv4 range. The proposed PR would further confuse matters as it makes it seem like the IPv6 mapped IPv4 range is repeated, as your modified example demonstrates.

A PR that I would consider would be an optional flag to skip all of the IPv4 locations that MaxMind uses other than ::ffff:0:0/96, i.e., ::/96, 2001::/32, and 2002::/16. See the relevant writer code. Again, this would need to be optional and the behavior for existing users would need to remain unchanged.

elico commented 4 years ago

@oschwald OK I will close the PR since it's relevant but I will try to followup in the issue to understand what might be wrong and how...

elico commented 4 years ago

@oschwald Where do I see the MaxMind DB documentation about the existence these ::/96 and other ranges? The code reference indeed helps but it's really not clear to me why would the DB need to hold two records which technically holds the same record.

oschwald commented 4 years ago

The database only holds one copy, but the copy appears in multiple places, namely:

These show up in the IPv6 space as they are all ways to map IPv6 addresses to IPv4 for different purposes.

oschwald commented 3 years ago

master now allows you to pass the maxminddb.SkipAliasedNetworks option to the Networks method, which will skip these duplicate copies of the IPv4 network.

elico commented 3 years ago

@oschwald Thanks!