golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.66k stars 17.61k forks source link

net/netip: add Prefix.Compare and AddrPort.Compare #61642

Open danderson opened 1 year ago

danderson commented 1 year ago

In our code, we've ended up writing ad-hoc ordering functions for netip.Prefix a whole bunch of times, for use with sort.Slice and the new slices.SortFunc. Writing those ordering functions is repetitive, and just tricky enough that it's easy to write comparators that don't conform to the contracts that sort and slices specify.

I propose adding Prefix.Less and Prefix.Compare to net/netip, so that people don't have to write their own ordering functions for prefixes. Less is a trivial wrapper around Compare. Compare orders two prefixes as follows:

An example ordering:

0.0.0.0/0
10.0.0.0/8
11.0.0.0/8
10.0.0.0/24
::/0

Back when we first wrote net/netip, we hesitated to add Less and Compare to prefix, because there didn't seem to be an obvious order we could impose: Prefixes organize themselves naturally into a binary tree rather than a flat list. So, Contains has an obvious definition, as does Overlaps, but Less and Compare would require us to come up with some arbitrary flattening of the tree to make sense.

Now that we have a bit more experience with using net/netip in the wild, I have two arguments for implementing Less and Compare as defined above:

While we're in there, AddrPort also lacks comparators, so we could add those as well. The above discussion doesn't apply to AddrPort, in that there is an obvious ordering available (order by Addr first, then Port). Looking back, I believe we simply forgot to make AddrPort's API consistent with Addr.

rsc commented 1 year ago

Can we get by with just Compare now?

danderson commented 1 year ago

Probably, yes. Less is still reasonably common when wielding the stdlib in 1.20 due to sort.Slice, but if we think the slices package supersedes that from now on, then it seems fine to make people write return foo.Compare(bar) < 0 instead of return foo.Less(bar). I don't think anyone would miss Less if Compare were provided.

rsc commented 1 year ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

rsc commented 1 year ago

Retitled to be just about Compare but to also add AddrPort. This seems reasonable. Have all remaining concerns about this been addressed?

rsc commented 1 year ago

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

danderson commented 1 year ago

I'm not aware of any remaining concerns. Doing just Compare LGTM, that's the only feedback so far apart from thumbs-ups.

nightlyone commented 1 year ago

I miss a discussion that the suggested ordering is a commonly used one.

Is there any popular software that sorts it that way?

Would be sad, if we later discovered that Go does it differently and a lot of Go software needs to implement it themselves differently to be compatible. (maybe even after getting bug reports about that)

rsc commented 1 year ago

@nightlyone, @danderson wrote above:

Empirically, there does exist a generally accepted ordering for prefixes: it's the ordering from ip route show, and virtually all other network-oriented software.

rsc commented 1 year ago

No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group

gopherbot commented 1 year ago

Change https://go.dev/cl/524616 mentions this issue: net/netip: add AddrPort.Compare and Prefix.Compare

gaissmai commented 10 months ago

@nightlyone, @danderson wrote above:

Empirically, there does exist a generally accepted ordering for prefixes: it's the ordering from ip route show, and virtually all other network-oriented software.

I know I'm way too late to join the discussion, but I just saw the topic in the 1.22 RelNotes. I don't think this is the correct sort order for CIDRs.

I've never seen a routing table in that order: 10.0.0.0/8 0.0.0.0/32

and this would be the sort order of this proposal

eugene-che commented 10 months ago

+1 to @gaissmai

I would argue that the "generally accepted ordering for prefixes" rather sorts by:

  1. Address family (e.g. IPv4 before IPv6)
  2. Then by pfx.Addr(),
  3. Then by ascending pfx.Bits().

E.g.

sh ip route

Gateway of last resort:
 B E      0.0.0.0/0 [200/0] via 100.124.236.40, Ethernet20.5

 B E      10.0.0.0/8 [200/0] via 100.124.236.40, Ethernet20.5
 B E      134.64.0.0/14 [200/0] via 100.124.236.40, Ethernet20.5
 B E      135.186.192.0/18 [200/0] via 100.124.236.40, Ethernet20.5
 B E      135.190.0.0/17 [200/0] via 100.124.236.40, Ethernet20.5
 B E      135.192.0.0/12 [200/0] via 100.124.236.40, Ethernet20.5
 B E      135.236.0.0/14 [200/0] via 100.124.236.40, Ethernet20.5
 B E      135.240.0.0/14 [200/0] via 100.124.236.40, Ethernet20.5
 B E      164.233.160.0/19 [200/0] via 100.124.236.40, Ethernet20.5
 B E      166.102.0.0/20 [200/0] via 100.124.236.40, Ethernet20.5
 B E      166.249.64.0/19 [200/0] via 100.124.236.40, Ethernet20.5
gaissmai commented 10 months ago

e.g. this is a compare function defined in my libraries:

// compare two prefixes and sort by the left address,
// or if equal always sort the superset to the left.
// hint: it is guaranteed that the input pfxs here are already in normalized form,
// a.k.a. netip.Prefix.Masked()
func compare(a, b netip.Prefix) int {
    // compare left points of cidrs
    ll := a.Addr().Compare(b.Addr())

    if ll != 0 {
        return ll
    }

    // ll == 0, sort superset to the left
    aBits := a.Bits()
    bBits := b.Bits()

    switch {
    case aBits < bBits:
        return -1
    case aBits > bBits:
        return 1
    }

    return 0
}

and this is the default sort order used in IP routing tables.

@danderson is it too late to stop this design error?

seankhliao commented 10 months ago

reopening to confirm this is the sort order we want to commit to

gaissmai commented 10 months ago

please see also the sort order from the python netaddr library:

Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from netaddr import *
>>> pfx1 = IPNetwork('0.0.0.0/32')
>>> pfx2 = IPNetwork('10.0.0.0/8')
>>> pfx1 < pfx2
True

in comparison to the output of this proposal:

package main

import (
    "fmt"
    "net/netip"
)

func main() {
    pfx1 := netip.MustParsePrefix("0.0.0.0/32")
    pfx2 := netip.MustParsePrefix("10.0.0.0/8")

    fmt.Println(less(pfx1, pfx2))
}

func less(a, b netip.Prefix) bool {
    return a.Compare(b) == -1
}

// Output: false

play

gaissmai commented 10 months ago

When comparing net/netip.Prefix values, the normalized prefixes must also be taken into account.

I propose the following (natural) sort order:

  1. Address family (e.g. IPv4 before IPv6),
  2. then by base IP addr of prefix: pfx.Masked().ip, (normalized)
  3. then by ascending pfx.bits,
  4. then by pfx.ip (not normalized)

luckily, the algorithm of netip.Addr.Compare() already includes the validity and address family (via Bitlen()), therefore the following steps remain:

  1. By base IP addr of prefix: pfx.Masked().ip, (normalized)
  2. then by ascending pfx.bits,
  3. then by pfx.ip (not normalized)
gaissmai commented 10 months ago

A possible implementation could look like this:

// Compare returns an integer comparing two prefixes.
// The result will be 0 if p == p2, -1 if p < p2, and +1 if p > p2.
// Prefixes sort first
// by validity (invalid before valid),
// then address family (IPv4 before IPv6),
// then prefix base address,
// then prefix length,
// then prefix address.
func (p Prefix) Compare(p2 Prefix) int {
  // compare by validity, address family and prefix base address
  if c := p.Masked().ip.Compare(p2.Masked().ip); c != 0 {
    return c
  }

  // compare by prefix length
  if c := cmp.Compare(p.bits, p2.bits); c != 0 {
    return c
  }

  // compare by prefix address
  return p.ip.Compare(p2.ip)
}
rsc commented 10 months ago

Looks like Python uses IP first, length second: https://docs.python.org/3/library/ipaddress.html#logical-operators It does not appear to make any special masking cases like @gaissmai is suggesting.

If there is an existing standard order, we should use that standard order. Are there other languages besides Python that have orderings? Rust maybe?

It may be too late in the cycle to fix this, meaning we should unexport these methods and try again next cycle. Unless there is an overwhelmingly agreed-upon order we can adopt and we figure that out very soon.

gaissmai commented 10 months ago

We should not compare apples with oranges. The Python library netaddr is more comparable to net/netip, because the IP address of a prefix does not have to be normalized.

With the python library ipaddress there are no non-normalized prefixes allowed, therefore no redundant masking in the first step needed.

Please see the introductory text https://docs.python.org/3/howto/ipaddress.html#ipaddress-howto and the chapter

Defining networks Network objects cannot have any host bits set. The practical effect of this is that 192.0.2.1/24 does not describe a network. Such definitions are referred to as interface objects since the ip-on-a-network notation is commonly used to describe network interfaces of a computer on a given network and are described further in the next section. By default, attempting to create a network object with host bits set will result in ValueError being raised. To request that the additional bits instead be coerced to zero, the flag strict=False can be passed to the constructor

Completely independent of python (but python does it right with the right library in the right context), the suggested sort order does not reflect the output of show ip route , see my proof in the playground.

gaissmai commented 10 months ago
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from ipaddress import *
>>> p1 = ip_network("0.0.0.0/32")
>>> p2 = ip_network("10.0.0.0/8")
>>> p1 < p2
True

same result as with the netaddr python library, but the current implementation of the proposal returns false

gaissmai commented 10 months ago

@danderson @rsc If I can't convince you, maybe the IANA can? https://www.iana.org/assignments/iana-ipv4-special-registry/iana-ipv4-special-registry.xhtml

take the first column as input for the Prefix.Compare() tests, and also for IPv6 (see below), and you will see your proposed algorithm fails:

https://www.iana.org/assignments/ipv6-address-space/ipv6-address-space.xhtml

and then take my proposed sort order algo und compare the tests:

  1. By base IP addr of prefix: pfx.Masked().ip, (normalize the prefix)
  2. then by ascending pfx.bits,
  3. then by pfx.ip (in case the prefix is not normalized)
rsc commented 10 months ago

It sounds like we should unexport Prefix.Compare but we can leave AddrPort.Compare. I will send a CL.

gaissmai commented 10 months ago

@rsc @danderson one more remark, this time to AddrPort.Compare:

// Compare returns an integer comparing two AddrPorts.
// The result will be 0 if p == p2, -1 if p < p2, and +1 if p > p2.
// AddrPorts sort first by IP address, then port.
func (p AddrPort) Compare(p2 AddrPort) int {
    if c := p.Addr().Compare(p2.Addr()); c != 0 {
        return c
    }
    return cmp.Compare(p.Port(), p2.Port())
}

why using inside the package the public methods? Why not just using the private struct members?

// AddrPort is an IP and a port number.
type AddrPort struct {
    ip   Addr
    port uint16
}
func (p AddrPort) Compare(p2 AddrPort) int {
    if c := p.ip.Compare(p2.ip); c != 0 {
        return c
    }
    return cmp.Compare(p.port, p2.port)
}

for me it sounds we should stop this as a whole for this cycle and do it again in the next release. Maybe Dave coded this in hurry.

Maybe a copy-paste error, since Dave used parts of this code snippet already in his own codebase, outside of net/netip

rsc commented 10 months ago

@gaissmai, are you saying there is something buggy about the AddrPort implementation? I don't see anything buggy.

It sounds like you are reading something into the fact that the code uses p.Addr() instead of p.ip and p.Port() instead of p.port. There is absolutely nothing wrong with using exported methods inside a package. In this case they will inline to the basic field accesses.

gopherbot commented 10 months ago

Change https://go.dev/cl/549615 mentions this issue: net/netip: remove Prefix.Compare for Go 1.22

gaissmai commented 10 months ago

@rsc no there is nothing buggy, but it smells in comparison to other methods in the netip code base where the private struct fields are used instead of the public methods, e.g.

func (p AddrPort) IsValid() bool { return p.ip.IsValid() }

and not

func (p AddrPort) IsValid() bool { return p.Addr().IsValid() }

it's just inconsistent and the method calls introduces unnecessary line noise and cognitive load on the reviewer due to another unnecessary indirection.

danderson commented 10 months ago

I'm one of the original authors of this package (also, it's Dave not Dan). It's funny being accused of not having the same style as myself :).

How the internals of the package access stuff really doesn't matter. Looking back at the package history, the access using internal fields is likely a holdover from a previous API of the package, where the fields of AddrPort were exported directly rather than using an accessor method. Whoever did the mechanical change to create the accessors kept the field accesses intact, presumably because it was the easiest search-and-replace to execute. It doesn't represent any complex philosophical choice about code layout, it represents an unimportant piece of the package's historical structure.

If anything, the argument about cognitive load goes in the opposite direction where performance is identical: everyone has to learn the exported API to use the package anyway, so reusing that API internally allows the reader to reuse their existing knowledge of the behavior.

The change to the prefix ordering seems fine to me. As I said originally, what I care most about is that some ordering exists and doesn't force everyone to reimplement their own (and miss edge cases like the zero address). We can make that change for the next release.

danderson commented 10 months ago

https://github.com/danderson/go/commit/4a6a7b81c687f7ab3396504f0f18900564068079 implements the revised ordering, with some additional test cases to lock the order in and verify that the sort matches IANA's ordering.

@rsc I made this commit stack on top of your unexport CL, on the assumption that I'll send it out once the tree opens for 1.23. If you'd rather sneak the corrected ordering into 1.22, happy to unstack the change and send a CL post-haste, your call.

gaissmai commented 10 months ago

@danderson thanks! I think the comparison of the Addr().BitLen() at the beginning of the Prefix.Compare method is superfluous, as this is done again first in the Addr.Compare method. Please have another look and if you disagree please correct me.

danderson commented 10 months ago

You're correct. Removed the explicit bitlen test and added a comment to point out the implicit valid+family ordering. The tests prevent a regression there anyway, but it seems useful to document for a future reader who wonders why the API docs list 5 ordering criteria but only the last 3 explicitly happen.

gaissmai commented 10 months ago

@danderson LGTM and thanks for netip

cherrymui commented 10 months ago

As Prefix.Compare is unexported in Go 1.22, it is not a release blocker for Go 1.22 now. Move to 1.23. Thanks.

gaissmai commented 4 months ago

@danderson May I ask what the status is for Prefix.Compare? What else needs to be done for it to be included in 1.23?