gaissmai / bart

The Balanced Routing Table is an adaptation of D. Knuth's ART algorithm combined with popcount level compression and backtracking. It is somewhat slower than ART, but requires considerably less memory.
MIT License
34 stars 3 forks source link

bart: extend functionality to make it more useful for implementing a RIB #22

Closed tobez closed 6 months ago

tobez commented 6 months ago

I am thinking about these new functions:

// Find exact match to pfx, or insert pfx.
// Returns pointer to value, and whether pfx was inserted.
func (t *Table[V]) FetchOrInsert(pfx netip.Prefix) (*V, bool) {...}

This will allow to perform modifications to existing values. Insert() can then be implemented like

func (t *Table[V]) Insert(pfx netip.Prefix, val V) {
    vp, _ := t.FetchOrInsert(pfx)
    *vp = val
}
// Find  like Lookup but by prefix
// Find does a route lookup for prefix and returns the longest prefix,
// pointer to the associated value and true for success, or false otherwise if
// no route matched.
func (t *Table[V]) Find(pfx netip.Prefix) (lpm netip.Prefix, pv *V, ok bool) {...}

// FindAll  Slice of all matching prefixes and pointers to their values, longest first
func (t *Table[V]) FindAll(pfx netip.Prefix) []struct { pfx netip.Prefix, pv *V }  {...}

I have a ready implementation for fetchOrInsert() but not for the others.

tobez commented 6 months ago

I hit "send" prematurely, so I had to edit the original issue text. My apologies.

gaissmai commented 6 months ago

What should FetchOrInsertbe useful for? At most I could imagine a Fetchmethod or let's call it Find, which searches for the exact prefix in the routing table.

But actually bart is a routing table algorithm for longest-prefix-match, the user can also hold in another data structure the information which tuple (prefix/value) he holds in the routing table.

gaissmai commented 6 months ago

What is the use case for Find(prefix) and FindAll(prefix)?

And the name Find is also unfortunate, as it searches for the longest-prefix, Findsuggests to me an exact match.

tobez commented 6 months ago

FetchOrInsert (or FindOrInsert, I am not insisting on the exact naming) is essentially an optimization and can easily be replaced by the two calls, to ExactFind and to Insert if not found, as you mention. The reason I think it is worth to have it as a standalone call is to avoid repeating 95% of the operations twice, since the execution of both will be identical until the final decision point.

The use case for it is for example for a BGP implementation or a BMP listener getting information from multiple peers/vantage points and collating this information together in a single table (instead of keeping one per peer). In this scenario, a lot of the operations will be updates to the existing structure associated with a given prefix, with occasional inserts. Keeping such information on a side is possible but a waste of memory since the existing RT is perfectly adequate.

Similarly, LPMFind(prefix) (for the lack of a better name; we are running out of verbs) is quite essential since without it the whole implementation is kinda write-only. With only Get(ip) or Lookup(ip) BART is only ever useful for implementing a routing table in the strictest sense of the term.

While ExactFind(prefix) can be emulated by LCMFind(prefix) plus a check whether we overshot, its presence makes the whole API a bit more rounded. Also, it is probably slightly faster.

After some thought, I find that FindAll(prefix) is a bit niche, as it can be easily simulated by a loop of LPMFind(prefix) calls. (Found 10.0.0.0/24 ? Now do LPMFind("10.0.0.0/23"). Found 10.0.0.0/8? Now do LPMFind("10.0.0.0/7"). Not found? Stop.)

So, to sum it up, I think that LCMFind(prefix) is essential in general, either StrictFind(prefix) or FindOrInsert(prefix) or both are needed for the use case I have in mind, and let's drop the FindAll().

gaissmai commented 6 months ago

Insert() has already the update feature, maybe the naming should be Update() instead of insert? What is missing is the Value() instead of ExactFind(), if really needed.

What do you think about this API naming and feature change? We are still below v1.0.0, an API change is cheap and the user base is small.

  func (t *Table[V]) Update(pfx netip.Prefix, new V) (old V, ok bool)
  func (t *Table[V]) Value(pfx netip) (val V, ok bool)

Insert() could be marked as deprecated.

Let us first discuss this part of your proposal, we will come back to the prefix lookup methods afterwards.

tobez commented 6 months ago

I have no issue with the naming, but there's a slight problem with the API. For example, using pointer-based FetchOrInsert() a simple counter like this is trivial:

var poInput = []netip.Prefix{
        p("10.0.0.0/8"),
        p("10.0.0.0/16"),
        p("10.0.0.0/24"),
        p("10.0.1.0/24"),
        p("10.0.0.0/16"),
        p("10.0.0.0/24"),
        p("10.0.1.0/24"),
        p("10.0.0.0/24"),
        p("10.0.1.0/24"),
        p("10.0.1.0/24"),
}

...

       it := new(Table[int])
        for _, item := range poInput {
                iv, found := it.FetchOrInsert(item)
                *iv++
        }
        it.Fprint(os.Stdout)
▼
└─ 10.0.0.0/8 (1)
   └─ 10.0.0.0/16 (2)
      ├─ 10.0.0.0/24 (3)
      └─ 10.0.1.0/24 (4)

And with the proposed Update() the problem is that the new value cannot be calculated from the old value because it needs to be known beforehand. So we either need to call Value() first - which will lead to doing the same traversal twice, or we need V to be a pointer itself, and then supply a pointer to zero value as new, and then analyze and potentially copy a lot of information from old. Incidentally, that was the reason I did not include new in the parameters to FetchOrInsert.

I have no issue with Value().

gaissmai commented 6 months ago

But this seems to me to be a very special problem of yours, that the payload is only an integer value and should simply be incremented. It would be better to solve this with a different data structure.

tobez commented 6 months ago

This was just an example which was aiming to illustrate the problem of being unable to derive the new value from the old value, in part or in full.

gaissmai commented 6 months ago

I'm a fan of KISS methods, in naming and in function, I don't like IfThenOr methods, your use case is still very special, I prefer Value() followed by Insert() for your use case.

The Value() method can be implemented without memory allocation. The Value() is very fast, faster than Get/Lookup (no backtracking needed).

If you have big payloads, the value V itself should be a pointer.

tobez commented 6 months ago

I am not sure I agree entirely with having to call both Value() and then Insert() if not found, but I guess I can live with that, so let's consider this settled. Do you want an implementation from me, or is this something you intend to implement yourself?

And now to the LPM prefix lookup. :-)

Do you think just adding

func (t *Table[V]) PrefixLookup(pfx netip.Prefix) (lpm netip.Prefix, val V, ok bool)

is feasible?

gaissmai commented 6 months ago

I'll will implement it, Value() is nearly the same as Delete(), just simpler.

And I'll implement the prefix LPM lookup, the API method name is not fixed yet. See for example also my cidrtree package or the interval package, maybe I'll choose the same names when possible.

gaissmai commented 6 months ago

Value() is implemented in the devel branch, would you like to review it?


What do you think of the following idea to implement your proposal for FetchOrInsert after all?

func (t *Table[V]) Update(pfx netip.Prefix, callback func(V, bool) V)

The update method has a callback function as a parameter in addition to the prefix. The callback function receives the (value, ok) for the prefix and returns the modified or new value, which is then set as the payload for the prefix in the routing table.

tobez commented 6 months ago

Value() is implemented in the devel branch, would you like to review it?

It looks good. I tested this version with my code storing 26375555 prefixes (1208207 unique ones), and it worked fast and correct.

What do you think of the following idea to implement your proposal for FetchOrInsert after all?

func (t *Table[V]) Update(pfx netip.Prefix, callback func(V, bool) V)

The update method has a callback function as a parameter in addition to the prefix. The callback function receives the (value, ok) for the prefix and returns the modified or new value, which is then set as the payload for the prefix in the routing table.

That would be great to have, absolutely. It gives the functionality while avoids dealing with ugliness with pointers.

tobez commented 6 months ago

Have you considered adding a walk (or maybe two, depth-first and breadth-first)?

func (t *Table[V]) Walk(callback func(netip.Prefix, V) bool) int

Callback returning false would terminate the walk. The result would be the number of prefixes visited.

Having it there would probably make stringification and jsonification simpler, but would be useful beyond that.

gaissmai commented 6 months ago

Have you considered adding a walk (or maybe two, depth-first and breadth-first)?

func (t *Table[V]) Walk(callback func(netip.Prefix, V) bool) int

Callback returning false would terminate the walk. The result would be the number of prefixes visited.

The implementation should be simple, since the MarshalXXX is already doing a walk. Will you contribute it?

tobez commented 6 months ago

The implementation should be simple, since the MarshalXXX is already doing a walk. Will you contribute it?

Sure, I'll give it a shot.

gaissmai commented 6 months ago

The implementation should be simple, since the MarshalXXX is already doing a walk. Will you contribute it?

Sure, I'll give it a shot.

The question is: how is depth-first or breadth-first defined for CIDRs ;)

gaissmai commented 6 months ago

metrics and dump are also already walking the trie, just to get some inspiration.

tobez commented 6 months ago

The question is: how is depth-first or breadth-first defined for CIDRs ;)

It's still essentially a trie, so those things can be defined, if a bit loosely. Something like, go through the stride before recursing = breadth-first, recurse as soon as possible = depth-first. But it is a bit contrieved, so a single walk using a "natural order" should suffice.

gaissmai commented 6 months ago

back to the update proposal, which callback function signature fits better for your use cases: 1.)

func (t *Table[V]) Update(pfx netip.Prefix, callback func(V, bool) V)

2.)

func (t *Table[V]) Update(pfx netip.Prefix, callback func(netip.Prefix, V, bool) V)

background, with signature nr. 2 the callback would have the whole context: (pfx, value, ok) independent from the callsite

tobez commented 6 months ago

I think (1) is quite sufficient since the prefix can be made readily available if needed via using a closure as a callback.

gaissmai commented 6 months ago

next question, what do you prefer?

1.)

func (t *Table[V]) Update(pfx netip.Prefix, callback func(V, bool) V)

2.)

func (t *Table[V]) Update(pfx netip.Prefix, callback func(V, bool) V) (val V)

returning the updated value to the callsite

tobez commented 6 months ago

Maybe (2) is slightly more useful here.

gaissmai commented 6 months ago

I just updated the devel branch with a work in progress implementation, no tests yet. Would you be so kind and review it?

gaissmai commented 6 months ago

Have you considered adding a walk (or maybe two, depth-first and breadth-first)?

func (t *Table[V]) Walk(callback func(netip.Prefix, V) bool) int

Callback returning false would terminate the walk. The result would be the number of prefixes visited.

Having it there would probably make stringification and jsonification simpler, but would be useful beyond that.

I don't think so, since stringification is a very complex recdescent monster, not just a simple recursive walk. The order from top to bottom is in ascending order of the prefix address but the subtree structure is determined by the CIDRs coverage, this is really very tricky. But maybe, you find a simpler solution.

gaissmai commented 6 months ago

devel branch: Update() is now as fast as Insert(), and a test is also included.

gaissmai commented 6 months ago

devel branch: LookupPrefix() is implemented

gaissmai commented 6 months ago

devel branch: LookupPrefixAll() is implemented, comments about naming and function signature welcome.

gaissmai commented 6 months ago

Now with the lookup-all method, returning all matching prefixes, I'll delete the LookupShortest(ip), it was just a special use case for me, now also possible with LookupPrefixAll(pfx). Sure, LookupPrefixAll can't be super fast, but I think, we must be as fast as possible with Get(ip)/Lookup(ip)/LookupPrefix(pfx) and Insert, Update, Delete, the others are convenience methods with a lot of memory allocations anyway.

gaissmai commented 6 months ago

@maisem @tobez @s3rj1k @mieczkowski

Im not happy with the naming and signatures of the API.

I propose the following simplification:

// longest prefix match
func (t *Table[V]) Get(ip netip.Addr)       (                  val V, ok bool) // input ip
func (t *Table[V]) Lookup(pfx netip.Prefix) (lpm netip.Prefix, val V, ok bool) // input pfx, get the lpm too
func (t *Table[V]) Supernets(pfx netip.Prefix) []netip.Prefix  // all prefixes covering pfx, sorted

func (t *Table[V]) Value(pfx netip.Prefix) (val V, ok bool)    // getter for pfx, or...
func (t *Table[V]) Find(pfx netip.Prefix)  (val V, ok bool)    // ...exact match, instead of Value() ???

Get() is for normal and fast(est) routing queries, the rest tells more about the structure of the routing table and used for table handling itself by routing processes or needed as IPAM functions.

Comments welcome.

s3rj1k commented 6 months ago

IMO: As long as changelog for tagged release provides hints on breaking changes / migration, everything should be fine.

gaissmai commented 6 months ago

Have you considered adding a walk (or maybe two, depth-first and breadth-first)?

func (t *Table[V]) Walk(callback func(netip.Prefix, V) bool) int

I need it already for Subnets(), will implement it somehow itself.

tobez commented 6 months ago

@maisem @tobez @s3rj1k @mieczkowski

Im not happy with the naming and signatures of the API.

I propose the following simplification:

// longest prefix match
func (t *Table[V]) Get(ip netip.Addr)       (                  val V, ok bool) // input ip
func (t *Table[V]) Lookup(pfx netip.Prefix) (lpm netip.Prefix, val V, ok bool) // input pfx, get the lpm too
func (t *Table[V]) Supernets(pfx netip.Prefix) []netip.Prefix  // all prefixes covering pfx, sorted

func (t *Table[V]) Value(pfx netip.Prefix) (val V, ok bool)    // getter for pfx, or...
func (t *Table[V]) Find(pfx netip.Prefix)  (val V, ok bool)    // ...exact match, instead of Value() ???

Get() is for normal and fast(est) routing queries, the rest tells more about the structure of the routing table and used for table handling itself by routing processes or needed as IPAM functions.

Comments welcome.

I've been wondering whether Get() and Value() can be neatly polymorphed into a single Get(), taking either an Addr or a Prefix? "Value" is not a verb, and Get/Find/Lookup gets confusing, fast.

Is there something better than a plain interface{} and a type switch in the implementation? Even that would work nicely I think, without loosing performance in any meaningful way.

gaissmai commented 6 months ago

please see the current tip in devel, it has currently the following 'streamlined' API, Value is now called Get,it's just a getter.

And Lookup is (currently) the single longest-prefix-match:

  func (t *Table[V]) Insert(pfx netip.Prefix, val V)
  func (t *Table[V]) Delete(pfx netip.Prefix)
  func (t *Table[V]) Get(pfx netip.Prefix) (val V, ok bool)
  func (t *Table[V]) Update(pfx netip.Prefix, cb func(val V, ok bool) V) V

  func (t *Table[V]) Union(o *Table[V])
  func (t *Table[V]) Clone() *Table[V]

  func (t *Table[V]) Lookup(ip netip.Addr) (val V, ok bool)
  func (t *Table[V]) Subnets(pfx netip.Prefix) []netip.Prefix
  func (t *Table[V]) Supernets(pfx netip.Prefix) []netip.Prefix

  func (t *Table[V]) Overlaps(o *Table[V]) bool
  func (t *Table[V]) OverlapsPrefix(pfx netip.Prefix) bool

  func (t *Table[V]) String() string
  func (t *Table[V]) Fprint(w io.Writer) error
  func (t *Table[V]) MarshalText() ([]byte, error)
  func (t *Table[V]) MarshalJSON() ([]byte, error)

  func (t *Table[V]) Walk(cb func(pfx netip.Prefix, val V) bool) bool
  func (t *Table[V]) Walk4(cb func(pfx netip.Prefix, val V) bool) bool
  func (t *Table[V]) Walk6(cb func(pfx netip.Prefix, val V) bool) bool

  func (t *Table[V]) DumpList4() []DumpListNode[V]
  func (t *Table[V]) DumpList6() []DumpListNode[V]
tobez commented 6 months ago

please see the current tip in devel, it has currently the following 'streamlined' API, Value is now called Get,it's just a getter.

Neat.

And Lookup is (currently) the single longest-prefix-match:

  func (t *Table[V]) Get(pfx netip.Prefix) (val V, ok bool)
  func (t *Table[V]) Lookup(ip netip.Addr) (val V, ok bool)

But what do I do if I want to LPM a Prefix?

gaissmai commented 6 months ago

Neat.

And Lookup is (currently) the single longest-prefix-match:

  func (t *Table[V]) Get(pfx netip.Prefix) (val V, ok bool)
  func (t *Table[V]) Lookup(ip netip.Addr) (val V, ok bool)

But what do I do if I want to LPM a Prefix?

do you really need a fast LPM lookup of a prefix?

For what use case? If it must not be super fast, you could take the last element of the Supernets() returned slice followed by a Get(lpm).

Another possibility would of course be to have only one Lookup(pfx) method that takes a prefix, because the caller can of course always turn an IP into a /32 or /128 prefix, but I think that you are mainly looking for IP addresses in a routing table and less for prefixes then this would be inconvenient and takes also some ns.

If you really need a Lookup(pfx), then please give me a suitable method name.

tobez commented 6 months ago

What I was thinking is having Lookup accept both, something like:

func (t *Table[V]) Lookup(p interface{}) (val V, ok bool) {
    var pfx netip.Prefix
    switch pp := p.(type) {
    case netip.Addr:
        if pp.Is4() {
            pfx = pp.Prefix(32)
        } else {
            pfx = pp.Prefix(128)
        }
    case netip.Prefix:
        pfx = pp
    default:
        // should not happen
        return
    }
    // the rest of the Lookup based on Prefix
    ...
}
gaissmai commented 6 months ago

hmm, giving up compile time errors for convenience and having generics and interfaces in one library is not my favourite. And I loose some ns

Again my question: do you really need a superfast Lookup(pfx) ?

tobez commented 6 months ago

hmm, giving up compile time errors for convenience and having generics and interfaces in one library is not my favourite. And I loose some ns

Fair enough.

Again my question: do you really need a superfast Lookup(pfx) ?

No not really, having Supernets() + Get() should be enough.

gaissmai commented 6 months ago

No not really, having Supernets() + Get() should be enough.

yeah, please give it a try and if not suitable, we can still re-implement LookupPrefix(pfx) even if it's an ugly name ;)

gaissmai commented 6 months ago

branch devel: rebased and squashed intermediate WIPs, tests are implemented, also some Benchmarks for the new methods.

Ready to test for the interested reader. Please fetch it again

gaissmai commented 6 months ago

@tobez sorry, could not resist:

func (t *Table[V]) Lookup2(pfx netip.Prefix) (lpm netip.Prefix, val V, ok bool)
    Lookup2 is similar to Lookup, but has a prefix as input parameter and
    returns the lpm prefix in addition to value,ok.

    This method is about 15-20% slower than Lookup and should only be used if
    you either explicitly have a prefix as an input parameter or the prefix of
    the matching lpm entry is also required for other reasons.

    If Lookup2 is to be used for IP addresses, they must be converted to /32 or
    /128 prefixes.
s3rj1k commented 6 months ago

@tobez sorry, could not resist:

func (t *Table[V]) Lookup2(pfx netip.Prefix) (lpm netip.Prefix, val V, ok bool)
    Lookup2 is similar to Lookup, but has a prefix as input parameter and
    returns the lpm prefix in addition to value,ok.

    This method is about 15-20% slower than Lookup and should only be used if
    you either explicitly have a prefix as an input parameter or the prefix of
    the matching lpm entry is also required for other reasons.

    If Lookup2 is to be used for IP addresses, they must be converted to /32 or
    /128 prefixes.

I think 2 is misleading when module is about IP :)

gaissmai commented 6 months ago

I think 2 is misleading when module is about IP :)

hmm, IP versions 6-4=2 joking aside, I don't understand what you mean

ah, I've overseen your smiley

s3rj1k commented 6 months ago

I think 2 is misleading when module is about IP :)

hmm, IP versions 6-4=2 joking aside, I don't understand what you mean

Just call it SlowLookup or anything other that Lookup2 :) where is 2 there can be 3 or even 4 and 4 is IP version)

or maybe

func (t *Table[V]) Lookup(pfx ...netip.Prefix) (lpm netip.Prefix, val V, ok bool)

?

gaissmai commented 6 months ago

50ns instead of 40ns is not really slow for a full Internet IPv4 routing table, the additional ns comes from getting back the lpm prefix since the prefix is not stored as a whole value in the datastructure

gaissmai commented 6 months ago

or maybe

func (t *Table[V]) Lookup(pfx ...netip.Prefix) (lpm netip.Prefix, val V, ok bool)

not possible with go, we have now the following LPM API:

  func (t *Table[V]) Lookup(ip netip.Addr) (val V, ok bool)
  func (t *Table[V]) Lookup2(pfx netip.Prefix) (lpm netip.Prefix, val V, ok bool)
tobez commented 6 months ago

Just LookupPrefix(), mayhaps?

s3rj1k commented 6 months ago

not possible with go

variadic input? why?

gaissmai commented 6 months ago

not possible with go

variadic input? why?

not variadic input, different signatures for input parameters. The first and fastest default Lookup() takes an IP address and Lookup2 a prefix AND has an additional result value.