gaissmai / bart

The Balanced Routing Table is an adaptation of D. Knuth's ART algorithm combined with popcount level compression and backtracking. It is somewhat slower than ART, but requires considerably less memory.
MIT License
30 stars 3 forks source link

Non allocating `Subnets` function with a yield similar to `All` #61

Closed nbrownus closed 2 months ago

nbrownus commented 3 months ago

I have a use case where I need to assemble all entries for a given prefix but need to do it without allocating on the heap.

Essentially what I am looking for is func (t *Table[V]) EachSubnets(pfx netip.Prefix, yield func(pfx netip.Prefix, val V) bool) that has a similar early exit option as Table.All

Would you opposed to including this API?

Thanks for this library, it looks fantastic.

gaissmai commented 3 months ago

@nbrownus I was already thinking about it, but wanted to wait for go1.23 because of the then standard iter package and range-over-functions. In any case, the sort order of the subnets will then be undefined. Does that still fit your use case?

nbrownus commented 3 months ago

Yep, that would be perfect!

gaissmai commented 3 months ago

I will implement it this weekend in the devel branch and keep you informed. But be warned, the API may still change after the release of go1.23, but the API hasn't reached v1.0.0 yet, so this should be doable.

gaissmai commented 3 months ago

@nbrownus In the devel branch you will now find the function EachSubnet(). Please test it and give feedback about the functionality and naming, maybe you can improve something as a native speaker, comments welcome.

nbrownus commented 3 months ago

Had a quick look and realized what I asked for was entirely wrong for my use case :(

What I actually need is something like func (t *Table[V]) EachMatch(ip netip.Addr, yield func(pfx netip.Prefix, val V) bool) which would give me each netip.Prefix in the table that contains ip until the yield func returns false.

Sorry about that

gaissmai commented 3 months ago

hm, sounds like EachSupernet(pfx netip.Prefix, yield func(pfx netip.Prefix, val V) bool), and if you need this just for IPs, then convert them to /32 or /128 prefixes at the call site with netip.PrefixFrom(ip, ip.BitLen())

You can test it with the current Supernets() if the result is what you need.

gaissmai commented 3 months ago

please see and test the new method in the devel branch:

func (t *Table[V]) EachSupernet(pfx netip.Prefix, yield func(pfx netip.Prefix, val V) bool)
    EachSupernet calls yield() for each CIDR covering pfx in natural CIDR sort
    order.

    If the yield function returns false, the iteration ends prematurely.
gaissmai commented 3 months ago

The devel branch is now merged into main and tagged with v0.11.0, you can test now your use case with a normal import.

gaissmai commented 3 months ago

v0.11.1 is released, your needed function is now renamed and has an improved practical sort order:

func (t *Table[V]) EachLookupPrefix(pfx netip.Prefix, yield func(pfx netip.Prefix, val V) bool)
    EachLookupPrefix calls yield() for each CIDR covering pfx in reverse CIDR
    sort order, from longest-prefix-match to shortest-prefix-match.

    If the yield function returns false, the iteration ends prematurely.

the first call to yield is with the longest-prefix-match and so on. If you early stop in the yield function after the first call, the result is identical to LookupPrefixLPM, just 20% slower but with the added benefit to yield all matching CIDRs in reverse sort order.

The function EachLookupPrefix does also not allocate.

tobez commented 2 months ago

@gaissmai , hmm, but why did you you remove allocating Subnets() and Supernets() though? Sometimes one needs to iterate, sometimes one needs to gather. Of course it's easy to do a for ... range ... append to get the slices, but it was convenient to have it there.

gaissmai commented 2 months ago

As you write, it's easy to collect them, just write a wrapper function.

And, especially Subnets can grow huge, you can get the whole trie back if you are not careful. In contrast, for Supernets there is a maximum of 33 matching prefixes (IPv4) or 129 prefixes (IPv6). But anyway, I would have to support two functions in an ever growing API with similar or identical use cases.

The user can implement it with few lines of code and much more memory efficient. But sure, it can be discussed, maybe you can convince me.

tobez commented 2 months ago

To be honest, I do not have a use case for Subnets. For Supernets, however, the use case is as follows.

There is a moderately expensive operation on a value V that needs to be implemented for each prefix containing a given IP/prefix, while not hindering the updates too much. The operation is not cheap enough to just do it all in one fell swoop under a lock, and is not expensive enough to do a deep copy of V (which is expensive by itself) and use channels to offload the computation. So: read lock, call Supernets, unlock, loop through Supernet slices, within the loop: read lock, do LookupPrefix, do computation, unlock.

But collecting the supernets from the iterator is easy, so I am not insisting on Supernets() resurrection.