thom311 / libnl

Netlink Library Suite
GNU Lesser General Public License v2.1
419 stars 311 forks source link

mdb implementation inefficient #390

Open KanjiMonster opened 3 months ago

KanjiMonster commented 3 months ago

When mdb support was added, it was partially based on the misbehavior of the cache callback v2 for updated objects recently fixed.

Before the fix, the new argument contained the changes, but not the full database, so any users just needed to handle the entries in new.

But with the fix, they now need to calculate the diff between the two databases, which is not very efficient.

Additionally, since we have exactly one object per bridge that contains all multicast subscriptions, any update notification require a clone of the full database, and since there is only one object, you also cannot search for anything meaningful (e.g. if you want to get all multicast subscriptions on a certain port, you will need to get the database object, then iterate over all entries and find them yourself).

AFAICT think to make this more usable, we would need to change its design, but that is not possible in a non API breaking way. At the same time, I'm not sure we could have two different implementations for the same netlink messages.

So yeah, my first idea would be to split up the database and make each {port,vlanid} an object, similar to how fdb has neighbors and not a single fdb object per bridge.

This would fix the inefficiency, and makes it possible to search the cache. But it would be a different API.

There is also the second complication that the mdb contains two types of objects; mdb entries/subscriptions and multicast routers, which have completely separate attributes.

Not quite sure how to bring them into one cache. Or can one parser update two caches? Maybe it needs to work to similar how different link types are handled.

KanjiMonster commented 3 months ago

I'm mostly looking for some guidance here what the preferred way forward is. Implementation wise I can take care of that myself.

thom311 commented 3 months ago

At the same time, I'm not sure we could have two different implementations for the same netlink messages.

I think we cannot. Partly because the registered OPS are in global variables and system wide. Optimally, we would have a NLRunContext and users would generate a libnl context, which then could be told to use "route/mdb" or "route/mdb2".

I am unfamiliar with MDB, otherwise I may not have merged an unfavorable design. I cannot guide you in a qualified way, sorry.

Changing/breaking API is quite a problem and best avoided. But it would be possible to do, if very good reasons are given.

If the inefficiency of certain operations is a problem, we could for example track a lookup dictionary internally, to make that efficient. That may not require API changes.

KanjiMonster commented 3 months ago

I am unfamiliar with MDB, otherwise I may not have merged an unfavorable design. I cannot guide you in a qualified way, sorry.

Yeah, was mostly looking for "administrative" guidance. What is allowed, what isn't.

Changing/breaking API is quite a problem and best avoided. But it would be possible to do, if very good reasons are given.

Like the "if you give me a very good reason" part.

And I agree on not breaking API when possible, though since we (at BISDN) submitted the code originally, and nobody else complained so far, I guess (hope) that we are the only users of it anyway.

If the inefficiency of certain operations is a problem, we could for example track a lookup dictionary internally, to make that efficient. That may not require API changes.

Searching/filtering subscribers is rather something I noticed won't work at all, but we currently have no need for that.

For us the main use case is knowing what changed on an update via the callback_v2(). So which subscribers were added, which went away (so we can push these changes to hardware).

And I'm not sure a lookup dictionary would help there.

The ugly solution I could come up here is to store the last update in the object and allow retrieving it. So have something like rtnl_mdb_get_changes() which then returns the added or removed entries.

But it still would mean we do a clone of the full database on every update notification, and could become slow with a growing number of subscribers (and obviously the rising amount of allocations and deallocations).