Core Network Buffer management rework

tbursztyka commented 6 years ago

Current buffer management is far from being optimized and requires a complete rework, moreover now that net_buf allows variable length allocator (or even custom one)

[x] Serialize 15.4/BT/6lo packet transmission i.e.: instead of fragmenting/compression/framing the whole packet at once, let's do it fragment after fragment (if transmission fails early, at least host cpu won't have lost time in preparing the whole packet). This will be necessary for the next task.
[x] Fully separate L2 header memory area from IP memory area No more LL reserve in net_pkt nor in IP core area. This will reduce the work load and memory usage.
[x] Switch net_pkt core to variable length allocator for the data buffers (net_pkt structures are still allocated on a slab, no nee to change that
[ ] (optional) Have a custom allocator that could manager allocating less than requested as long as it fits with a proper minimum (relevant only when sending) This would allow sending packet even if memory is heavily used.

the 2 first items are almost fully done. There is delays to port all the unit tests as it changes many things (how L2s behaves obviously).

3rd item had already a prototype, but has been delayed. (for various reasons, but major one being how to properly split it in relevant incremental patches)

tbursztyka commented 6 years ago

@jukkar , @rveerama1, @pfl , @laperie

pfalcon commented 6 years ago

Fully separate L2 header memory area from IP memory area No more LL reserve in net_pkt nor in IP core area.

How about the opposite - always reserve enough space in the packet data for all headers required to construct the packet, using worst case. If there're less headers, then only the headers can be "shifted" towards the data to form a contiguous block of memory. Doing so will ensure that such a packet, once all data added to it, will be possible to send immediately, instead of possibly being blocked (maybe even deadlocked) in allocation of fragment(s) for headers.

tbursztyka commented 6 years ago

That's what we currently do, and that won't fit with BT and 15.4 anymore (currently frame size == frag size, set at built time, but then all memory fragments have LL reserve which costs memory, plus other details)

Also, packets are not sent "immediately", they are always queued. It's better not to consume memory too early, when it could be used by rx or current packet being transmitted.

pfalcon commented 6 years ago

@tbursztyka :

How about the opposite - always reserve enough space in the packet data for all headers

That's what we currently do

Not entirely and not consistently. E.g., for TCP, we start with adding data fragments to packet, and only then insert TCP header in front of it: https://github.com/zephyrproject-rtos/zephyr/blob/master/subsys/net/ip/tcp.c#L446 . You may imagine what happens if, when we added data, we exhausted all fragment buffers: we'll wait, then fail. (And even that is an improvement: some time ago, we waited indefinitely in many such cases, and some time ago too, we were keen to add unlimited amount of data (re: https://github.com/zephyrproject-rtos/zephyr/pull/119)).

but then all memory fragments have LL reserve which costs memory, plus other details)

Sure, setting header reserves on the level of fragments isn't good idea (unless such are needed for very specific LLs), but the talk is about setting header reserves on the level of a packet.

Also, packets are not sent "immediately", they are always queued.

Sure, that's what I meant. After they are "queued", it should be possible to assume that they will be sent out, and e.g. they won't require additional memory allocations (which may fail).

It's better not to consume memory too early,

Depends on the level of reliability of the system you want to build. It's impossible to send IP packet without headers, so it makes sense to start allocating packet with reserving space for headers, then add as much data as possible/makes sense (which is fully valid way for TCP). Doing it the other way around means conflicting for memory (and entails deadlocks or performance issues).

when it could be used by rx or current packet being transmitted.

Well, Zephyr has separate pools for TX and RX packets/bufs, and that was the baseline requirement to avoid deadlocks. And per above "current packet being transmitted" is already in the queue, so should not require any more allocations (or there even worse cases for deadlocking/random delays).

tbursztyka commented 6 years ago

You are mixing up everything here. I am talking about memory allocation, you are talking of populating the allocated data (TCP). I am talking about L2 layer, you are talking about TCP/IP, etc...

Splitting L2 layer header allocation is going to save memory in 15.4 and BT which are meant to run on small SoC. Introducing the net_buf user data reference, it could also mean that L2s don't need to actually allocate anything from memory pool but make use of the stack etc... The risk of not being able to send the packet due to memory unavailability can be obliterated.

pfalcon commented 6 years ago

@tbursztyka :

You are mixing up everything here. I am talking about memory allocation, you are talking of populating the allocated data (TCP). I am talking about L2 layer, you are talking about TCP/IP, etc...

Yeah, I'm talking about how IP stack works in general, and that's pretty much mixed-up and complicated already, and share a concern that instead of pinpointing and resolving existing problems, we're adding only more complexity (variable-sized fragment buffers is a rather non-trivial complexity, as was argued yet in @jhedberg's original PR).

The risk of not being able to send the packet due to memory unavailability can be obliterated.

Sounds good, we'll see how it goes.

mike-scott commented 6 years ago

What's the current status on this?

As the subsystems try and move forward with some of the other churn, there seems to be some confusion as to whether we should be using net_buf more or .. less?

tbursztyka commented 6 years ago

@mike-scott net_buf is still going to be used, it's the low level descriptor of the buffers. [edit] in net core parts that is, on user API side it will be only socket API thus user allocated contiguous buffers.

jukkar commented 5 years ago

I think we have this PR already covered as the networking code has been converted to use new net_pkt APIs and ideas, so closing this.

zephyrproject-rtos / zephyr

Core Network Buffer management rework #7578