PelionIoT / nanostack-libservice


Split nanostack heap to use 2 different books #67

Closed daniel-cesarini-tridonic-com closed 5 years ago

daniel-cesarini-tridonic-com commented 6 years ago

Split nanostack heap to use 2 different books

@karsev @MarceloSalazar @SeppoTakalo @kjbracey-arm

SeppoTakalo commented 6 years ago

What is the benefit of this split?

Internally the heap has always been used in two sections: one allocation block growing from bottom to top, the other growing from top to bottom.

If we split, it forces us to accurately calculate how much we would need in each block, which in the non-optimal case would lead to more unused memory in the "wrong" block.

kjbracey commented 6 years ago

I don't like the idea of forced separation - there are too many advantages to being able to share limited memory. Being forced to configure separate pools is a pain.

You could maybe add one layer of indirection, such that Nanostack can be given two book pointers to use for temporary and permanent allocations. These could either be the same book, or different books. That would suit both those who want them separated and those who don't.

Also, to improve single-book use, I would be in favour of some sort of optional limits for each end - e.g. "temporary allocations must leave a gap of X above the current highest permanent allocation", or "permanent allocations must leave at least X total space".
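A minimal sketch of that two-book indirection, routing the existing per-book allocators through two pointers (ns_dyn_mem_set_books is invented here for illustration, and the wrapper signatures are simplified; this is not current nsdynmemLIB API):

static ns_mem_book_t *temporary_book;
static ns_mem_book_t *permanent_book;

/* Hypothetical configuration hook: pass the same book twice to keep today's
 * shared-heap behaviour, or two different books for a hard split. */
void ns_dyn_mem_set_books(ns_mem_book_t *temporary, ns_mem_book_t *permanent)
{
    temporary_book = temporary;
    permanent_book = permanent;
}

/* The existing ns_dyn_mem_* wrappers would then route to the chosen book. */
void *ns_dyn_mem_temporary_alloc(ns_mem_block_size_t alloc_size)
{
    return ns_mem_temporary_alloc(temporary_book, alloc_size);
}

void *ns_dyn_mem_alloc(ns_mem_block_size_t alloc_size)
{
    return ns_mem_alloc(permanent_book, alloc_size);
}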

daniel-cesarini-tridonic-com commented 6 years ago

This PR is a proposal; I could try to rework it so that it can be configured at compile time whether to use single or dual books.

The idea behind it is to isolate the two parts so that allocation failures can be confined to the packet / temporary part of the heap, while making sure that the "non-temporary" part (I called it "static" until now; I am open to naming proposals) would ALWAYS have enough space, because, as stated by https://github.com/SeppoTakalo in https://github.com/ARMmbed/mbed-os/issues/5759#issuecomment-355028522, that part is not able to handle failures (from what I understood).

OR what other ideas / proposals do you have?

daniel-cesarini-tridonic-com commented 6 years ago

Additionally, during testing, I noticed that the "static"/"permanent" part of the heap is also doing frees and allocs during normal operation (at least in the case of errors). What are the consequences of this?

In general, I would not mix packet handling with stack functionality: a networked system should be able to withstand high traffic by dropping packets / slowing down / worsening QoS, but it should NOT break / lose functionality.

kjbracey commented 6 years ago

The allocations during normal operation should just be for routing/control type information. Route management is itself quite dynamic, so it is in theory possible to exhaust that permanent memory given a complex enough network environment (too many routes, too many prefixes, too many addresses)

The intent was always more that this dynamic control information can spill into the temporary area, saving precise configuration of "how many routes can I handle".

Having a requirement that the temporary side always leave a certain amount free seems like it should work quite well to preserve the basic idea that increasing control complexity gradually diminishes buffer RAM, without the buffer RAM being able to block expansion of the control area.

daniel-cesarini-tridonic-com commented 6 years ago

The allocations during normal operation should just be for routing/control type information. Route management is itself quite dynamic, so it is in theory possible to exhaust that permanent memory given a complex enough network environment (too many routes, too many prefixes, too many addresses)

Is there a way to size the memory needs for a given network? Given the maximum number of nodes, the maximum degree of connectivity (meshing), and other parameters affecting the network, it should be possible to provide an exact calculation of the needed space, shouldn't it?

Having a requirement that the temporary side always leave a certain amount free seems like it should work quite well to preserve the basic idea that increasing control complexity gradually diminishes buffer RAM, without the buffer RAM being able to block expansion of the control area.

Do you have a particular mechanism in mind for this?

kjbracey commented 6 years ago

Is there a way to size the memory needs for a given network? Given the maximum number of nodes, the maximum degree of connectivity (meshing), and other parameters affecting the network, it should be possible to provide an exact calculation of the needed space, shouldn't it?

It is hard to calculate, which is why we try to avoid doing it.

For non-border-router nodes, it shouldn't depend on the number of mesh nodes or links - it will be about the number of off-mesh routes, border routers and prefixes. "Mesh neighbour" and "mesh route" information is automatically clamped by cache limits. A device in a 2-node mesh will survive with less memory than one in a 10-device mesh, but being in a 2000-device mesh won't need more than the 10-device one does.

So I'd say you can only monitor by experimentation - whatever you measure for a given border router setup on a node with a pool of 6+ neighbours should remain basically constant for a given software version.

Adding more border routers or specific routes or prefixes or whatever other Thread-specific data there is will increase the configuration storage.

The one other area that occurs to me now is supporting reduced-function devices as children. I forget what the limits are there - a certain amount of RAM is required to buffer packets to them if sleepy.

I believe we should have some guidelines somewhere about how big the Nanostack heap should be configured for typical Thread device classes (@mikter ?), but we've tended to avoid pushing this that hard.

kjbracey commented 6 years ago

Do you have a particular mechanism in mind for this?

Basically just

void *ns_dyn_mem_temporary_alloc(size)
{
    /* Refuse temporary allocations once free space drops below the margin
     * reserved for permanent allocations (pseudocode field names). */
    if (book.total_current_free_space < book.bytes_to_keep_free_for_permanent) {
        return NULL;
    }
    /* ...otherwise continue with the normal allocation path. */
    ...
}

Not sure how easy that is without looking at the code.

kjbracey commented 6 years ago

we've tended to avoid pushing this that hard

Which basically explains the heap setup - we size the heap to have "plenty" of space for buffers with an expected network complexity. If the complexity increases, buffer space is reduced, but it would have to be an extraordinarily unusual network to leave us with no buffer space.

mikter commented 6 years ago

We don't know the amount of memory we are going to use beforehand, so we don't want to bind these sizes. The reason is, as Kevin says, that it is really hard to say how much memory we are going to use.

Thread has some conformance requirements and the amount of memory we have allocated for the stack does go through these tests.

At least, this change must not be the default, but must be done as an extension to the heap. I don't think this is a better solution for default operation; it can only be used in very special cases where you know exactly where you are deploying your devices and have the resources to tune the values.

The extension that Kevin suggested, that there is extra headroom reserved for static data, seems good to me and can easily be added as an extension.

On the issue of deallocating static blocks during operation: those blocks have a lifetime of tens of seconds to years depending on network conditions, so we cannot allocate them from the temporary pool. This might lead to fragmentation over time.

daniel-cesarini-tridonic-com commented 6 years ago

Basically just

void *ns_dyn_mem_temporary_alloc(size)
{
    if (book.total_current_free_space < book.bytes_to_keep_free_for_permanent) {
        return NULL;
    }

Not sure how easy that is without looking at the code.

One question: where exactly is now the latest code for nanostack-libservice? Is it in this very repo, or in mbed-os, or in mbed-client-cloud, or somewhere else?

If I refer to https://github.com/ARMmbed/nanostack-libservice/blob/master/source/nsdynmemLIB/nsdynmemLIB.c (in this repo), then the code for ns_mem_temporary_alloc is like this:

void *ns_mem_temporary_alloc(ns_mem_book_t *heap, ns_mem_block_size_t alloc_size)
{
    return ns_mem_internal_alloc(heap, alloc_size, 1);
}

and the heap parameter (that is, the book) contains the following fields:

struct ns_mem_book {
    ns_mem_word_size_t     *heap_main;
    ns_mem_word_size_t     *heap_main_end;
    mem_stat_t *mem_stat_info_ptr;
    void (*heap_failure_callback)(heap_fail_t);
    NS_LIST_HEAD(hole_t, link) holes_list;
    ns_mem_heap_size_t heap_size;
};

So, how to implement the check for "leave enough free space" is not so straightforward. At first I also tried to go along that line, implementing it inside ns_mem_internal_alloc. Do we need to consider the hole list? Or should we "simply" use the position of the cur_hole pointer to decide whether to return or not, basically limiting the position of entries for "temporary allocations"?

Thank you for the help!

daniel-cesarini-tridonic-com commented 6 years ago

Referring to the other topic that arose in here

Is there a way to size the memory needs for a given network? Given the maximum number of nodes, the maximum degree of connectivity (meshing), and other parameters affecting the network, it should be possible to provide an exact calculation of the needed space, shouldn't it?

It is hard to calculate, which is why we try to avoid doing it.

I do not share the view that "because it is difficult we skip doing it". There should be some way of correctly sizing memory areas; otherwise, to design such networked systems, we either need to completely overestimate the needs or just choose some "seemingly good values" and then "plug-and-pray".

Furthermore:

So I'd say you can only monitor by experimentation - whatever you measure for a given border router setup on a node with a pool of 6+ neighbours should remain basically constant for a given software version.

Is it really possible that there is no exact, offline method to size the needed heap? Again, this strengthens my view of the necessity of splitting the memory (at least from the point of view of "reserving" a part), in order to isolate components!

kjbracey commented 6 years ago

One question: where exactly is now the latest code for nanostack-libservice?

This repo is the master copy. Changes here will be git subtree pulled into the mbed OS repo.

So, how to implement the check for "leave enough free space" is not so straightforward.

mem_stat_info_ptr has the stats. I believe, heap_sector_size - heap_sector_allocated_bytes should be the free space.
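For reference, those stats live in the mem_stat_t structure from nsdynmemLIB.h in this repo (reproduced here from memory, so check the header for the exact definition):

typedef struct mem_stat_t {
    ns_mem_heap_size_t heap_sector_size;                /* Total heap sector length in bytes */
    ns_mem_heap_size_t heap_sector_alloc_cnt;           /* Currently reserved block count */
    ns_mem_heap_size_t heap_sector_allocated_bytes;     /* Currently reserved bytes */
    ns_mem_heap_size_t heap_sector_allocated_bytes_max; /* High-water mark of reserved bytes */
    uint32_t heap_alloc_total_bytes;                    /* Cumulative allocated bytes */
    uint32_t heap_alloc_fail_cnt;                       /* Count of failed allocations */
} mem_stat_t;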

daniel-cesarini-tridonic-com commented 6 years ago

@kjbracey-arm: how would we deal with fragmentation when doing such a check on:

mem_stat_info_ptr has the stats. I believe, heap_sector_size - heap_sector_allocated_bytes should be the free space.

daniel-cesarini-tridonic-com commented 6 years ago

Thank you for the info about the repo. Another OFF-TOPIC question comes to my mind about the repos then: why do you use git subtree pull instead of the .lib mechanism of mbed?

daniel-cesarini-tridonic-com commented 6 years ago

So you would agree with such an implementation:

#define RESERVED_PERMANENT_ALLOC_SPACE 5000

void *ns_mem_temporary_alloc(ns_mem_book_t *heap, ns_mem_block_size_t alloc_size)
{
    /* Assumes heap statistics are enabled (mem_stat_info_ptr is set). */
    ns_mem_heap_size_t free_bytes = heap->mem_stat_info_ptr->heap_sector_size -
                                    heap->mem_stat_info_ptr->heap_sector_allocated_bytes;

    /* Refuse temporary allocations once free space drops below the margin
     * reserved for permanent allocations. */
    if (free_bytes < RESERVED_PERMANENT_ALLOC_SPACE) {
        return NULL;
    }
    return ns_mem_internal_alloc(heap, alloc_size, 1);
}

kjbracey commented 6 years ago

That sort of implementation - I'd say the reserved permanent free space should probably default to a proportion of the total size. Maybe it should be configurable.

The amount probably doesn't need to be that high - the temporary space will "back away" as permanent allocations increase. The margin only needs to be enough to cover the expected all-at-once permanent allocation requirement.

My initial guess is something like 1K per 16K of heap.
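As a rough sketch of that default, assuming the margin is derived from the book's total size at init time (the helper name is illustrative, not existing nsdynmemLIB API):

/* Default headroom reserved for permanent allocations: roughly 1 kB per
 * 16 kB of heap, i.e. 1/16 of the book's total size, unless the application
 * overrides it explicitly. */
static ns_mem_heap_size_t default_permanent_reserve(const ns_mem_book_t *heap)
{
    return heap->heap_size / 16;
}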

daniel-cesarini-tridonic-com commented 6 years ago

Ok, of course that margin should be configurable.

What effects do you expect from fragmentation on such a mechanism?

kjbracey commented 6 years ago

I think fragmentation will continue to cause its own problems.

This may slightly reduce fragmentation issues by discouraging the temporary allocations from filling holes in the permanent end. (Not that there should be many holes there if the temporary/permanent split is doing its intended job.)

I guess a particularly bad temporary fragmentation might lead us to collide with the permanent end before hitting this space threshold, partially defeating the mechanism, with the permanent end taking holes among the temporaries.

That's not ideal. Conceivably you could have a second rule about keeping the "top" of the temporary end clear from the "top" of the permanent end. Not sure if it's worth the effort - I'd want to try this in isolation first.
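For illustration, that second rule might look roughly like the following, assuming the book also tracked the extent of each end (the watermark fields and the gap constant are hypothetical; they do not exist in nsdynmemLIB today):

#define MINIMUM_END_GAP 512  /* illustrative value */

/* Hypothetical check: keep a positional gap between the highest temporary
 * block (growing up from the bottom of the book) and the lowest permanent
 * block (growing down from the top), independent of total free bytes. */
static bool temporary_end_gap_ok(const ns_mem_book_t *heap, ns_mem_block_size_t alloc_size)
{
    const uint8_t *temporary_top = (const uint8_t *)heap->highest_temporary_block; /* hypothetical field */
    const uint8_t *permanent_low = (const uint8_t *)heap->lowest_permanent_block;  /* hypothetical field */

    return temporary_top + alloc_size + MINIMUM_END_GAP <= permanent_low;
}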

daniel-cesarini-tridonic-com commented 6 years ago

That's not ideal. Conceivably you could have a second rule about keeping the "top" of the temporary end clear from the "top" of the permanent end. Not sure if it's worth the effort - I'd want to try this in isolation first.

What do you mean by "I'd want to try this in isolation first"?

kjbracey commented 6 years ago

I mean test a patch that just does the free space limit before trying anything more elaborate.

daniel-cesarini-tridonic-com commented 6 years ago

We found another point of control for allocations (for ingress packets in the MAC layer) and created a corresponding PR in https://github.com/ARMmbed/sal-stack-nanostack-private/pull/1716

artokin commented 6 years ago

PR https://github.com/ARMmbed/nanostack-libservice/pull/70 introduces a new API that can be used to adjust the free memory amount required during temporary memory allocation.

MarceloSalazar commented 5 years ago

As agreed, we're closing this PR - Arm won't implement this.