ARMmbed / core-util

DEPRECATED: Mbed 3 utilities library
Other
12 stars 17 forks source link

Array: Possible optimisations and code improvements #69

Open adbridge opened 8 years ago

adbridge commented 8 years ago

There are a number of possible improvements :

  1. Make the majority of the init parameters into template parameters
  2. @bremoran (to add more he has already found)
bogdanm commented 8 years ago

I don't see how 1 above would help or what it would fix. More details please?

adbridge commented 8 years ago

I'll defer to @bremoran on number 1 as it was his suggestion ...

ciarmcom commented 8 years ago

ARM Internal Ref: IOTSFW-2000

adbridge commented 8 years ago
  1. Add the ability for the array to shrink. Currently when the last element in a zone is deleted the zone remains in scope even though it is no longer being used.
bremoran commented 8 years ago
  1. I don't understand why it cannot be a construction argument. Doing otherwise leads to confusing access scenarios where an Array has not been initialized.

    Unless the size of initial_capacity, grow_capacity, alloc_traits, or alignment is to be determined at runtime, the signature of Array should be:

    template <typename T, size_t InitialCapacity, size_t GrowCapacity, UAllocTraits_t AllocTraits, size_t Alignment = MBED_UTIL_POOL_ALLOC_DEFAULT_ALIGN>
    class Array;

    Even if these parameters are expected to be determined at runtime, they could be placed in the constructor.

  2. The size of array_link can be reduced by one pointer just by changing the structure:

    struct array_link {
       array_link(void *_data, unsigned _first_idx, array_link *_prev):
           data((uint8_t*)_data),
           first_idx(_first_idx),
           prev(_prev) {
       }
       unsigned first_idx;
       array_link *prev;
       T data[0];
    };

    This also guarantees correct alignment of the data elements. It also simplifies the element accessor:

    T *get_element_address(unsigned idx) const {
       array_link *crt = _head;
    
       CORE_UTIL_ASSERT(idx < _elements);
       // first_idx in the first allocated array is 0, so the loop below is guaranteed to end
       while(idx < crt->first_idx) {
           crt = crt->prev;
       }
       CORE_UTIL_ASSERT(crt != NULL);
       return crt->data[idx - crt->first_idx];
    }
  3. Error checking consistency. check_access is only called from the at() API, but it is necessary for operator [](), push_back(), pop_back(), get_num_elements(), and get_capacity(). The dependency in these last two could be removed by initializing them in the constructor. All of the dependencies could be removed by replacing the init API with a constructor that initializes Array instead of requiring the call to init().
bremoran commented 8 years ago

The alignment calculation on line 83 is unnecessary.

Instead of:

_element_size = PoolAllocator::align_up(sizeof(T), alignment);

We could use:

_element_size = PoolAllocator::align_up(sizeof(T), __alignof__(T));

This would eliminate the need for the alignment argument. At the very least, the alignment argument should be replaced with:

bool init(size_t initial_capacity, size_t grow_capacity, UAllocTraits_t alloc_traits, unsigned alignment = __alignof__(T))

The current status would see every array with an element size of N*MBED_UTIL_POOL_ALLOC_DEFAULT_ALIGN+x and an alignment of 1 waste MBED_UTIL_POOL_ALLOC_DEFAULT_ALIGN-x bytes per element.

bogdanm commented 8 years ago
  1. As I've explained a number of times in the past, the problem is that init allocates memory, an operation that could fail. There's no way to signal that failure from the constructor, this is why this double step initialization (construct + init) was chosen. I still believe this holds, unless someone makes me aware of a better alternative. Of course, even before init, the initial capacity and the rest of the internal members should be initialized to 0, this is indeed a bug.
  2. Again, that's an implementation optimization and I don't see its actual advantages.
  3. The difference between operator [] and at is efficiency, this is why the comment on operator [] states that calling it with an invalid index results in undefined behavior. The idea was to mirror the behaviour of std::array:

http://en.cppreference.com/w/cpp/container/array/operator_at http://en.cppreference.com/w/cpp/container/array/at

That aside, I don't have anything against adding bound checks to operator []

bremoran commented 8 years ago
  1. You just need an API that checks _head. This makes array usable in statically initialized code. The quantity of code is actually smaller since you don't need to call a function to test the result of initialization:

    Array::operator bool() {return _head != nullptr;}
    // Now:
    Array a;
    int rc = a.init(...);
    if(rc) {
        // die
    }
    // Construction init
    Array a(...)
    if (!a) {
       // die
    }

    If you still think it's better to have the init call, that's fine, but it's completely possible to get an error status out of the object on allocation failure. What you don't get with construction-time allocation is the option to retry if allocation fails. Is that a common use-case for the initial construction of an Array? I would think that in statically allocated cases, it's a hard failure to fail init, but in dynamically allocated cases, the Array could be freed and allocated again at a later time, so init-retry might not be relevant.

  2. It saves 4 bytes per array-link, it's more readable, it guarantees alignment, and it's faster (one fewer pointer dereferences). You talk later about [] being faster than at(). Why is optimization okay in one spot but not another?
  3. Point taken on operator[]() but not the other APIs. Only operator[]() has an API note about bounds checking. push_back(), pop_back(), get_num_elements(), and get_capacity() still need either documentation or error checking.
bogdanm commented 8 years ago
  1. init-retry is probably not very relevant indeed. I can see how this would work. While I'm slightly inclined towards the init() version, I wouldn't mind changing this (and the "static initialization" argument is indeed compelling).
  2. The thing is that I'd still like people to be able to specify their weird non-standard alignment if they wanted to, and making T data[0] makes that harder. The context of [] vs at() was not optimization, but mirroring std::array, the optimization is a side effect of that.
  3. push_back already returns false in case of error. pop_back() fails silently if the array is empty, but shouldn't break (adding a bool return makes sense anyway). I don't see what kind of bounds checking I could do on get_num_elements and get_capacity.
bremoran commented 8 years ago

Do we have an actual use-case for non-standard alignment? I've seen required alignment != __alignof__(object) in image processing, but not in MCUs.

get_num_elements and get_capacity return garbage values if uninitialized.

bremoran commented 8 years ago

Even if you want non-standard alignment, we can still do:

    struct array_link {
        array_link(void *_data, unsigned _first_idx, array_link *_prev):
            data((uint8_t*)_data),
            first_idx(_first_idx),
            prev(_prev) {
        }

        unsigned first_idx;
        array_link *prev;
        uint8_t data[0];
    };

    T *get_element_address(unsigned idx) const {
        array_link *crt = _head;

        CORE_UTIL_ASSERT(idx < _elements);
        // first_idx in the first allocated array is 0, so the loop below is guaranteed to end
        while(idx < crt->first_idx) {
            crt = crt->prev;
        }
        CORE_UTIL_ASSERT(crt != NULL);
        return (T*)((uintptr_t)crt->data + _element_size * (idx - crt->first_idx));
    }

This gains the speed, size, and alignment benefits, but not the readability benefits. It's also slightly slower since _element_size cannot be inlined. (it could if it were a template parameter)

bogdanm commented 8 years ago

Do we have an actual use-case for non-standard alignment? I

I worked on this at about the same time when I looked at the K64F lwIP driver, and that required some unusual memory alignment for the K64F DMA (16 bytes and 8 bytes IIRC). This situation will probably repeat itself, but it might not have a lot to do with Array and its friends. I don't think I have anything against removing the alignment option completely for now and coming back to it in the future if needed.

get_num_elements and get_capacity return garbage values if uninitialized.

Ah sure, but this doesn't have anything to do with bounds checking, it's just that I completely forgot to initialize them to 0 in the constructor :sweat_smile:

bremoran commented 8 years ago

Okay, well, if we're all on the same page, then I think we should get some of these updates in.

bogdanm commented 8 years ago

Sure. This will require a new major of core-util and a few updates around that in various modules, so it's a non trivial amount of work. In fact, there are two things to do:

  1. fix the current implementation without changing the API (taking care mostly of missing initializations).
  2. implement the new API without init and without explicit alignment.

1 above is not hard to do and shouldn't require a new major, so we should definitely do it.

bremoran commented 8 years ago

There's no reason we can't add a new constructor, and deprecate the old one and init. The change to the structure of array_link will cause no external headaches.

bogdanm commented 8 years ago

I was thinking about completely removing alignment, which sounds like a new API.

bremoran commented 8 years ago

The new constructor doesn't need to accept alignment which would allow us to introduce the new API in a non-breaking way.

bogdanm commented 8 years ago

I get that, I was thinking about removing it completely to simplify things. But that will work too.

adbridge commented 8 years ago

Or it could still accept it but ignore it , along with a document change to state that? That would be an easy way to deprecate?

bogdanm commented 8 years ago

"accept and ignore" is worse than "don't accept" IMO. You're basically asking for trouble. With "don't accept", you get an error early (compile time) and you're done with it. With "accept and ignore" you basically get undefined behaviour.

bremoran commented 8 years ago

"accept and deprecate" :)

bremoran commented 8 years ago

@adbridge The solution is here: https://github.com/ARMmbed/compiler-polyfill/blob/master/compiler-polyfill/attributes.h#L42