kcat / openal-soft

OpenAL Soft is a software implementation of the OpenAL 3D audio API.

Memory override #79

Open septag opened 8 years ago

septag commented 8 years ago

Hi, I'm planning to override the memory functions. What naming convention and API placement would you suggest, in case you might merge it into your source? Should I add an extension and place the API in alext.h, something like alSetMallocCallbacksSOFT?

kcat commented 8 years ago

I'm honestly not sure. The main problem with memory function overrides is that the needed functionality may change over time. For instance, originally it only needed standard malloc (and calloc), realloc, and free. Some spots still use those, but many have been replaced by the internal al_malloc (and al_calloc) and al_free calls, which use C11's aligned_alloc and free when possible, or other system-dependent functions as a fallback, and which require different function signatures and behavior.
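As a rough illustration of the aligned-allocation path described above, a wrapper over C11's aligned_alloc might look like the following. This is only a sketch, not OpenAL Soft's actual al_malloc; every `sketch_` name is hypothetical, and it assumes the platform provides aligned_alloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT the library's al_malloc.
 * alignment must be a power of two supported by the implementation. */
static void *sketch_aligned_malloc(size_t alignment, size_t size)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* C11 originally required size to be a multiple of alignment,
     * so round the request up to be safe. */
    size_t rounded = (size + alignment - 1) & ~(alignment - 1);
    if (rounded < size)
        return NULL; /* the rounding overflowed */

    return aligned_alloc(alignment, rounded);
}

/* Zero-initialized variant, standing in for a calloc-style call. */
static void *sketch_aligned_calloc(size_t alignment, size_t size)
{
    void *ptr = sketch_aligned_malloc(alignment, size);
    if (ptr != NULL)
        memset(ptr, 0, size);
    return ptr;
}

/* Memory obtained from aligned_alloc is released with plain free. */
static void sketch_aligned_free(void *ptr)
{
    free(ptr);
}
```

Note that the signature already differs from malloc (it takes an alignment), which is the kind of change a fixed set of user callbacks would have to anticipate.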

Eventually, I intend to also utilize locking functions like mlock and munlock, which need careful memory management, and I'm not sure exactly how I'm going to go about it yet. I'm actually giving thought to using preallocated chunks of memory that can be safely locked and doing my own management for it, which could change the requirements again for the actual allocation functions.

septag commented 8 years ago

For aligned_alloc, calloc, and memory functions with different signatures, you can pass them all through generic malloc, realloc, and free callbacks and write your own memory alignment function (you can ditch system functions like aligned_alloc/posix_memalign/etc.). You can also define some macros to make them look nicer:

#define AL_ALLOC(_allocator, _size)  al_alloc(_allocator, _size, 0, __FILE__, __LINE__)
#define AL_CALLOC(_allocator, _size)  al_calloc(_allocator, _size, 0, __FILE__, __LINE__)
#define AL_REALLOC(_allocator, _ptr, _size) al_realloc(_allocator, _ptr, _size, 0, __FILE__, __LINE__)
#define AL_FREE(_allocator, _ptr)  al_free(_allocator, _ptr, 0, __FILE__, __LINE__)
#define AL_ALIGNED_ALLOC(_allocator, _size, _align) al_alloc(_allocator, _size, _align, __FILE__, __LINE__)
#define AL_ALIGNED_CALLOC(_allocator, _size, _align) al_calloc(_allocator, _size, _align, __FILE__, __LINE__)
#define AL_ALIGNED_REALLOC(_allocator, _ptr, _size, _align) al_realloc(_allocator, _ptr, _size, _align, __FILE__, __LINE__)
#define AL_ALIGNED_FREE(_allocator, _ptr, _align) al_free(_allocator, _ptr, _align, __FILE__, __LINE__)

Eventually these can all be directed through generic malloc/free callbacks. You can even have just one callback instead of separate malloc/free/realloc:

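void* my_realloc(void* _ptr, size_t _size, size_t _align, const char* _file, uint32_t _line)
{
    (void)_align; (void)_file; (void)_line; /* unused in this minimal version */
    if (_size == 0) {
        free(_ptr);              /* a size of 0 means "free" */
        return NULL;
    }
    if (_ptr == NULL)
        return malloc(_size);    /* no existing block: plain allocation */
    return realloc(_ptr, _size); /* otherwise resize the existing block */
}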
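To make the routing concrete, here is a minimal sketch of how the macros could forward to a single user-supplied callback. All names (`sketch_allocator`, `sketch_alloc`, `SKETCH_ALLOC`, and so on) are made up for illustration and are not part of OpenAL Soft; the extra `user` pointer is also an assumption, added so a callback can carry its own state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical callback type: one entry point covers malloc, realloc, and free. */
typedef void *(*sketch_realloc_fn)(void *ptr, size_t size, size_t align,
                                   const char *file, uint32_t line, void *user);

typedef struct sketch_allocator {
    sketch_realloc_fn realloc_cb;
    void *user; /* opaque state handed back to the callback */
} sketch_allocator;

static void *sketch_alloc(const sketch_allocator *a, size_t size, size_t align,
                          const char *file, uint32_t line)
{
    assert(a != NULL && a->realloc_cb != NULL);
    return a->realloc_cb(NULL, size, align, file, line, a->user);
}

static void sketch_free(const sketch_allocator *a, void *ptr, size_t align,
                        const char *file, uint32_t line)
{
    assert(a != NULL && a->realloc_cb != NULL);
    if (ptr != NULL)
        a->realloc_cb(ptr, 0, align, file, line, a->user); /* size 0 frees */
}

#define SKETCH_ALLOC(a, size) sketch_alloc((a), (size), 0, __FILE__, (uint32_t)__LINE__)
#define SKETCH_FREE(a, ptr)   sketch_free((a), (ptr), 0, __FILE__, (uint32_t)__LINE__)

/* Example callback backed by the C runtime. If user is non-NULL it is
 * treated as a size_t counter of live allocations. */
static void *sketch_default_cb(void *ptr, size_t size, size_t align,
                               const char *file, uint32_t line, void *user)
{
    size_t *live = user;
    (void)align; (void)file; (void)line;

    if (size == 0) {
        if (ptr != NULL && live != NULL)
            --*live;
        free(ptr);
        return NULL;
    }
    if (ptr == NULL) {
        void *fresh = malloc(size);
        if (fresh != NULL && live != NULL)
            ++*live;
        return fresh;
    }
    return realloc(ptr, size);
}
```

A caller would fill in a `sketch_allocator` with its own callback once and pass it wherever the library allocates.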

For some objects like alSources and alBuffers, pools and some other custom memory allocators are definitely helpful for performance and cache locality.

Another suggestion is that every object keeps a structure of malloc/free callbacks for itself, so the caller could, for example, set separate allocators on the context for initialization, for buffers, and for other things. When we call the alGen* functions, the library uses that specific allocator and saves it in the object for the later delete. I know this is a bit unconventional, but it gives the programs/engines that use AL great flexibility in terms of memory management, since they can override the memory management of every part of the library. For example, some engines can preallocate big chunks of memory for sound buffers, use extremely fast linear allocators, and set them on AL for the level they are loading; after the level is finished, they can free everything together in one call. That's what I'd really like to do when using the AL library. Something like this:

alOverrideMemorySOFT(AL_MEMORY_BUFFERS, &myFastLinearAllocator);
alOverrideMemorySOFT(AL_MEMORY_SOURCES, &myPoolAllocator);
// Load a bunch of sounds
alGenBuffers(...);
...
// Unload
alDeleteBuffers(...);
alDeleteSources(..);
myFastLinearAllocator.reset();
myPoolAllocator.reset();

If you don't mind, I can give it a try and make all allocations/deallocations inside the library overridable.
Then I can use fast indexed pools, which only swap indices on free/alloc, for some objects like alSource.
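For reference, the kind of linear allocator meant here could be sketched as below. The `sketch_linear_*` names are hypothetical, and the sketch assumes the caller supplies backing storage that is itself aligned at least as strictly as any requested alignment, since offsets are aligned relative to the start of that storage.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sketch_linear_allocator {
    unsigned char *base; /* caller-provided backing storage */
    size_t capacity;     /* total bytes available */
    size_t offset;       /* bytes handed out so far */
} sketch_linear_allocator;

static void sketch_linear_init(sketch_linear_allocator *a, void *storage, size_t capacity)
{
    a->base = storage;
    a->capacity = capacity;
    a->offset = 0;
}

/* Bump allocation: align must be a power of two. Returns NULL when full. */
static void *sketch_linear_alloc(sketch_linear_allocator *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    size_t start = (a->offset + align - 1) & ~(align - 1);
    if (start < a->offset || start > a->capacity || size > a->capacity - start)
        return NULL;

    a->offset = start + size;
    return a->base + start;
}

/* Releases every allocation at once, e.g. when a level is unloaded. */
static void sketch_linear_reset(sketch_linear_allocator *a)
{
    a->offset = 0;
}
```

Individual frees are no-ops with this design, which is why the library would need to remember which allocator each object came from.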

kcat commented 8 years ago

> For aligned_alloc, calloc, and memory functions with different signatures you can pass them all through generic malloc, realloc and free callbacks, and write your own memory alignment function (you can ditch system functions like aligned_alloc/posix_memalign/etc.).

That's not actually portable. C only stipulates that (int)NULL == 0, and doesn't guarantee that a pointer-to-int conversion maintains address alignment. That is, (((intptr_t)some_ptr)&0xf) == 0 doesn't guarantee some_ptr is a 16-byte aligned address. OpenAL Soft only falls back to that assumption when there's no other option.

How many systems actually behave that way, I don't know, but that's part of why C11 added aligned_alloc instead of just suggesting to offset the pointer when stronger alignment is needed.

I'll need to look and ask around to see if there are any suggestions on how to allow user-specified allocators in a way that can remain reasonably flexible for future needs, or at least handle it in a way that poses the least risk of breaking apps when something new is needed.

septag commented 8 years ago

Didn't know about that, but how about using unions? I use this code for pointer alignment and haven't had any portability issues:

    #define ALIGN_MASK(_value, _mask) ( ( (_value)+(_mask) ) & ( (~0)&(~(_mask) ) ) )
    inline void* alignPtr(void* _ptr, size_t _extra, size_t _align)
    {
        union { void* ptr; size_t addr; } un;
        un.ptr = _ptr;
        size_t unaligned = un.addr + _extra; // space for header
        size_t mask = _align-1;
        size_t aligned = ALIGN_MASK(unaligned, mask);
        un.addr = aligned;
        return un.ptr;
    }

_extra is the size of the header where I store the alignment offset value; it is 1 byte (sizeof(uint8_t)):

    size_t total = _size + _alignment;
    uint8_t* ptr = (uint8_t*)alloc(_allocator, total, 0, _file, _line);
    uint8_t* aligned = (uint8_t*)alignPtr(ptr, sizeof(uint8_t), _alignment);            
    uint8_t* header = aligned - 1;
    *header = uint8_t(aligned - ptr);
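For completeness, a self-contained sketch of the same over-allocate-and-store-offset scheme, including the matching free, might look like this. The `sketch_header_*` names are hypothetical, and it uses uintptr_t instead of a union; either way it assumes that integer arithmetic on the converted pointer preserves address alignment, which holds on common platforms but is not promised by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocates size bytes aligned to align (a power of two, at most 128 so
 * the offset fits in the one-byte header stored just before the block). */
static void *sketch_header_aligned_alloc(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0 && align <= 128);
    if (size > SIZE_MAX - align)
        return NULL;

    uint8_t *raw = malloc(size + align);
    if (raw == NULL)
        return NULL;

    /* Skip one byte for the header, then round up to the alignment.
     * ASSUMPTION: the integer value reflects the address bits. */
    uintptr_t addr = (uintptr_t)raw + 1;
    uintptr_t mask = (uintptr_t)align - 1;
    uintptr_t aligned_addr = (addr + mask) & ~mask;

    uint8_t *aligned = raw + (aligned_addr - (uintptr_t)raw);
    aligned[-1] = (uint8_t)(aligned - raw); /* offset is in [1, align] */
    return aligned;
}

/* Reads the header byte to recover the pointer malloc returned. */
static void sketch_header_aligned_free(void *ptr)
{
    if (ptr == NULL)
        return;
    uint8_t *aligned = ptr;
    free(aligned - aligned[-1]);
}
```

Because the offset is between 1 and align, allocating size + align bytes always leaves room for both the header and the payload.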

kcat commented 8 years ago

I don't think that makes it any more portable. It'll work for most systems, like x86, x86-64, and ARM, but since it's not guaranteed by the standard or compiler, there's no saying where it won't align properly. Depending on your target it may be good enough, but OpenAL Soft tries to be as portable as reasonably possible and tries to keep backwards compatibility.