gBufCtx.b.aabBuffer[buf[][] allocated by aligned malloc()

bardeenlai18 commented 5 years ago

Hi, it is possible to make gBufCtx.b.aabBuffer[buf[REDCONF_BUFFER_COUNT] much more dynamic by run-time aligned malloc()?

danielrlewis commented 5 years ago

Hello, bardeenlai18, I'm one of the Reliance Edge developers and I'll answer your question.

Allocating the buffers with malloc() would conflict with our design goals.

Reliance Edge was designed to conform to the MISRA C:2012 guidelines. MISRA C:2012 Directive 4.12 states: "Dynamic memory allocation shall not be used." Furthermore, Rule 21.3 states: "The memory allocation and deallocation functions of shall not be used." Thus, Reliance Edge statically allocates all of its memory, including these buffers.

In addition, there are other advantages to static memory allocation. It eliminates run-time errors due to memory allocation failures. It is also more portable, since it allows Reliance Edge to be used with a freestanding implementation of C: implementations which lack C library routines like malloc().

If I may ask, what are you trying to accomplish by allocating the buffers with malloc()? You said you wanted to "make [the buffers] much more dynamic", but for what purpose? Can you elaborate on your use case and why it would benefit from dynamic allocation?

bardeenlai18 commented 5 years ago

Hi Daniel, Actually we are using STM32F7 series MCU for our application. We are facing the D-Cache issue for SD card Block Device DMA transfer. It would be an issue if the block buffers are statically allocated in the region of DTCM. However, it would be an issue for DMA D-cache maintenance if these block buffers are no matter allocated statically in SRAM or malloc() not aligned 32-bytes.

I have to modify buffer.c in order to cope with this issue:

/** Buffer heads, storing metadata for each buffer.
*/
BUFFERHEAD  aHead[REDCONF_BUFFER_COUNT];

/** Array of memory for the block buffers themselves.

    Force 64-bit alignment of the aabBuffer array to ensure that it is safe
    to cast buffer pointers to node structure pointers.
*/
#ifdef __linux__
ALIGNED_2D_BYTE_ARRAY(b, aabBuffer, REDCONF_BUFFER_COUNT, REDCONF_BLOCK_SIZE);
#else
union {
        uint8_t *aabBuffer[REDCONF_BUFFER_COUNT];
} b;
//uint8_t     *aabBuffer[REDCONF_BUFFER_COUNT];
#endif
} BUFFERCTX;

Also, I have to insert and modify the code before RedBufferInit():

// allocate a big pool!
int init_RedBuffer(int mf)
{
static uint8_t *pool;
uint8_t bIdx;

// free buffer
if (mf == 0) {
    vfs_free(pool);
    return 0;
}

// make 32-bytes aligned buffer
pool = vfs_malloc(REDCONF_BUFFER_COUNT * REDCONF_BLOCK_SIZE + 32);
if (pool == NULL) {
    return -RED_ENOMEM;
}
RedMemSet(pool, 0, REDCONF_BUFFER_COUNT * REDCONF_BLOCK_SIZE + 32);

// split the big pool for the block buffer
uint32_t addr = ((uint32_t)pool + 32) & (~0x1F);
for (bIdx = 0U; bIdx < REDCONF_BUFFER_COUNT; bIdx++) {
    gBufCtx.b.aabBuffer[bIdx] = (uint8_t *)addr;
    addr = addr + REDCONF_BLOCK_SIZE;
}
return 0;
}

I believe redfse.h version can handle some MCU with limited resource but for standard version redfs.h, you could consider to make it more dynamic.

Also, sometimes we don't know how many files should be opened for operation at run-time. The only solution we have to do is to pre-define maximum file count. However, it consumes RAM if set too many. Besides, it will consume much more memory if Block Size is set to 4096 (even higher) for performance concerns.

bardeenlai18 commented 5 years ago

sorry my mistake to "Close and comment"

danielrlewis commented 5 years ago

Actually we are using STM32F7 series MCU for our application. We are facing the D-Cache issue for SD card Block Device DMA transfer. It would be an issue if the block buffers are statically allocated in the region of DTCM. However, it would be an issue for DMA D-cache maintenance if these block buffers are no matter allocated statically in SRAM or malloc() not aligned 32-bytes.

So, if I understand correctly, for DMA purposes the block buffers need to: a) be aligned on a 32-byte address; and b) reside in a particular memory region.

It would be possible to modify the code to provide 32-byte buffer alignment while sticking with static allocation. That would involve statically allocating a slightly larger pool and, at run-time, finding the first 32-byte aligned address within that pool to be used as the base address of the block buffers.

However, that wouldn't resolve the memory region issue. One of my colleagues has recently been working with a STM32F7 board, and is familiar with the memory region restrictions of the DMA controller(s). Perhaps there is a method to force certain static memory allocations to be located in a particular memory region. Unfortunately, that colleague is on vacation this week. When he gets back, I'll ask him to look at this issue.

I have to modify buffer.c in order to cope with this issue: [snip]

Also, I have to insert and modify the code before RedBufferInit(): [snip]

Those changes look correct for what you're trying to do. It's fine to use a version of Reliance Edge which has been modified to meet your needs; other users have done so before. Just remember that if you distribute the project, you need to comply with the GPLv2 license (or buy a commercial license from Datalight).

I believe redfse.h version can handle some MCU with limited resource but for standard version redfs.h, you could consider to make it more dynamic.

I agree that a more dynamic version of Reliance Edge would have utility in environments with more resources. Nonetheless, some of our users (including those using the POSIX-like API) like that we statically allocate all the memory and wouldn't be happy if we changed that. Thus, any introduction of dynamic allocation would need to be optional (e.g., conditioned on configuration macros or implemented on a branch). Also, aside from the block buffers, Reliance Edge uses static allocation for other things (like volume structures and handles); changing all of that to use dynamic memory allocation would be no small effort.

Also, sometimes we don't know how many files should be opened for operation at run-time. The only solution we have to do is to pre-define maximum file count. However, it consumes RAM if set too many. Besides, it will consume much more memory if Block Size is set to 4096 (even higher) for performance concerns.

Users are intended to configure the handle count (and inode count) to a pre-defined maximum, so that's nothing unusual.

The number of files and directories that can be opened is limited by REDCONF_HANDLE_COUNT, which is independent of the buffer count. Each handle does require memory (the REDHANDLE structure in posix/posix.c), but with the right configuration options, the handles can be quite small. If REDCONF_API_POSIX_READDIR is zero, each handle is only 16 bytes; with readdir enabled, it depends on the maximum file name length.

The buffer count (REDCONF_BUFFER_COUNT) does not necessarily need to be increased when the handle count is increased. It is possible, for example, to have 256 open files with 12 buffers. Increasing the buffer count might improve performance with many files, but it isn't required.

The number of files and directories that can be created is limited by the inode count. Increasing the inode count does not use any additional RAM.

FYI, starting tomorrow (August 8th) I will be on vacation for several days. Much of the rest of the team is on vacation this week, so we might not respond to this issue again until next week. So even if we go quiet for a few days, that doesn't mean we are ignoring you. :)

bardeenlai18 commented 5 years ago

Hi Daniel, thank you so much for your explanation. Actually we are working with your sales team Bill to get your commercial license! Your company deserved our support to maintain the beauty of the Open Source world !

tuxera / reliance-edge

gBufCtx.b.aabBuffer[buf[][] allocated by aligned malloc() #12