ChunkHeader should expose the size of the entire user payload section, including padding

gpalmer-latai commented 8 months ago

Brief feature description

When requesting custom alignment, it follows naturally that one would want to know the size of the allocated memory, including the padding. That way one could for example do a memory-aligned write to disk.

Detailed information

The layout of a chunk with a large user payload alignment is described here.

 sizeof(ChunkHeader)             back-offset   userPayloadSize
|------------------>|                  |<---|------------------->|
|                   |                  |    |                    |
+===================+=======================+====================+============+
|   Chunk-Header    |                  ¦    |    User-Payload    |  Padding   |
+===================+=======================+====================+============+
|                                           |                                 |
|          userPayloadOffset                |                                 |
|------------------------------------------>|                                 |
|                                                                             |
|                                 chunkSize                                   |
|---------------------------------------------------------------------------->|

The ChunkHeader API exposes:

A way to get the chunk header from the user payload
The total size of the chunk
The payload size

It does not expose:

The offset from the chunk header to the user payload (this is private)
The size of the payload section including padding.

In order to calculate this, one has to do:

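  // Recover the chunk header from the user payload pointer.
  auto* chunk_header = iox::mepoo::ChunkHeader::fromUserPayload(payload);
  // Byte distance from the start of the chunk header to the start of the user payload.
  auto payload_offset = static_cast<uint64_t>(reinterpret_cast<uint8_t*>(payload) - reinterpret_cast<uint8_t*>(chunk_header));
  // Everything from the payload start to the end of the chunk: payload plus trailing padding.
  uint64_t total_payload_size_including_padding = chunk_header->chunkSize() - payload_offset;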

which is unfortunate since we are recalculating a value which the chunk header already knows but does not wish to reveal, for some reason.

We can also calculate this value as payload_size + (payload_alignment - (payload_size % payload_alignment)). This works assuming Iceoryx has done the alignment properly. It still would be preferable to get that value directly from the chunk header to not risk miscalculating it on the application side.
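
One caveat on the formula above: when payload_size is already a multiple of payload_alignment, the modulo term is 0 and the formula adds a full extra payload_alignment. Below is a minimal sketch of the arithmetic with a round-up that handles that case. The helper names are hypothetical and not part of the iceoryx API, and the sketch only illustrates the calculation, not what iceoryx guarantees about the padding.

```cpp
#include <cstdint>

// The formula as written above (hypothetical helper name).
// Adds a full extra alignment when size is already a multiple of alignment.
constexpr std::uint64_t paddedSizeAsWritten(std::uint64_t size, std::uint64_t alignment)
{
    return size + (alignment - (size % alignment));
}

// Rounds size up to the next multiple of alignment and returns size unchanged
// when it is already a multiple. Assumes alignment > 0 and that
// size + alignment - 1 does not overflow.
constexpr std::uint64_t roundUpToMultiple(std::uint64_t size, std::uint64_t alignment)
{
    return ((size + alignment - 1) / alignment) * alignment;
}

// Both agree when size is not a multiple of alignment ...
static_assert(paddedSizeAsWritten(8000, 4096) == 8192, "8000 rounds up to 8192");
static_assert(roundUpToMultiple(8000, 4096) == 8192, "8000 rounds up to 8192");
// ... but differ when it already is one.
static_assert(paddedSizeAsWritten(8192, 4096) == 12288, "one alignment too many");
static_assert(roundUpToMultiple(8192, 4096) == 8192, "already a multiple");
```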

elBoberido commented 8 months ago

What would be the benefit of having padding included? The padding bytes won't be initialized so accessing them is UB. What would be the use case of knowing this value?

gpalmer-latai commented 8 months ago

So we can for example perform direct writes: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/global_file_system/s1-manage-direct-io

We don't care about the data in the padding but the point of custom alignments is to be able to work with the data in sizes that are a multiple of that alignment. So we need to be able to access that "size-with-alignment-padding".

elBoberido commented 8 months ago

Maybe I misunderstand something but doesn't that just require the payload itself to be a multiple of the alignment?

Note, there are no guarantees regarding the padding. It can be 0 or 1 GB. It is not even guaranteed to be the same on subsequent calls to loan. If you have this requirement, then you need to request a payload size which is a multiple of the alignment.

gpalmer-latai commented 8 months ago

The payload needs to be aligned, but does not need to have a size itself which is a multiple of the alignment. We do however need to provide a size to the API that is a multiple of the alignment, even if the size of the payload itself is smaller.

My understanding from chunk_header.md is that padding will be applied after the user payload to fulfill the alignment requirements. For example, if you have an alignment of 4096 and your user payload is 8000 bytes, then 192 bytes of padding will be added to make the total size with padding 8192 = 4096*2. This is the size I need exposed.

Right now I am recalculating the size by doing pointer arithmetic on the payload pointer relative to the chunk header, and then subtracting that offset from the chunk size. I've even added an IOX_ENSURES to verify that the calculated size is indeed a multiple of the alignment.

This is a whole lot of wasted arithmetic on every publisher loan. I'd rather just access the value directly as it has already been more or less calculated when creating the chunk header.

elBoberido commented 8 months ago

No, that's not the case. The padding is just the remaining bytes in the chunk; the payload is not enlarged to a multiple of the alignment, and this is not enforced either. If you need memory whose size is a multiple of the alignment, you have to request it via the payload size. If you don't, the best case is that the padding happens to be large enough to hold the remaining bytes within the chunk. A less harmful case, as long as you don't write to the memory, is that you read data from the next chunk. The worst case is that you get the last chunk of the last mempool and access memory that is not mapped into the address space, which results in a segfault.
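
The advice above, requesting a payload size that is already a multiple of the alignment, could be sketched as follows. LoanRequest, isPowerOfTwo and makeAlignedLoanRequest are hypothetical names for illustration and not iceoryx API. The sketch assumes the alignment is a power of two, as is typical for direct I/O block sizes, and that the resulting values are what the application passes as payload size and payload alignment when loaning a chunk.

```cpp
#include <cstdint>

// Hypothetical container for the two values an application would pass
// when loaning a chunk: the payload size and the payload alignment.
struct LoanRequest
{
    std::uint64_t payloadSize; // requested size, a multiple of alignment
    std::uint32_t alignment;   // requested user payload alignment
};

// True if value is a non-zero power of two.
constexpr bool isPowerOfTwo(std::uint64_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}

// Rounds dataSize up to the next multiple of alignment using a bitmask.
// Precondition (not checked here): isPowerOfTwo(alignment) is true.
constexpr LoanRequest makeAlignedLoanRequest(std::uint64_t dataSize, std::uint32_t alignment)
{
    return LoanRequest{
        (dataSize + alignment - 1) & ~(static_cast<std::uint64_t>(alignment) - 1),
        alignment};
}

static_assert(isPowerOfTwo(4096), "4096 is a valid alignment for this sketch");
static_assert(makeAlignedLoanRequest(8000, 4096).payloadSize == 8192, "rounded up");
```

With the 8000-byte example and a 4096-byte alignment this yields a request of 8192 bytes, so the trailing 192 bytes are part of the payload the application asked for rather than padding with no guarantees.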

gpalmer-latai commented 8 months ago

Got it. My misunderstanding then. Looks like I need to solve my problem via proper allocation of the chunk.

Thanks!

eclipse-iceoryx / iceoryx

ChunkHeader should expose the size of the entire user payload section, including padding #2203
