Open ktf opened 1 year ago
The MetaHeader
part contains data that will be transferred to a different process, while other members of shmem Message
class are only valid/meaningful locally per process.
struct Message
{
Manager& fManager; // local ptr to the manager
bool fQueued; // is the message queued? (can the buffer removed when dtor is called).
MetaHeader fMeta; // contains all the "transferrable" info
size_t fAlignment;
mutable UnmanagedRegion* fRegionPtr; // used in case of unmanaged region messages
mutable char* fLocalPtr; // local data ptr, for received messages obtained from the handle
}
struct MetaHeader
{
size_t fSize;
size_t fHint; // this is used for unmanaged region messages to store user provided info, which is returned when message can be freed. Afaik this is used at least by FLP/readout.
boost::interprocess::managed_shared_memory::handle_t fHandle; // interprocess handle, convertable to local ptr
mutable boost::interprocess::managed_shared_memory::handle_t fShared; // used for the ref count when msg is shared
uint16_t fRegionId; // unmanaged region id
mutable uint16_t fSegmentId; // segment id
bool fManaged; // managed/unmanaged
};
Both should be minimal size, but I would say having the Metaheader
part smaller is more important.
fQueued boolean in the middle of it, introducing some unneeded extra padding. If I naively move it to the MetaHeader class, I immediately go from 96 bytes to 88
Logically it belongs to local data, not shared data. For this and similar suggestions one could think to create the final MetaData header only immediately before transfer, then there would be more freedom to resize Message
. I guess this line would have to be a lil bit more complicated then: https://github.com/FairRootGroup/FairMQ/blob/master/fairmq/shmem/Socket.h#L134
fAlignment at the moment is a size_t, while it could easily be a char storing log2 (assuming no one needs alignment which is not a power of 2).
Sounds good!
size and hint could be limited to 48 bits, leaving space for the above mentioned fAlignment, fQueued, fManaged to be stored there.
Would 48 be enough for current and future use cases? Hint is 64 bits as per requirement from FLP/readout.
it could probably be even less if one used handles for fManager and the the other pointers
Which handles do you mean? ptrs and boost interproccess handles are both 64, as they are supposed to be convertible into each other.
Both should be minimal size, but I would say having the MetaHeader part smaller is more important. Logically it belongs to local data, not shared data. For this and similar suggestions one could think to create the final MetaData header only immediately before transfer, then there would be more freedom to resize Message.
Ok, I understand the difference now. What you propose indeed makes sense to me.
Sounds good!
Indeed, if you do that, than it's enough to have fQueued shuffled after fAlignment.
Would 48 be enough for current and future use cases? Hint is 64 bits as per requirement from FLP/readout.
As far as I know, there is no x86 hardware which supports more than 48 bits of virtual address space. Maybe this should be architecture dependent? However Sylvain should probably comment here.
Which handles do you mean? ptrs and boost interproccess handles are both 64, as they are supposed to be convertible into each other.
What I mean is that right now you have:
Manager* fManager;
mutable UnmanagedRegion* fRegionPtr;
However you could, for example, decide that regions are known to a given Manager (i.e. the latter keeps a list of them) and change mutable UnmanagedRegion* fRegionPtr
to e.g. int8_t fRegionInManagerIndex
adding an indirection and giving to the manager the task to retrieve the actual region pointer. It would save you a pointer.
So how do we proceed here? Do you plan to attack this any time soon?
I will likely come to this sometime in September.
Ok, thank you!
The need to store the alignment value is for use in Message::SetUsedSize, which may need it in an edge-case.
Instead of storing it with each message, we could also accept it as a parameter to SetUsedSize.
What do you think?
Sounds reasonable to me. Indeed I could not find any invocation of SetUsedSize in the framework and only one in the old TOFCompressor, which indeed is known in advance so its not needed in any case.
Maybe you could determine minimum alignment from the current pointer (and cap it to, say, page boundaries).
basically:
size_t newAlignment = 1 << min(__builtin_ctz(oldPtr), 12);
Sounds good, then that could be done when no alignment is provided to SetUsedSize. That way no need to break the interface, alignment argument to SetUsedSize will be optional.
Then I would do this and not store alignment in shmem::Message at all.
Fully support this.
The current state in the dev branch is:
Manager& fManager;
mutable UnmanagedRegion* fRegionPtr = nullptr;
mutable char* fLocalPtr = nullptr;
size_t fSize = 0;
size_t fHint = 0;
boost::interprocess::managed_shared_memory::handle_t fHandle = -1;
mutable boost::interprocess::managed_shared_memory::handle_t fShared = -1;
uint16_t fRegionId = 0;
mutable uint16_t fSegmentId;
bool fManaged = true;
bool fQueued = false;
This (plus the fTransport
ptr from the parent and the this
ptr) brings the size to 80 bytes.
While doing some profiling I just noticed that the fair::mq::shmem::Message has a fQueued boolean in the middle of it, introducing some unneeded extra padding. If I naively move it to the MetaHeader class, I immediately go from 96 bytes to 88.
I also suspect one could gain quite some extra memory with a few extra reasonable limitations. For example:
All in all, I think one could get down below 64 bytes by having something like the following:
and it could probably be even less if one used handles for fManager and the the other pointers. Do you think there is any space to evolve the managed region to go into this direction?