to11mtm opened 10 months ago
Potentially another piece of savings:

I know it'd be a breaking API change, but do we really need `Size`? It seems like that is maybe left over from an earlier iteration of the struct; I'm not sure why the end user needs to know the raw message size. The library can calculate it where needed; it doesn't need to be on `NatsMsg<T>`, I think.
> Potentially another piece of savings:
>
> I know it'd be a breaking API change, but do we really need `Size`? It seems like that is maybe left over from an earlier iteration of the struct; I'm not sure why the end user needs to know the raw message size. The library can calculate it where needed; it doesn't need to be on `NatsMsg<T>`, I think.
I can think of reasons, but I'm not a normal end user 😅.

Getting rid of `Size` would get us back 8 bytes on `NatsMsg<T>` for a reference-type `T`. However, then we aren't as nicely aligned on 64B/128B cache lines. If we made such a change, we'd likely want to investigate whether keeping `NatsConnection` as a direct field on the struct (since that appears to be provided in almost, if not every, call to the `Build` method or constructor) would be better; then we would only allocate if `ReplyTo` and `Headers` are provided, and even then just 16B plus an object header.
> I know it'd be a breaking API change, but do we really need `Size`?
I think this is the worst type of breaking change where you remove a feature. We might get away with it because it's early days.
I guess I can see the benefits of `Size` in certain cases. But the current implementation only sets a `Size` on incoming messages. What about making a method that could compute it?

```csharp
public interface INatsConnection
{
    int GetMsgSize<T>(NatsMsg<T> msg, INatsSerialize<T>? serializer);
}
```
As an optimization, we could maintain a look-up table of the last N (maybe 1000?) messages that were received on the connection, so calling `GetMsgSize` on one received recently could return quickly.
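A rough sketch of how such a computation might work, assuming the `INatsSerialize<T>.Serialize(IBufferWriter<byte>, T)` shape from NATS.NET; the extension-method form and the size accounting here are assumptions, not a proposed API:

```csharp
using System.Buffers;
using NATS.Client.Core;

public static class NatsMsgSizeExtensions
{
    // Hypothetical: recompute an approximate wire size on demand instead of
    // storing Size on NatsMsg<T>. Note that serializing previously
    // deserialized data is not always invertible, so this may differ from
    // the size of the message as originally received.
    public static int GetMsgSize<T>(this NatsMsg<T> msg, INatsSerialize<T> serializer)
    {
        var payloadLength = 0;
        if (msg.Data is not null)
        {
            var writer = new ArrayBufferWriter<byte>();
            serializer.Serialize(writer, msg.Data);
            payloadLength = writer.WrittenCount;
        }

        // Subject + reply-to + payload; headers would need the real header
        // writer to be counted accurately, so they are omitted here.
        return msg.Subject.Length + (msg.ReplyTo?.Length ?? 0) + payloadLength;
    }
}
```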
Interesting idea!

> What about making a method that could compute it?

How? Serialize previously deserialized message data? That's not always invertible.

> As an optimization, we could maintain a look-up table

How do we identify a given message?

...just trying to understand 😃
Just a note on the message size: it's there because we needed to check the total size for pull requests when consuming.

Also, what about the connection? Couldn't we have something like `INatsConnection.ReplyTo(msg)` instead of carrying the reference around?
> How? Serialize previously deserialized message data? That's not always invertible.

Yes, that's what we would have to do. True, we don't track the deserializer (which may not even have a complementing serializer), so you'd either have to supply the matching serializer that you used, or fall back to the connection default.

> How do we identify a given message?

It wouldn't be trivial. We'd have to use some sort of look-up on a hash of the `Data` object or something, and we could easily have misses in that case.
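For illustration, one hedged way to key such a look-up for reference-type payloads is by the identity of the deserialized `Data` object; value types and re-deserialized duplicates would still miss, and all names here are assumptions:

```csharp
using System.Runtime.CompilerServices;

// Hypothetical sketch: remember sizes for recently received messages by
// keying on the deserialized Data instance (reference types only).
// ConditionalWeakTable does not keep its keys alive, so entries disappear
// once the message data is collected -- it's bounded by the GC, not by N.
internal sealed class ReceivedSizeCache
{
    private readonly ConditionalWeakTable<object, StrongBox<int>> _sizes = new();

    public void Remember(object data, int size) =>
        _sizes.AddOrUpdate(data, new StrongBox<int>(size));

    public bool TryGetSize(object data, out int size)
    {
        if (_sizes.TryGetValue(data, out var box))
        {
            size = box.Value;
            return true;
        }

        size = 0;
        return false;
    }
}
```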
> Also, what about the connection? Couldn't we have something like `INatsConnection.ReplyTo(msg)` instead of carrying the reference around?

I guess we could, but then we'd lose the convenience of being able to do `msg.ReplyTo`.
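If the connection reference were dropped from the struct, some of that convenience could be recovered with an extension on the connection; a hypothetical sketch (the helper name and error handling are assumptions, built on the existing `INatsConnection.PublishAsync`):

```csharp
using NATS.Client.Core;

public static class NatsConnectionReplyExtensions
{
    // Hypothetical: reply via the connection for a NatsMsg<T> that no
    // longer carries its connection reference.
    public static ValueTask ReplyAsync<T, TReply>(
        this INatsConnection connection,
        in NatsMsg<T> msg,
        TReply data,
        CancellationToken cancellationToken = default)
    {
        if (msg.ReplyTo is not { Length: > 0 } replyTo)
            throw new NatsException("Message has no reply-to subject.");

        // PublishAsync is the existing publish API on INatsConnection.
        return connection.PublishAsync(replyTo, data, cancellationToken: cancellationToken);
    }
}
```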
> Just a note on the message size: it's there because we needed to check the total size for pull requests when consuming.

Maybe we should just leave it then, and note that it's only automatically set when a message is received. Same with `Connection`.
FWIW, these are the current object layouts on x64, based on the struct as of 5f9dc5a:
`T` = `Int32`:

```
Type layout for 'NatsMsg`1'
Size: 40 bytes. Paddings: 0 bytes (%0 of empty space)
|==============================================================|
| 0-7: String <Subject>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 8-15: String <ReplyTo>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 16-23: NatsHeaders <Headers>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 24-31: INatsConnection <Connection>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 32-35: Int32 <Size>k__BackingField (4 bytes) |
|--------------------------------------------------------------|
| 36-39: Int32 <Data>k__BackingField (4 bytes) |
|==============================================================|
```
`T` = `String` (or any other reference type or 8-byte struct):

```
Type layout for 'NatsMsg`1'
Size: 48 bytes. Paddings: 4 bytes (%8 of empty space)
|==============================================================|
| 0-7: String <Subject>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 8-15: String <ReplyTo>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 16-23: NatsHeaders <Headers>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 24-31: String <Data>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 32-39: INatsConnection <Connection>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 40-43: Int32 <Size>k__BackingField (4 bytes) |
|--------------------------------------------------------------|
| 44-47: padding (4 bytes) |
|==============================================================|
```
`T` = `NatsMemoryOwner<Byte>`:

```
Type layout for 'NatsMsg`1'
Size: 64 bytes. Paddings: 4 bytes (%6 of empty space)
|==============================================================|
| 0-7: String <Subject>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 8-15: String <ReplyTo>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 16-23: NatsHeaders <Headers>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 24-31: INatsConnection <Connection>k__BackingField (8 bytes) |
|--------------------------------------------------------------|
| 32-35: Int32 <Size>k__BackingField (4 bytes) |
|--------------------------------------------------------------|
| 36-39: padding (4 bytes) |
|--------------------------------------------------------------|
| 40-63: NatsMemoryOwner`1 <Data>k__BackingField (24 bytes) |
| |====================================| |
| | 0-7: ArrayPool`1 _pool (8 bytes) | |
| |------------------------------------| |
| | 8-15: Byte[] _array (8 bytes) | |
| |------------------------------------| |
| | 16-19: Int32 _start (4 bytes) | |
| |------------------------------------| |
| | 20-23: Int32 _length (4 bytes) | |
| |====================================| |
|==============================================================|
```
I did just have a possibly *terrifying* idea... I still need to check whether this works with how JetStream uses things, but the idea is that we could use a wrapper around `ChannelReader` so that the final `NatsMsg<T>`, including the connection, is only produced on the actual read from the channel. Here's a simple POC of the wrapper.

This wouldn't necessarily shrink the final size of `NatsMsg<T>`; however, it would let us use a smaller structure in our channel, which would help with the largest concerns around overall size and poor cache-line alignment in the internal buffers.
Edit: Looks like `NatsJSMsg<T>` would work pretty well with this change, possibly even better, because the connection is already on the `NatsJSContext`, so we wouldn't even need to wrap the channel.
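A minimal sketch of what such a wrapper might look like; the slim channel item `SlimMsg<T>` and the `NatsMsg<T>` constructor shape are assumptions, not the actual internals:

```csharp
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using NATS.Client.Core;

// Assumed slim item carried through the channel; 8 bytes smaller than
// NatsMsg<T> on x64 because it drops the connection reference.
internal readonly record struct SlimMsg<T>(
    string Subject, string? ReplyTo, int Size, NatsHeaders? Headers, T? Data);

// Wrapper that attaches the connection only when a message is read.
internal sealed class ConnectionAttachingReader<T> : ChannelReader<NatsMsg<T>>
{
    private readonly ChannelReader<SlimMsg<T>> _inner;
    private readonly INatsConnection _connection;

    public ConnectionAttachingReader(ChannelReader<SlimMsg<T>> inner, INatsConnection connection)
    {
        _inner = inner;
        _connection = connection;
    }

    public override bool TryRead(out NatsMsg<T> msg)
    {
        if (_inner.TryRead(out var slim))
        {
            // Build the full public message only at the point of consumption
            // (constructor argument order assumed from the field layout).
            msg = new NatsMsg<T>(slim.Subject, slim.ReplyTo, slim.Size, slim.Headers, slim.Data, _connection);
            return true;
        }

        msg = default;
        return false;
    }

    public override ValueTask<bool> WaitToReadAsync(CancellationToken cancellationToken = default)
        => _inner.WaitToReadAsync(cancellationToken);
}
```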
I guess if we take it a step further, we could also wait to call `NatsMsg.Build` until the message is read off the channel, meaning deserialization would happen at that point too. I think for status checking such as No Responders we would actually want to deserialize the headers before putting the message in the channel, but that means we also have to track `size`:

```csharp
string subject,
string? replyTo,
NatsHeaders? headers,
ReadOnlySequence<byte> payloadData,
int size,
```

One benefit I could see here is that deserialization would happen outside of the read-processing loop, so it could make use of more cores.
> I guess if we take it a step further, we could also wait to call `NatsMsg.Build` until the message is read off the channel, meaning deserialization would happen at that point too. I think for status checking such as No Responders we would actually want to deserialize the headers before putting the message in the channel, but that means we also have to track `size`: `string subject, string? replyTo, NatsHeaders? headers, ReadOnlySequence<byte> payloadData, int size,`
>
> One benefit I could see here is that deserialization would happen outside of the read-processing loop, so it could make use of more cores.
Two concerns with that: `ReadOnlySequence<byte>` is 24 bytes on x64, so we'd wind up with a larger struct.

However, I think I got some inspiration today... I'll update my gist over the weekend or just push a branch out for feedback.
Proposed change
In #276, the size of `NatsMsg<T>` came up as a concern for large buffer sizes, and @mtmk raised the question there. The zero-allocation case is worthy (as is keeping our APIs stable), so it may be worth investigating opportunities to save on the size of the struct while paying only a (smaller) allocation penalty for additional features.
Essentially, rather than having separate `ReplyTo`, `Headers`, and `Connection` properties, we can use a single `private object _data` field that acts as a form of discriminated union (DU): if only one of the above is needed, it is placed directly in the field; if additional members are needed, the appropriate container object(s) are allocated. The getters and init accessors will of course have to deal with the behavior of the DU (i.e., as-type checking to get to the appropriate result), but that shaves 16 bytes off the struct (at a cost of up to 16-24 bytes plus an object header allocated in the unhappy cases).

Use case
The big win for the trouble is a 33% reduction in the size of our struct (based on ObjectLayoutInspector showing 48 bytes for `NatsMsg<string>`). Our new struct size of 32 bytes is a nice power of two as well, so we're also less likely to cross cache lines on writes and reads to `BoundedChannel<T>`'s underlying buffer (and we now fit 2 messages to a line instead of 1.x).

The potential downsides:
Contribution
I am happy to contribute, including building a proof-of-concept branch if it will help decide whether this proposal is viable.
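For illustration, a minimal sketch of the single-field DU described in the proposal; all type and member names here are assumptions, not the proposed API:

```csharp
using NATS.Client.Core;

// Sketch of the DU idea: one object field holds whichever optional member
// is present; a container is allocated only when several are needed.
public readonly struct SlimNatsMsg<T>
{
    // Holds: null | string (ReplyTo) | NatsHeaders | INatsConnection | Extras.
    private readonly object? _extra;

    public SlimNatsMsg(string subject, T? data, object? extra)
    {
        Subject = subject;
        Data = data;
        _extra = extra;
    }

    public string Subject { get; }

    public T? Data { get; }

    public string? ReplyTo => _extra switch
    {
        string replyTo => replyTo,
        Extras extras => extras.ReplyTo,
        _ => null,
    };

    public NatsHeaders? Headers => _extra switch
    {
        NatsHeaders headers => headers,
        Extras extras => extras.Headers,
        _ => null,
    };

    public INatsConnection? Connection => _extra switch
    {
        INatsConnection connection => connection,
        Extras extras => extras.Connection,
        _ => null,
    };

    // Allocated only in the "unhappy" case where multiple extras are set:
    // up to 24 bytes of fields plus the object header.
    private sealed class Extras
    {
        public string? ReplyTo;
        public NatsHeaders? Headers;
        public INatsConnection? Connection;
    }
}
```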