Open jgoz opened 12 years ago
After some thought, the current implementation is probably sufficient for a general-case baseline given that the encoded type names are being cached.
What we may want to do is make this functionality pluggable so that client applications can use their own Type header format. For maximum performance (and with considerable effort), clients could specify type IDs at compile time and potentially reduce the type header to a constant 4 bytes instead of 50-80 bytes.
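To make the pluggable idea concrete, here is a minimal sketch of what such an extension point could look like. `ITypeHeaderEncoder` and `FixedIdTypeHeaderEncoder` are hypothetical names made up for illustration, not part of the existing code, and the sketch assumes both ends register the same IDs up front.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical extension point; the name and shape are assumptions.
public interface ITypeHeaderEncoder
{
    byte[] Encode(Type type);
    Type Decode(byte[] header);
}

// Maps each registered message type to a fixed 4-byte ID.
public sealed class FixedIdTypeHeaderEncoder : ITypeHeaderEncoder
{
    private readonly Dictionary<Type, int> _ids = new Dictionary<Type, int>();
    private readonly Dictionary<int, Type> _types = new Dictionary<int, Type>();

    // Call during startup only; Dictionary is not safe for concurrent writes.
    public void Register(int id, Type type)
    {
        _ids.Add(type, id);
        _types.Add(id, type);
    }

    public byte[] Encode(Type type)
    {
        // An int always encodes to 4 bytes. BitConverter uses the machine's
        // byte order, so both ends must agree on endianness.
        return BitConverter.GetBytes(_ids[type]);
    }

    public Type Decode(byte[] header)
    {
        return _types[BitConverter.ToInt32(header, 0)];
    }
}
```

Unregistered types throw `KeyNotFoundException` here; a real implementation would probably fall back to the default name-based header instead.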
Yes, we were aiming for an easy out-of-the-box experience with no configuration required, while still allowing people to override things when optimizing.
BTW: I think the FullName would be a good compromise instead of the AssemblyQualifiedName. Shorter and avoids some potential assembly-versioning issues (or creates different ones!).
Oops, didn't read the full code :)
I know I'm just benchmarking in my head, but in my experience direct string manipulation (e.g. split, join) is much faster than using a Regex (even a compiled one).
I'll do some proper testing and update it tonight if the difference is significant.
You're probably right about Regex being slower, but this will only happen if the message type hasn't been seen before. Hopefully, client apps will not have millions of message types...
Ok, I really, really need to read the code don't I?! :)
The reason that the AssemblyQualifiedName is being used is to ensure that when Type.GetType(name) is called it can correctly locate the type in whatever assembly it may exist. The Regex is being used to strip out the version information. We could arguably drop the PublicKeyToken etc, but at the very least we need the assembly name I think?
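For reference, the string-manipulation alternative to the Regex could look something like the sketch below. `TypeNames.StripVersion` is a hypothetical helper, not the current implementation. It assumes the usual `TypeName, AssemblyName, Version=..., Culture=..., PublicKeyToken=...` layout and simply keeps everything before the second comma.

```csharp
using System;

public static class TypeNames
{
    // Keeps "TypeName, AssemblyName" and drops Version, Culture and
    // PublicKeyToken. Assumes a non-generic type: generic type arguments
    // embed their own commas, which this simple scan does not handle.
    public static string StripVersion(string assemblyQualifiedName)
    {
        int first = assemblyQualifiedName.IndexOf(',');
        if (first < 0)
        {
            return assemblyQualifiedName; // no assembly part at all
        }

        int second = assemblyQualifiedName.IndexOf(',', first + 1);
        return second < 0
            ? assemblyQualifiedName // already versionless
            : assemblyQualifiedName.Substring(0, second);
    }
}
```

With a made-up name such as `MyApp.Messages.OrderPlaced, MyApp.Messages, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null`, this returns `MyApp.Messages.OrderPlaced, MyApp.Messages`. Whether `Type.GetType` can resolve a partial assembly name depends on how and where the assembly is loaded, so that part would need testing.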
The 'Type' frame is cached for reuse, so a given type header will only ever exist once and should not need to be rebuilt. The receiving end also caches the Type once found so it never needs to be looked up again.
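Roughly, the caching on both ends amounts to something like the following sketch. `TypeHeaderCache` and its members are illustrative names rather than the actual code, and the header text built here (full name plus simple assembly name, UTF-8 encoded) is an assumption standing in for the real format.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text;

public static class TypeHeaderCache
{
    // Sender side: one encoded frame per message type, built on first use.
    private static readonly ConcurrentDictionary<Type, byte[]> Frames =
        new ConcurrentDictionary<Type, byte[]>();

    // Receiver side: one resolved Type per distinct header string.
    private static readonly ConcurrentDictionary<string, Type> Types =
        new ConcurrentDictionary<string, Type>();

    public static byte[] GetFrame(Type type)
    {
        // The returned array is shared; callers must treat it as read-only.
        return Frames.GetOrAdd(type, t =>
            Encoding.UTF8.GetBytes(t.FullName + ", " + t.Assembly.GetName().Name));
    }

    public static Type Resolve(string typeName)
    {
        // throwOnError: true, so a failed lookup is never cached as null.
        return Types.GetOrAdd(typeName, name => Type.GetType(name, true));
    }
}
```

`GetOrAdd` may run the factory more than once under contention, but only one result is stored, which is harmless here because the factories have no side effects.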
I would not expect to have millions of message types; heck, even hundreds of message types is highly unlikely.
As for making the Type header format pluggable, that is a good idea, and it would tie in with issue #9 on having 'keyed' headers to identify the key/value pair.
The current Type header format in MessageWriter should be profiled and improved if possible/necessary. The performance impact of using a versionless AQN may not matter with large messages, but it could become an issue for small messages, especially if a compact binary serializer is used (e.g., protobuf).
Required/desirable properties for Type encoding: