Open Rjvs opened 3 years ago
Yeah the generated code uses Array.Sort
when building collections and a binary search when resolving keys; it should probably be replaced with a Dictionary<T>
. It also calls System.Text.Encoding.UTF8.GetBytes
for strings when looking them up. It's not great.
The implementation generated appears to be an order of magnitude slower than any other dictionary/map implementation. It'd be better to use just Dictionary<T>
with a generated Comparer<T>
that properly uses the Offset
s and ArraySegment<byte>
s or something.
The existing implementation in generated code is CreateSortedVectorOf
; you have to do the tooling around the keyed types yourself anyway. Using a dictionary would even be less code.
I think this is cruft translated over from the C/C++ side that is mirrored across all the language generators. It's not the best solution for C#, Go, or any other managed language. Big oof.
Even map
/unordered_map
for C++ would do better here.
I think we drop it from code generation and create comparers for keyed types that can be used for C#, analogs for C++ and Go. Maybe keep it for C? Not sure.
If we retool it to create generated comparers, it wouldn't need to compare entire structure as it would with other models, it would only compare the keys; the language libraries can deal with that properly when throwing them into a collection. For non-keyed types, we'd do full structural comparers.
The reason there's no single shared collection for keyed models is pretty obvious to me; could have multiple contexts and you need a way to namespace them, but even namespaced collections would be more overhead than letting the consumer manage the collections, sometimes they'll namespace, sometimes they'll singleton it. It's also more code to implement, (slightly) less performant, there are data isolation hazards, lots of other problems.
EF and similar data model frameworks have you create models of the entire schema and bind to that, that becomes your isolation context; the collections on the schema model are the tables.
We could do that for BigBuffers, but it's probably better to just do it manually for now.
Ben should definitely follow the C# convention of creating Comparer<T>
s for the models, especially for types we have hash keys for.
We should probably use the backing store for comparing the UTF8 bytes, Span<byte>
s, SequenceEqual
, implement GetHashCode
, and support comparing against native String
types too... .NET uses XxHash32 now for Strings and other types I think. Could read the UTF8 bytes as Rune
s (like StringRuneEnumerator
) to create hash codes that match between native String
and UTF8 byte arrays... Would be able to push that work into a shared into BigBuffers or into a separate library, or create the Comparer
in Utilities.Net.
IEquatable<> and IComparable<> are implemented for structs and tables with keys. 😄
As Ben says: