jamescourtney / FlatSharp

Fast, idiomatic C# implementation of Flatbuffers
Apache License 2.0

Mapping scalar types to common CLR types (e.g. DateTime) #86

Closed mausworks closed 3 years ago

mausworks commented 3 years ago

First of all, thanks for a really nice library! We're currently in the process of replacing our proprietary serializer with FlatBuffers and this library is really helpful.

I'm curious if there is any simple way to convert a scalar type in FlatBuffers to a common C# CLR type. This was something which our previous solution handled quite nicely.

In my scenario I want to be able to automatically parse DateTime and DateTimeOffset from a long (a unix timestamp in milliseconds).

Basically, what I want to do is have a custom converter from long to DateTime. So when the parser runs across a long which has the target CLR-type of DateTime or DateTimeOffset, I want to make a quick conversion from a UNIX timestamp, and when the serializer runs across a DateTime, I simply want to output a UNIX timestamp.
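For reference, the conversion being asked about can be sketched with plain BCL calls. This is only an illustration: `UnixMillis` and its method names are hypothetical helpers, not part of FlatSharp, and the `DateTime` path assumes UTC is the desired result.

```csharp
using System;

// Hypothetical helper (not a FlatSharp type): converts between a Unix
// timestamp in milliseconds and the two CLR date types mentioned above.
public static class UnixMillis
{
    public static DateTimeOffset ToDateTimeOffset(long millis)
        => DateTimeOffset.FromUnixTimeMilliseconds(millis);

    public static long FromDateTimeOffset(DateTimeOffset value)
        => value.ToUnixTimeMilliseconds();

    // DateTime has no FromUnixTimeMilliseconds of its own, so this goes
    // through DateTimeOffset and returns a UTC DateTime.
    public static DateTime ToUtcDateTime(long millis)
        => DateTimeOffset.FromUnixTimeMilliseconds(millis).UtcDateTime;

    // Local or unspecified DateTime values are converted to UTC first.
    public static long FromDateTime(DateTime value)
        => new DateTimeOffset(value.ToUniversalTime()).ToUnixTimeMilliseconds();
}
```

Anything finer than a millisecond is dropped on the way to `long`, so a round trip only preserves millisecond precision.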

I've tried to implement an ITypeModel called DateTimeTypeModel, but that interface is really cumbersome to work with. RuntimeTypeModel has an internal constructor, so you can't inherit from it, although being able to do so would make the implementation easier.

Any advice on how to do this?

jamescourtney commented 3 years ago

The reason I don't include any "sugar" types in FlatSharp is because the FlatBuffer format is still evolving, and I don't want to end up in a situation where I've done some custom implementation for something that ends up conflicting with a later standardized implementation in FlatBuffers proper. This is why I don't have dictionaries, timestamps, or other fun things! I did include ITypeModel as a way to extend FlatSharp, but you've discovered it's not for the faint of heart.

Anyway, here are my thoughts on how you can accomplish this.

Use protected properties

[FlatBufferTable]
public class MyNeatTable
{
    // Public facade over the raw value; the timestamp is stored as Unix milliseconds.
    public DateTimeOffset Timestamp
    {
        get => DateTimeOffset.FromUnixTimeMilliseconds(this.RawTimestamp);
        set => this.RawTimestamp = value.ToUnixTimeMilliseconds();
    }

    [FlatBufferItem(0)]
    protected virtual long RawTimestamp { get; set; }
}

This approach isn't ideal because it forces you to do this indirection trick everywhere you want to use a timestamp. This also won't work if you're using an FBS file.

A timestamp class with operator overloads

[FlatBufferStruct]
public class MyTimestamp
{
    [FlatBufferItem(0)]
    public virtual long Value { get; set; }

    // Conversion operators take a plain parameter; the "this" modifier is only valid on extension methods.
    public static implicit operator DateTimeOffset(MyTimestamp ts) => DateTimeOffset.FromUnixTimeMilliseconds(ts.Value);
    public static implicit operator MyTimestamp(DateTimeOffset dto) => new MyTimestamp { Value = dto.ToUnixTimeMilliseconds() };
}

This one will work in C# or using .fbs files, since FlatSharp generates partial classes for you. It will also allow you to write your code fluently because the operators are implicit.
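As a rough sketch of what "fluently" means here, the snippet below uses a stand-in for the struct with the FlatSharp attributes left out so that it compiles on its own. `Event`, `CreatedAt` and `TimestampDemo` are hypothetical names used only for illustration.

```csharp
using System;

// Stand-in for the FlatSharp struct above: the [FlatBufferStruct] and
// [FlatBufferItem] attributes are omitted so this compiles without FlatSharp.
public class MyTimestamp
{
    public virtual long Value { get; set; }

    public static implicit operator DateTimeOffset(MyTimestamp ts)
        => DateTimeOffset.FromUnixTimeMilliseconds(ts.Value);

    public static implicit operator MyTimestamp(DateTimeOffset dto)
        => new MyTimestamp { Value = dto.ToUnixTimeMilliseconds() };
}

// Hypothetical table-like class that holds the wrapper type.
public class Event
{
    public MyTimestamp CreatedAt { get; set; } = new MyTimestamp();
}

public static class TimestampDemo
{
    public static DateTimeOffset RoundTrip(DateTimeOffset input)
    {
        var e = new Event();
        e.CreatedAt = input;               // implicit DateTimeOffset -> MyTimestamp
        DateTimeOffset back = e.CreatedAt; // implicit MyTimestamp -> DateTimeOffset
        return back;
    }
}
```

Callers never mention `MyTimestamp` explicitly, which is the appeal of this option; the cost is one wrapper type per conversion.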

Implementing ITypeModel / ITypeModelProvider

It's not as hard as it sounds, but it requires a little working knowledge of how FlatSharp operates internally. It's harder than it might need to be (perhaps I should come up with a way to alias certain types with well-known conversions). Anyway, here is a gist with some working code to accomplish this: https://gist.github.com/jamescourtney/5520f91f7bbb142301dafd7386eb5f39. The short version is that it works by wrapping the predefined LongTypeModel, and just modifying the inputs and outputs to the various serialization / get max size methods.

You could conceivably integrate this with the FlatSharp compiler, though I have not done that, mainly because I assume anyone using an FBS file cares about cross-language compatibility, which would defeat the point of an extension.

jamescourtney commented 3 years ago

Keep in mind that that gist is something I threw together pretty quickly -- it does generate valid C# code and will work, but it's not production-ready.

mausworks commented 3 years ago

Thanks a lot for your replies! I'll try to implement some type models based on your gist.

I am, however, curious as to why code generation is used in this project. Our previous binary serializer (which was inspired by FlatBuffers) managed without it.

It would be nice if you could simply have a TypeModel.GetValue(FlatBufferReader) and TypeModel.SetValue(TValue, FlatBufferWriter) or something similar. I haven't looked deep into the internals of this library (yet), but this would be dead simple to implement and extend for virtually any type, and creating generic type models would likely be possible (e.g. NumericTypeModel<long>) which could replace the .tt-file for these conversions.

Just throwing it out there. :+1:

mausworks commented 3 years ago

So, given that I'm only interested in the value conversion and the CLR type, I implemented the following classes.

The type model converter

```csharp
public abstract class TypeModelConverter<TConverted> : ITypeModel
{
    protected ITypeModel BaseModel { get; }

    public FlatBufferSchemaType SchemaType => BaseModel.SchemaType;
    public Type ClrType => typeof(TConverted);
    public ImmutableArray<PhysicalLayoutElement> PhysicalLayout => BaseModel.PhysicalLayout;
    public bool IsFixedSize => BaseModel.IsFixedSize;
    public bool IsValidStructMember => BaseModel.IsValidStructMember;
    public bool IsValidTableMember => BaseModel.IsValidTableMember;
    public bool IsValidVectorMember => BaseModel.IsValidVectorMember;
    public bool IsValidUnionMember => BaseModel.IsValidUnionMember;
    public bool IsValidSortedVectorKey => BaseModel.IsValidSortedVectorKey;
    public int MaxInlineSize => BaseModel.MaxInlineSize;
    public bool MustAlwaysSerialize => BaseModel.MustAlwaysSerialize;
    public bool SerializesInline => BaseModel.SerializesInline;

    protected TypeModelConverter(ITypeModel baseModel) => BaseModel = baseModel;

    // Only the three code-generating methods differ per converter.
    public abstract CodeGeneratedMethod CreateGetMaxSizeMethodBody(GetMaxSizeCodeGenContext context);
    public abstract CodeGeneratedMethod CreateParseMethodBody(ParserCodeGenContext context);
    public abstract CodeGeneratedMethod CreateSerializeMethodBody(SerializationCodeGenContext context);

    // Everything else is delegated to the wrapped model.
    public void Initialize() => BaseModel.Initialize();

    public TableMemberModel AdjustTableMember(TableMemberModel source)
        => BaseModel.AdjustTableMember(source);

    public string GetNonNullConditionExpression(string itemVariableName)
        => BaseModel.GetNonNullConditionExpression(itemVariableName);

    public string GetThrowIfNullInvocation(string itemVariableName)
        => BaseModel.GetThrowIfNullInvocation(itemVariableName);

    public void TraverseObjectGraph(HashSet<Type> seenTypes)
    {
        seenTypes.Add(typeof(TConverted));
        BaseModel.TraverseObjectGraph(seenTypes);
    }

    public bool TryFormatDefaultValueAsLiteral(object defaultValue, out string? literal)
        => BaseModel.TryFormatDefaultValueAsLiteral(defaultValue, out literal);

    public bool TryFormatStringAsLiteral(string value, out string? literal)
        => BaseModel.TryFormatStringAsLiteral(value, out literal);

    public bool TryGetSpanComparerType(out Type comparerType)
        => BaseModel.TryGetSpanComparerType(out comparerType);

    public bool TryGetTableKeyMember(out TableMemberModel? tableMember)
        => BaseModel.TryGetTableKeyMember(out tableMember);

    public bool TryGetUnderlyingVectorType(out ITypeModel? typeModel)
        => BaseModel.TryGetUnderlyingVectorType(out typeModel);

    public bool ValidateDefaultValue(object defaultValue)
        => BaseModel.ValidateDefaultValue(defaultValue);
}
```

The type model converter provider

```csharp
public class TypeModelConverterProvider<TConverter> : ITypeModelProvider
    where TConverter : ITypeModel, new()
{
    private string? _alias;
    private readonly Lazy<TConverter> _converter = new Lazy<TConverter>(() => new TConverter());

    public TypeModelConverterProvider(string? alias = null) => _alias = alias;

    public bool TryCreateTypeModel(
        TypeModelContainer container,
        Type type,
        out ITypeModel? typeModel)
    {
        if (type == _converter.Value.ClrType)
        {
            typeModel = _converter.Value;
            return true;
        }

        typeModel = null;
        return false;
    }

    public bool TryResolveFbsAlias(
        TypeModelContainer container,
        string alias,
        out ITypeModel? typeModel)
    {
        // Fall back to the CLR type name when no alias was supplied.
        _alias ??= _converter.Value.ClrType.Name;

        if (alias.Equals(_alias, StringComparison.Ordinal))
        {
            typeModel = _converter.Value;
            return true;
        }

        typeModel = null;
        return false;
    }
}
```

Implementation

```csharp
public class DateTimeOffsetConverter : TypeModelConverter<DateTimeOffset>
{
    public DateTimeOffsetConverter() : base(new LongTypeModel())
    {
    }

    public override CodeGeneratedMethod CreateGetMaxSizeMethodBody(GetMaxSizeCodeGenContext context)
    {
        var value = $"{context.ValueVariableName}.ToUnixTimeMilliseconds()";
        return BaseModel.CreateGetMaxSizeMethodBody(context.With(value));
    }

    public override CodeGeneratedMethod CreateParseMethodBody(ParserCodeGenContext context)
    {
        return new CodeGeneratedMethod
        {
            MethodBody = $"return DateTimeOffset.FromUnixTimeMilliseconds({context.GetParseInvocation(typeof(long))});"
        };
    }

    public override CodeGeneratedMethod CreateSerializeMethodBody(SerializationCodeGenContext context)
    {
        var value = $"{context.ValueVariableName}.ToUnixTimeMilliseconds()";
        return BaseModel.CreateSerializeMethodBody(context.With(valueVariableName: value));
    }
}
```

```csharp
public static FlatBufferSerializer CreateSerializer()
{
    var typeModels = TypeModelContainer.CreateDefault();

    // DateTimeOffsetConverter is the only converter shown in this comment;
    // the type argument of the second registration did not survive in the source.
    typeModels.RegisterProvider(new TypeModelConverterProvider<DateTimeOffsetConverter>());
    typeModels.RegisterProvider(new TypeModelConverterProvider</* ... */>());

    return new FlatBufferSerializer(
        new FlatBufferSerializerOptions(FlatBufferDeserializationOption.GreedyMutable),
        typeModels);
}
```

It would be nice to have something similar built in, but without having to deal with the CodeGen part: something like a pre-processor step on top of/before the actual type models (i.e. type converters).

My knowledge of CodeGen is limited, but shouldn't it be possible to create an abstract class with two methods to override (e.g. ConvertFrom and ConvertTo) and then have them called using code generation?

The reason I'm asking is that I tried to call both static and non-static methods in the same class (using Namespace.ClassName.Method and this.Method), but it didn't work out; I'm probably missing some contextual information. It would be really useful for our purposes to be able to do so, as writing "magic code strings" makes this so much harder.

jamescourtney commented 3 years ago

That's actually useful code, and with your permission, I think I'd like to use some variation of it.

The background on "why does flatsharp generate C#" is interesting. The very first version (if you rewind this repo back to the first commit) used IL.Emit instead of C# code gen. It's been a few years, but I made that change for a few reasons that I recall today:

So we are left with some magic in the code as a consequence of having to do codegen, and I've just never seen a "clean" way to do codegen. The upshot is that the FlatSharp Compiler (FBS to C#) uses exactly the same code gen as the FlatSharp Runtime (classes with attributes to C#).

But popping back up, I think this conversion utility is a great idea. I think I'd want to do it with some flavor of Expression trees using lambdas:


typeModelContainer.AddAlias<DateTimeOffset, long>(
    new LongTypeModel(),
    dto => dto.ToUnixTimeMilliseconds(),
    v => DateTimeOffset.FromUnixTimeMilliseconds(v));

mausworks commented 3 years ago

> That's actually useful code, and with your permission, I think I'd like to use some variation of it.

Go for it, I'm happy to have contributed in some way!

> • Roslyn allows you to compile and load a DLL at runtime.

Roslyn is really cool, there's so much magic I'm not (yet) aware of.

I think your example looks like a great feature for this library. Most conversions are pretty trivial to perform and your proposed pattern looks like it will be able to handle most cases (at least those I can think of). It can probably even be added as an extension if you don't want it to be part of core functionality.

However, I'm not sure "alias" is the best word to encapsulate this idea, as I feel it's more of a runtime conversion than an alias; for me "alias" means "the same thing, but with another word".

As a user of this library, I feel that a very intuitive implementation would look like this:

typeModels.AddConversion<long, DateTimeOffset>(
    from: millis => DateTimeOffset.FromUnixTimeMilliseconds(millis),
    to: at => at.ToUnixTimeMilliseconds());

typeModels.AddConversion<long?, DateTimeOffset?>(
    from: millis => millis == null ? (DateTimeOffset?)null : DateTimeOffset.FromUnixTimeMilliseconds(millis.Value),
    to: at => at == null ? (long?)null : at.Value.ToUnixTimeMilliseconds());

I'm omitting the "base model" from this example, as there should be some way to lookup which base model is required (that's the ITypeModelProvider, right?).

I'm also using parameters called "from" and "to", even though the conversion is bidirectional, just because I think that's the easiest way to conceptualize this.

A class pattern would be nice as well:

typeModels.AddConverter<UnixTimestampConverter>();

Said class would then consist of ConvertFrom and ConvertTo methods, and the generic implementation above could simply be a GenericTypeConverter<long, DateTimeOffset>.

I want typeModels.AddConversion<long, DateTimeOffset> to read like "Add conversion from long to DateTimeOffset", as this is how I approached the question: long is what is already supported, and DateTimeOffset is what I want out of this.
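To make the shape of that proposal concrete, here is a minimal runtime-only sketch. `ConversionRegistry`, `FromUnderlying` and `ToUnderlying` are hypothetical names, not FlatSharp APIs, and the sketch ignores code generation entirely: it only stores a pair of delegates per type pair and applies them on demand.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry (not a FlatSharp API): keeps one pair of delegates
// per (underlying type, converted type) combination.
public sealed class ConversionRegistry
{
    private readonly Dictionary<(Type, Type), (Delegate From, Delegate To)> conversions
        = new Dictionary<(Type, Type), (Delegate From, Delegate To)>();

    // Reads as "add conversion from TUnderlying to TConverted".
    public void AddConversion<TUnderlying, TConverted>(
        Func<TUnderlying, TConverted> fromUnderlying,
        Func<TConverted, TUnderlying> toUnderlying)
    {
        conversions[(typeof(TUnderlying), typeof(TConverted))] = (fromUnderlying, toUnderlying);
    }

    // Applies the registered "from" delegate; throws KeyNotFoundException
    // if no conversion was registered for this type pair.
    public TConverted FromUnderlying<TUnderlying, TConverted>(TUnderlying value)
    {
        var pair = conversions[(typeof(TUnderlying), typeof(TConverted))];
        return ((Func<TUnderlying, TConverted>)pair.From)(value);
    }

    // Applies the registered "to" delegate for the same type pair.
    public TUnderlying ToUnderlying<TUnderlying, TConverted>(TConverted value)
    {
        var pair = conversions[(typeof(TUnderlying), typeof(TConverted))];
        return ((Func<TConverted, TUnderlying>)pair.To)(value);
    }
}
```

A delegate lookup like this happens at runtime, whereas FlatSharp emits C# source, so a real implementation would need some way to reference the conversion from generated code.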

jamescourtney commented 3 years ago

Thanks for your thoughts. I've prototyped this a bit more today, and I'm leaning towards going with Facade as the naming, because it's a thing that's really just a front for another thing.

The syntax is going to be roughly:

void RegisterTypeFacade<TUnderlyingType, TFacadeType, TConverterType>()
     where TConverterType : struct, IFacadeTypeConverter<TUnderlyingType, TFacadeType>

IFacadeTypeConverter will be defined simply:

public interface IFacadeTypeConverter<TUnderlyingType, TFacadeType>
{
    TUnderlyingType Convert(TFacadeType item);
    TFacadeType Convert(TUnderlyingType item);
}

The nuance here is that FlatSharp expects things to be statically linked (so to speak), and the easiest way to make the TConverterType available where it needs to be is to force it to be a struct and just use default(TConverterType).Convert(...).

You'll also be able to chain Facades together if you really want to make your life more difficult.
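The `default(...)` trick can be sketched in isolation. The interface below copies the shape quoted above; `UnixMillisConverter` and the static `Facade` class are hypothetical and only show how a struct constraint lets generic code call the converter without ever being handed an instance.

```csharp
using System;

// Same shape as the interface quoted above.
public interface IFacadeTypeConverter<TUnderlyingType, TFacadeType>
{
    TUnderlyingType Convert(TFacadeType item);
    TFacadeType Convert(TUnderlyingType item);
}

// Hypothetical converter: a stateless struct, so default(UnixMillisConverter)
// is a fully usable instance.
public struct UnixMillisConverter : IFacadeTypeConverter<long, DateTimeOffset>
{
    public long Convert(DateTimeOffset item) => item.ToUnixTimeMilliseconds();

    public DateTimeOffset Convert(long item) => DateTimeOffset.FromUnixTimeMilliseconds(item);
}

// Hypothetical helper: generic code that reaches the converter purely
// through its type argument.
public static class Facade<TUnderlying, TFacade, TConverter>
    where TConverter : struct, IFacadeTypeConverter<TUnderlying, TFacade>
{
    // Overload resolution picks Convert(TFacade) from the argument type.
    public static TUnderlying ToUnderlying(TFacade value)
        => default(TConverter).Convert(value);

    // Overload resolution picks Convert(TUnderlying) from the argument type.
    public static TFacade ToFacade(TUnderlying value)
        => default(TConverter).Convert(value);
}
```

Because the converter is named only as a type argument, generated code can refer to it by type name alone, which appears to be the "statically linked" property described above.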

mausworks commented 3 years ago

Amazing work @jamescourtney, I like the idea of calling it a facade and I think the IFacadeTypeConverter looks good!

However, what about just ITypeFacade? It's the facade itself that converts, right? Facade.Convert plays well in my book!

> You'll also be able to chain Facades together if you really want to make your life more difficult.

Yay!

jamescourtney commented 3 years ago

This is added in #87. I need to play with it a little more, but I'll probably commit it in the next day or so.

mausworks commented 3 years ago

Really cool! I hope you don't mind me dropping a review, it's also for learning purposes.

jamescourtney commented 3 years ago

This has been merged into master and is available in FlatSharp version 4.2.2. Thanks for the feedback, @mausworks!