Arshia001 / FSharp.GrpcCodeGenerator

A protoc plugin to enable generation of F# code + supporting libraries
MIT License
81 stars 9 forks source link

Project is not conforming to the proto3 spec #9

Closed purkhusid closed 2 years ago

purkhusid commented 3 years ago

I've been diving a bit deeper into the proto3 spec and it seems like wrapping scalars in options is not according to the spec.

Until recently Proto3 did not have any way to track presence for scalar types and therefore it is not possible to know if an int32 is set to 0 or if the value was not set at all since the default value for int32 is 0.

People worked around this by using the wrappers.proto well-known types: https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/wrappers.proto This allows proto generators to have a special case where if a type is wrapped in a well known wrapper type it will be unwraped and e.g. wrapped in an option instead.

Recently the optional keyword was also introduced in proto3 as a way to track presence. The wire protocol represents the optional fields as a single case oneof but generators can unwrap the oneof and represent the field in a way that is idiomatic to their programming language.

I propose that we make the following changes to the generator:

  1. Non-optional scalars will no longer be wrapped in option since there is no way to know if a value was set intentionally or is just set to the default value.
  2. Unwrap the optional fields single case union and represent them as a option type
  3. (Optional) Add support for the well-known wrappers type

    What do you think @Arshia001 ?

Arshia001 commented 3 years ago

I'm generally unaware of the spec, but what I do know is that we can detect the presence of scalar fields very simply. After all, the binary message either includes the field or it doesn't.

Supporting the well-known wrappers types is a great idea. I'd actually have done it, had I not been in a hurry to finish the code generator. However, I see no reason why we should complicate matters more than need be; we can detect presence of scalars and we can represent that presence easily. Why shouldn't we do it?

I'm going to look into the C# code generator and see what it does.

Arshia001 commented 3 years ago

It doesn't do very nice things... This is my .proto:

message Inner {
    int32 int = 1;
}

message TestFieldPresence {
    string str = 1;
    Inner inner = 2;
}

And here's a few lines from the generated code:

    private string str_ = "";
    [global::System.Diagnostics.DebuggerNonUserCodeAttribute]
    public string Str {
      get { return str_; }
      set {
        str_ = pb::ProtoPreconditions.CheckNotNull(value, "value");
      }
    }

      if (Str.Length != 0) {
        output.WriteRawTag(10);
        output.WriteString(Str);
      }

      if (Str.Length != 0) {
        size += 1 + pb::CodedOutputStream.ComputeStringSize(Str);
      }

      if (other.Str.Length != 0) {
        Str = other.Str;
      }

For the life of me, I can't understand why they did it this way. The simplest solution would have been to add a ***Present flag for each field. Using language defaults to detect presence though? Ugh.

Anyway, I still don't see why we shouldn't wrap everything in options. Can you think of a scenario where this would cause problems?

Arshia001 commented 3 years ago

One more observation: This behavior of checking for presence via language defaults is built into the code generators. The binary protocol actually supports writing and reading empty or zero values, as evidenced by this code:

    let i = { Inner.empty() with Int = ValueSome 0 }
    use str = new System.IO.MemoryStream()
    i.WriteTo(str)
    str.Position <- 0L
    let i = Inner.Parser.ParseFrom(str)
    printfn "%O" i.Int // prints 0
purkhusid commented 3 years ago

If I'm interpreting the docs here correctly, then the proto3 wire format can not distinguish between a default value (e.g. 0 for int32) and a user setting the field to the default value.

This would mean that if I have the following message

message Test {
    int32 test_int = 0;
}

And set the test_int field to 0 and then send it over the wire, then the recipient would end up with ValueNone instead of ValueSome(0).

Are you saying that it is not like that? If we can actually discern between default values and intentionally set values, then there is no need to change anything. I definitely prefer having all fields as options than not.

I just started looking into this since I had to use the Rust proto implementation(https://github.com/tokio-rs/prost) and even though Rust supports options they do not wrap scalar types in options.

purkhusid commented 3 years ago

I can look into sending a message from e.g. the Rust or Go generator and then deserialize it with the F# generator and see how the default values work between the implementations. Can look into it after the weekend.

purkhusid commented 3 years ago

I also figured that this ruins interop with languages like Rust where there is no null and therefore when you construct a proto message you have to set the default values since you can't skip fields in the message.

purkhusid commented 3 years ago

I updated #8 to include unwrapping of proto3 optional types.

purkhusid commented 3 years ago

I confirmed that if I create a proto message in go and serialize it and then deserialize it in F# then I will always get ValueNone even if I set the field to the default value on the go side.

To make this generator play nicely with other generators we will have to conform to the spec and have the same default values as specified here: https://developers.google.com/protocol-buffers/docs/proto3#default

I've got a PR that is based on #8 that changes the default values: #12 It includes all the changes from this PR as well since I didn't want to conflict if you want to accept both

Arshia001 commented 3 years ago

Thank you. I will merge these changes eventually, but you have a lot of code in there, and reviewing it can take quite some time...

siwatanejo commented 2 years ago

Hey guys, I wonder if this project can be simplified (and as a consequence might have fewer bugs like this) by taking a dependency on https://github.com/grpc/grpc-dotnet ? I know that proj is very C#-centric, but you could ignore the code generation part and use the rest of the bits?

purkhusid commented 2 years ago

This project would not change very much in complexity since the only thing that is done on our side is code generation and creating a F# friendly wrapper for the previous incarnation of C# gRPC: https://www.nuget.org/packages/Grpc.Core

I want to migrate this project to use grpc-dotnet in the not so far future but don't have the capacity to do so at the moment and I would like to get https://github.com/Arshia001/FSharp.GrpcCodeGenerator/pull/12 merge first.

I'm hoping @Arshia001 shows up at some point to review the changes. I'm also up for being added as a co-maintainer if @Arshia001 would agree to that. My company uses this library quite a lot in production and I would love to upstream all the patches we do to it.

Arshia001 commented 2 years ago

Closed in #16.