ctaggart / froto

Froto: F# Protocol Buffers
MIT License
146 stars 27 forks source link

UnknownFields in the serializer seems to be unused, can this be removed or is it unimplemented? #84

Closed 7sharp9 closed 6 years ago

jhugard commented 6 years ago

Not using the UnknownFields list in the Serializer tests was an oversight. The intention with this list is to allow pass-thru of fields that were added after the code was written, in order to survive protocol evolution.

So, the deserializer will record such fields on UnknownFields, then the generated serializer should be written to reintroduce those fields to the binary output, preserving the complete and possibly modified message (if not original order, which IIRC is OK per the spec).

E.g., these lines should probably be generated as:

            static member Serializer (m, zcb) =
                (m.id            |> Encode.fromVarint 1) >>
                (m.name          |> Encode.fromString 2) >>
                (m._unknownFields|> Encode.fromRawFields)
                <| zcb

Note the introduction of (m._unknownFields|> Encode.fromRawFields).

7sharp9 commented 6 years ago

So all the unknowns are at the end of the stream, ok.

Im not sure when yet, but I could add this to the records serialization test and serialization.fs file at some point.

7sharp9 commented 6 years ago

Its a pity that found and unknown fields have to be part of the record definition it makes the construction of the port record a little more tedious:

e.g. It would be nice to not have _unknownFields and _foundFields I suppose theres no way around this with the record implementation in F#.

let bundle = { martId = 42
               member_id = "memberId"
               channel_type = "myChannel"
               skus =  ["retailsku1"; "retailsku2"]
               _unknownFields = []
               _foundFields = Set.empty }
jhugard commented 6 years ago

Im not sure when yet, but I could add this to the records serialization test and serialization.fs file at some point.

Cool. See the proto3 and proto2 language guides for notes about handling unknown fields.

Also note that while the example Serializer code does preserve order of repeated fields, it does not preserve order between different fields: it always writes in the order defined in the Serializer function (generally ascending field number order). Pretty sure this is consistent with other protobuf implementations.

jhugard commented 6 years ago

Its a pity that found and unknown fields have to be part of the record definition it makes the construction of the port record a little more tedious

You could also use Default, which allows the record to be expanded later without necessitating code changes (the new fields would use defaults). But then the compiler won't point out where your code needs to be updated to handle those new fields, either:

let bundle = { Bundle.Default with
                 martId = 42
                 member_id = "memberId"
                 channel_type = "myChannel"
                 skus =  ["retailsku1"; "retailsku2"] }
jhugard commented 6 years ago

e.g. It would be nice to not have _unknownFields and _foundFields I suppose theres no way around this with the record implementation in F#.

Well, if your codegen had an option to generate proto3 specific serdes, then you could drop both: preserving unknown fields isn't mandatory for proto3, and proto3 does not support the required keyword, so _foundFields isn't needed.

Probably obvious, but then:

jhugard commented 6 years ago

Not using the UnknownFields list in the Serializer tests was an oversight.

Update: the ExampleProtoRecord.fs does include serialization of unknown fields. See here (I thought I had an example of that! GitHub search fail.).

7sharp9 commented 6 years ago

Yeah I did write the copy and update method with the record, it just feels a bit clunky thats all.

Im only really interested in the proto3 version to be honest, I think one of the C# implementations dropped all the proto2 parts for simplicity.

Im trying to thinks of a way of removing the SRTP constraints so that the RememberFound, FoundFields and UnknownFields are no longer required in the record definition.

7sharp9 commented 6 years ago

I wonder if I could add a proto3 specific module for the serialization of proto3 only, i.e. remove the constraints. Alternatively the constrains could just be proto2/proto3 interfaces that could be used rather then SRTP constraints.

That way I could generate serialize/deserialize that were more compact and made just for proto3. Im just thinking out loud really as I can do all this in my fork but then it won't allow me to contribute back here.

jhugard commented 6 years ago

Proto3 files should/must have syntax='proto3'; near the start of the file and I'm certain I record this as part of the AST. You could gate codegen based on this to either provide UnknownFields (and Required Fields) support, or not, based on that flag.

7sharp9 commented 6 years ago

ok, let me put something together.

7sharp9 commented 6 years ago

Technically proto2 messages can be constructed with no UnknownFields as there is no constraint to require this, its entirely up to the implementor or codegen.

7sharp9 commented 6 years ago

So the serializer method would have to change to something like this to enforce it:

    /// Serialize message into a ZeroCopyBuffer.
    let inline toZeroCopyBuffer m zcb =
        (^msg : (static member Serializer : ^msg * ZeroCopyBuffer -> ZeroCopyBuffer) (m, zcb))
        let unknownFields = (^msg : (static member UnknownFields: ^msg -> List<RawField>) (m) )
        Encode.fromRawFields unknownFields zcb 
jhugard commented 6 years ago

Fixed in commit ca65a0f6974b4ee9e4f0c143e620428f6b9e7e82