ahamez / protox

A fast, easy to use and 100% conformant Elixir library for Google Protocol Buffers (aka protobuf)
MIT License
269 stars 19 forks source link

[QUESTION] Enums implementation #81

Closed jacek-adamek closed 3 years ago

jacek-adamek commented 3 years ago

Right now enums are implemented in a way which allows delivering an arbitrary value, out of the expected set which makes them a bit less reliable than they could be. What I mean is that in the generated code the decode and encode functions have clauses which accept all values and return them as they are. For example:

...
@spec encode(atom()) :: integer() | atom()
[
  (
    def(encode(:DISABLED)) do
      0
    end

    def(encode("DISABLED")) do
      0
    end
  ),
  (
    def(encode(:ENABLED)) do
      1
    end

    def(encode("ENABLED")) do
      1
    end
  )
]

def(encode(x)) do
  x
end
...

@spec decode(integer()) :: atom() | integer()
[
  def(decode(0)) do
    :DISABLED
  end,
  def(decode(1)) do
    :ENABLED
  end
]

def(decode(x)) do
  x
end
...

Wouldn't be better if we either got rid of these last clauses or raised exceptions from them?

ahamez commented 3 years ago

Hello! Good question :-) In fact, we have to do this to support forward compatibility. Imagine you have an application that has a more recent version of the protobuf definition, with a new entry in the enum. Then, an application with an older version of the definition (thus without the new entry) must be able to parse the new entry without an error. Does it answer your question?

jacek-adamek commented 3 years ago

Ha, this makes sense, you're right. But I am still wondering if we really need this last clause for the encode function? If we got rid of it then, when you create a new message, you could be sure that you use one of the expected values. And if we kept the last clause of the decode function, the receiver could parse it without any errors. Wouldn't that be forward compatible?

ahamez commented 3 years ago

It's the same thing with encode: any unknown enum entry that is received should be sent as is, without tempering with its content (this is the case if your app with the old definition has to transfer the message to another app). I hope this is clear enough πŸ™‚! I think I'll add a note about this in the README.

jacek-adamek commented 3 years ago

Thanks for you patience and quick responses. I really appreciate it! ❀️

I've just played a bit with Protox in a repl and realised where my problem lies. That's probably isn't about encode or decode but about json serialisation. Maybe I give a bit of context why I asked these questions at all. I have some proto messages which are already in production and I'd like to avoid a situation when I deliver a wrong value just because of a typo I made. Besides, I've noticed that serialising to proto and json formats behave a bit differently. This is an example coming from the production code, although a bit obfuscated:

First case, I deliver a correct value:

iex(15)> %Entity{status: :ENABLED} |> Entity.json_encode
{:ok, ["{", ["\"status\"", ":", "\"ENABLED\""], "}"]}
iex(16)> %Entity{status: :ENABLED} |> Entity.encode
{:ok, [[], "0", <<1>>]}

Both versions return sensible results and I am happy. Second case, I deliver incorrect value:

iex(17)> %Entity{status: :YOGI} |> Entity.json_encode
{:ok, ["{", ["\"status\"", ":", "\"YOGI\""], "}"]}
iex(18)> %Entity{status: :YOGI} |> Entity.encode
{:error,
 %Protox.EncodingError{
   field: :status,
   message: "Could not encode field :status (invalid field value)"
 }}

And here I see some kind of inconsistency, since the json version accepts incorrect values. This worries me a bit, since I use the json serialisation even more often than the binary one. To make things even worse, when I try to decode this wrongly created json I get an error as well:

iex(29)> {:ok, iodata}  = %Entity{status: :YOGI} |> Entity.json_encode
{:ok, ["{", ["\"status\"", ":", "\"YOGI\""], "}"]}
iex(30)> iodata |> Entity.json_decode
{:error,
 %Protox.JsonDecodingError{
   message: "Could not decode JSON payload: YOGI is not value of Elixir.ProtobufSchema.EntityStatus"
 }}

As you can see the problem with the incorrectly formed json messages arises only when it is deserialised, which seems a bit too late. I believe, the sender should know that a message includes incorrect data and for that reason shouldn't be sent at all.

I don't know if we can do anything about that, but I hope I managed to point out some potential problems relating to the json serialisation (binary one works as expected) πŸ™‚

ahamez commented 3 years ago

Aah, yes, I feel the same… It fails in the binary version is that because there's no way of converting :YOGI to an integer (enum are transmitted as integers in the binary format). However, the JSON specification expects the enum entry names to be outputted as string (thus ENABLED rather than 0). This means that YOGI could have been received from an app with a more recent protobuf definition. And thus, this value must be kept when serializing back to JSON.

ahamez commented 3 years ago

But! I realize that you have :YOGI (the atom) and not "YOGI" (the string) in %Entity{status: :YOGI}, which is not exactly the same scenario as I describe above. I guess it's possible to check that this atom doesn't belong the enum, I'll see what I can do!

ahamez commented 3 years ago

So, it seems it was quite easy to implement πŸ™‚. Can you try the branch dont_json_encode_wrong_enum_entries and tell me if it works as expected for you?

jacek-adamek commented 3 years ago

Great, works as expected πŸ™‚

iex(4)> %Entity{status: :ENABLED} |> Entity.encode
{:ok, [[], "p", <<1>>]}
iex(5)> %Entity{status: :ENABLED} |> Entity.json_encode
{:ok, ["{", ["\"status\"", ":", "\"ENABLED\""], "}"]}
iex(6)> %Entity{status: :YOGI} |> Entity.json_encode
{:error,
 %Protox.JsonEncodingError{
   message: "Could not encode to JSON: invalid enum entry value"
 }}

However, there is one thing which must be pointed out: this change isn't backward compatible which means all elixir files must be regenerated. Otherwise an exception will be risen:

%Entity{status: :ENABLED} |> Entity.json_encode
** (UndefinedFunctionError) function EntityStatus.has_constant?/1 is undefined or private. Did you mean one of:

      * constants/0

Of course, it refers only to those users who use an option of generating elixir files.

ahamez commented 3 years ago

Thank you for your feedback! I'll release a new version as soon as possible.