myyang / django-pb-model

Protobuf mixin for django model
Other
112 stars 22 forks source link

[Proposal] Allow Protobuf message as a Django Field #26

Closed SilvioMachado closed 4 years ago

SilvioMachado commented 4 years ago

Hey, while working with Kafka and the Outbox Pattern, i ended up working with a message that had Any type as one field. I used this library to make that message into a Django Model and i made the code be able to save and load from database. For this case and maybe others in the future it may be beneficial have something already implemented in the library.

E.g.

message Event { 
  Any payload = 1;
  ...
}

In order to accomplish this i'm proposing:

myyang commented 4 years ago

Hi, does custom field meet your requirements? 🤔

after reading your proposal, I setup following sample to align view point of this lib.

message Event {
   Any anyfield1 = 1;
   Any anyfield2 = 2;
}

with corresponding models


def _anyfield1_to_pb(pb_obj, pb_field, dj_field_value):
      # pack to pb

def _anyfield1_from_pb(instance, dj_field_name, pb_field, pb_value):
    # pack from pb

class AnyField2(models.Field):
    """ mock field """"
    def to_custom_bytes(self):
        # return custom bytes
    def from_custom_bytes(self, bytes):
        # set custom bytes

def _anyfield2_to_pb(pb_obj, pb_field, dj_field_value):
    setattr(pb_obj, pb_field, dj_field_value.to_custom_bytes())

def _anyfield2_from_pb(instance, dj_field_name, pb_field, pb_value):
    f = AnyField2()
    f.from_custom_bytes(pb_value)
    setattr(instance, dj_field_name, pb_value)

class Event(ProtobufMixin, models.Model):
    pb_model = models_pb.Event

    pb_2_dj_field_serializers = {
         'anyfield1': (_anyfield1_to_pb, _anyfield1_from_pb),
         'anyfield2': (_anyfield2_to_pb, _anyfield2_from_pb),
    }

    anyfield1 = models.BinaryField()
    anyfield2 = AnyField2() 

According to above sample:

Create a custom Django Field subclassing BinaryField in order to be able to save Protobuffers.

Supported with pb_2_dj_field_serializers and corresponding func pairs.

Have a custom option on the Field to allow the developer to define the type to serialize the proto to, when getting the value from the database.

I'm not quite sure, is this same as the pb_2_dj_field_serializers' keys utility ?

Have custom Serializers to make from_pb and to_pb to work.

Supported with pb_2_dj_field_serializers and corresponding func pairs.

do I misunderstand ? or any thoughts about this?

SilvioMachado commented 4 years ago

Hey, thanks for such a fast response. I really hope i'm not misunderstanding something here, but, i did do something close to your snippet. The custom fields enable me to serialize a field into another model (like you seem to have done to anyField2), which is good, but i chose to left the field it as a Proto.

My use case is like the anyField1 you have there, is an Any type field on the Proto message, it's a BinaryField on the Django model declaration, but when saving, it raises TypeError: can't escape Any to binary.

What i have done is to take matters of escaping to and from bytes myself:

class ProtobufAnyField(BinaryField):
    def pre_save(self, model_instance, add):
        anyfield1 = model_instance. anyfield1

        if not isinstance(anyfield1, Any):
            raise ValueError('Field `anyfield1 ` is not of type google.rpc.any_pb2.Any')

        return payload.SerializeToString()

    def _parse_proto(self, value):
        _any = Any()
        _any.ParseFromString(value)
        return _any

    def from_db_value(self, value, expression, connection):
        # Allow value to be None
        if value is None:
            return value

        return self._parse_proto(value)

In my example, if anyone tries to access myModel.anyField1 they'll get a proto instance. Also the example is enforcing Any but this can be a free choice.

myyang commented 4 years ago

Ah, I see. Sounds good to support Any type.
So the final usage might look like:


from pb_model.fields import ProtobufAnyField

def save_any_to_db():
    # to db column

def load_any_from_db():
    # from db record

def _anyfield_to_pb(pb_obj, pb_field, dj_field_value):
      # pack to pb

def _anyfield_from_pb(instance, dj_field_name, pb_field, pb_value):
    # pack from pb

class AnyExample(models.Model):
    pb_model = models_pb2.AnyExample

    pb_2_dj_field_serializers = {
         'anyfield': (_anyfield_to_pb, _anyfield_from_pb),
    }
    anyfield = ProtobufAnyField(
        to_db=save_any_to_db, 
        from_db=load_any_from_db, 
        return_type=AnyField.Return.[Any|Bytes],
    )

By above example, the any field should be converted smoothly among django, db and protobouf.


I have 2 questions about AnyField.

  1. Since auto-gen is supported, the default action of ProtobufAnyField fallbacks to BinaryField is a good idea?
  2. I haven't used Any type in protobuf, does repeated Any field = 1; or related compound type should be considered now?
SilvioMachado commented 4 years ago

Yes the final usage you declared there is almost how i see it, i just think that the to_db and from_db does not need to be open to the public, or at least they should have a default unless someone wants to override it.

While working on this. the only use case that seems to make sense is to use this for the Any type, so i don't know if allowing the developer to decide on a return type should be allowed, unless you just make him decide on receiving Any or just the Bytes of Any.

now:

  1. I think making BinaryField as default is a good idea since storing and transporting protobuffers are supposed to be in bytes and their serialization is very performatic.
  2. I think repeated is something that we may need to add as well, it does seem like a logical and obvious evolution of this proposal, but just for starters we may not support repeated to simplify things, and do that on a follow up.
myyang commented 4 years ago

i just think that the to_db and from_db does not need to be open to the public, or at least they should have a default unless someone wants to override it.

Yes, that's why I ask the question 1, the demo just shows flexibility.

While working on this. the only use case that seems to make sense is to use this for the Any type, so i don't know if allowing the developer to decide on a return type should be allowed, unless you just make him decide on receiving Any or just the Bytes of Any.

Hmm... you're right. I think I overdesign the field. If someone needs bytes type, use BinaryField or just ignore the any field while conversion if not used rather than return any type. And so does the compound type. Let's make this field simple as well for first implementation.


If you already have implementation or prototype for your use case, PR is welcome 😄

SilvioMachado commented 4 years ago

Hey @myyang just created the Pull Request #28