myyang / django-pb-model

Protobuf mixin for django model
Other
112 stars 22 forks source link

Double save #32

Closed akionakamura closed 2 years ago

akionakamura commented 2 years ago

We use Debezium to capture data changes from our DB (Postgres). With that, we detected that every insert actually runs one insert and one update operation. Looking at the source code, we have found that there are two super.save calls:

    def save(self, *args, **kwargs):
        super(ProtoBufMixin, self).save(*args, **kwargs)
        for m2m_field in self._meta.many_to_many:
            if issubclass(type(m2m_field), fields.ProtoBufFieldMixin):
                m2m_field.save(self)
        kwargs['force_insert'] = False
        super(ProtoBufMixin, self).save(*args, **kwargs)

This hurts performance a bit. So I have two questions:

  1. What was the historical motivation to add this two save calls? Perhaps this is a bug and we only need to have another call in case there is m2m_field saved?
  2. If there is any other reason, can we find another work around?
myyang commented 2 years ago

Hi @akionakamura,

Thanks for your information.

  1. What was the historical motivation to add this two save calls?

Although I haven't work with Django for these years, the purpose of 2nd save() is used to refresh data from database. While I was working on the project, I found following situations:

# 1
def save(self, *args, **kwargs):
        for m2m_field in self._meta.many_to_many:
            if issubclass(type(m2m_field), fields.ProtoBufFieldMixin):
                m2m_field.save(self)   # problem: raise no reference key
        super(ProtoBufMixin, self).save(*args, **kwargs)

# 2
def save(self, *args, **kwargs):
        super(ProtoBufMixin, self).save(*args, **kwargs)
        for m2m_field in self._meta.many_to_many:
            if issubclass(type(m2m_field), fields.ProtoBufFieldMixin):
                m2m_field.save(self)
         # problem: m2m field  is not up-to-dated

So I just naively added the 2nd save() for refreshing then.

Perhaps this is a bug and we only need to have another call in case there is m2m_field saved?

I'm not sure about the mechanism of current Django version. If the situation # 2 above is no longer existed, then it should be a bug.

  1. If there is any other reason, can we find another work around?

Maybe overwrite it in subclass if this is critical issue to your business now 😢 I might need some time to rebuild the environment at local and go rough the problem.. If you could help, PR is always welcome 👍

akionakamura commented 2 years ago

Hi @myyang, thanks for the answer!

If I remember correctly, we have run into one-to-many/many-to-many lose reference when batching saving before. I'll investigate more and see if the solution also applies here.