maxtepkeev / architect

A set of tools which enhances ORMs written in Python with more features
Other
391 stars 57 forks source link

ERROR: relation already exists #1

Closed dn0 closed 9 years ago

dn0 commented 9 years ago

I had two cron jobs, both inserting some data into a postgres DB at the exact same time. One of them always failed to insert the data, because of a relation already exists error. I think that it makes sense to catch the duplicate_table exception and always try to insert the data, but I did not test the proposed patch.

coveralls commented 9 years ago

Coverage Status

Changes Unknown when pulling b35587f1cb4a38de7a0e1b29ec55e8f9872582a4 on dn0:master into \ on maxtepkeev:master**.

maxtepkeev commented 9 years ago

Hi!

Thanks for the report and the solution, but... this is a quote from PostgreSQL docs:

A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one. Therefore, don't use EXCEPTION without need.

So what about just changing CREATE TABLE to CREATE TABLE IF NOT EXISTS. Have you tried to do it this way ? If not can you please test it on your cron scripts and report back with results.

Thanks.

dn0 commented 9 years ago

Hello,

the CREATE TABLE IF NOT EXISTS solution sounds great (albeit only available in postgres>=9.1), but unfortunately it does not work. It is easy to replicate the problem by hand:

$ cd project

$ cat app/models.py
from django.db import models
from architect.orms.django.mixins import PartitionableMixin

# Create your models here.
class YourModelName(PartitionableMixin, models.Model):
    txt = models.TextField()
    created = models.DateTimeField(auto_now_add=True)

    class PartitionableMeta:
        partition_type = 'range'
        partition_subtype = 'date'
        partition_range = 'day'
        partition_column = 'created'
$ ./manage.py syncdb
Creating tables ...
...

$ architect partition --module app.models
architect partition: result: successfully (re)configured the database for the following models: YourModelName
$ python -c "from app.models import YourModelName; print(YourModelName.objects.values_list('txt', flat=True))"
[]
$ python -c "from app.models import YourModelName; x = YourModelName(txt='first'); x.save()" & \
> python -c "from app.models import YourModelName; x = YourModelName(txt='second'); x.save()"
[1] 32459
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/dano/envs/lib/python2.6/site-packages/architect/orms/django/mixins.py", line 52, in save
    super(PartitionableMixin, self).save(*args, **kwargs)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/models/base.py", line 545, in save
    force_update=force_update, update_fields=update_fields)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/models/base.py", line 573, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/models/base.py", line 654, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/models/base.py", line 687, in _do_insert
    using=using, raw=raw)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/models/manager.py", line 232, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/models/query.py", line 1514, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/models/sql/compiler.py", line 903, in execute_sql
    cursor.execute(sql, params)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/backends/util.py", line 69, in execute
    return super(CursorDebugWrapper, self).execute(sql, params)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/backends/util.py", line 53, in execute
    return self.cursor.execute(sql, params)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/utils.py", line 99, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/home/dano/envs/lib/python2.6/site-packages/django/db/backends/util.py", line 53, in execute
    return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: relation "app_yourmodelname_y2014d218" already exists
CONTEXT:  SQL statement "CREATE TABLE IF NOT EXISTS app_yourmodelname_y2014d218 (
                        CHECK (
                            created >= '2014-08-06 00:00:00'::timestamp with time zone AND
                            created < '2014-08-07 00:00:00'::timestamp with time zone
                        ),
                        LIKE "app_yourmodelname" INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES
                    ) INHERITS ("app_yourmodelname");"
PL/pgSQL function app_yourmodelname_insert_child() line 17 at EXECUTE statement

[1]+  Exit 1                  python -c "from app.models import YourModelName; x = YourModelName(txt='first'); x.save()"
$ python -c "from app.models import YourModelName; print(YourModelName.objects.values_list('txt', flat=True))"
[u'second']

The thing about performance is that the current design of architect partitioning is generally not very suitable for tables where heavy writes are expected. The triggers will always come with some performance penalty. It is better to use some cron system to create partitions on tables with lots of writes, so that the triggers can be omitted. But I think that using architect on tables with not so many INSERTs/UPDATEs should be OK, even with the EXCEPTION clause.

maxtepkeev commented 9 years ago

It's very disappointing that CREATE TABLE IF NOT EXISTS doesn't work. I also found an explanation about why it doesn't work, in case you are interested. It looks like that your solution is the only one that really works, so I'm going to accept your pull request. Thank you very much for the detailed report and examples to reproduce the problem.

Speaking about architect performance, yes, you're absolutely right about trigger overhead, but I decided to use them in the first place because it's the only one really fully automatic way to create partitions. I definitely agree with you about using cron system for tables with a lot of write operations and I have plans to add this option to the architect in the future.