coleifer / peewee

a small, expressive orm -- supports postgresql, mysql, sqlite and cockroachdb
http://docs.peewee-orm.com/
MIT License
11.17k stars 1.37k forks source link

Deleting items from table with JOIN in a query #2906

Closed Pennywise007 closed 4 months ago

Pennywise007 commented 4 months ago

Hi, I got 2 tables MainTable and SecondaryTable and they connected like

class SecondaryTable(AbstractTable):
    id: AutoField()
    link: Union[Maintable, ForeignKeyField] = ForeignKeyField(model=MainTable, null=True)
   ....

I want to remove items from the MainTable which is not presented in the SecondaryTable. For getting a list of rows to delete I have a query:

query = (MainTable
                .select()
                .where(MainTable.created_at <= int(expiration_time.timestamp()))
                .join(SecondaryTable, JOIN.LEFT_OUTER, on=(MainTable.id == SecondaryTable.link))
                .where(SecondaryTable.id >> None))

I was sure that just replacing 'select` -> 'delete' should be sufficient. But it is complaining about 'join' statement.

I also tried to MainTable.delete().where(MainTable.id.in_(query)).execute() But I receive an error: "You can't specify target table 'maintable' for update in FROM clause"

I tried to do an old fashioned way with: ids_to_delete = [item.id for item in query] but in my case this table got million rows to delete and my lambda where I run it got a timeout error.

I can increase a timeout or iterate over ids and remove them 1 by 1 but it doesn't sounds like a production solution. In ideal world I want to remove items in a single request. Can you please help me with solving this issue?

coleifer commented 4 months ago

For one, DELETE with JOIN is a MySQL only thing, and we don't currently support that syntax. So you will have to use a subquery or a common-table expression. MySQL additionally has the limitation you mentioned. You can see some additional info here: https://stackoverflow.com/questions/45494/mysql-error-1093-cant-specify-target-table-for-update-in-from-clause

It looks like whether using a CTE is supported depends on the MySQL you are using so that may or may not work.

For example this works just fine in Sqlite or Postgres:

for i in range(10):
    m = Main.create(name='m%s' % i)
    if i % 2 == 0:
        Second.create(link=m)

cte = (Main
       .select(Main.id)
       .join(Second, JOIN.LEFT_OUTER)
       .where(Second.id.is_null())
       .cte('outer'))

subq = Select((cte,), (cte.c.id,))

(Main.delete()
 .where(Main.id.in_(subq))
 .with_cte(cte)
 .execute())

MariaDB 11.3 however complains of a syntax error, presumably because WITH...DELETE is not supported.

That said, the following seems to work fine in MariaDB 11.3:

subq = (Main
        .select(Main.id)
        .join(Second, JOIN.LEFT_OUTER)
        .where(Second.id.is_null()))

(Main.delete()
 .where(Main.id.in_(subq))
 .execute())
Pennywise007 commented 4 months ago

I still reach the timeout with 1st approach and am not fully sure that it works... The second approach is the same as the code which I mentioned MainTable.delete().where(MainTable.id.in_(query)).execute() and doesn't work for me.

Any other options? I think this one works:

query = MainTable
                .select()
                .where(MainTable.created_at <= int(expiration_time.timestamp()))
                .join(SecondaryTable, JOIN.LEFT_OUTER, on=(MainTable.id == SecondaryTable.link))
                .where(SecondaryTable.id >> None).delete_instance(recursive=True)

At least it was of my experiments, but it removes only 1 instance. If it can remove 1 maybe it can remove all somehow?

coleifer commented 4 months ago

You're probably best off just writing the query by hand honestly and using db.execute_sql(...).

Pennywise007 commented 4 months ago

If I can do it through the db and can't do it with your project, it means that this issue can take place.

coleifer commented 4 months ago

That said, the following seems to work fine in MariaDB 11.3:


subq = (Main
.select(Main.id)
.join(Second, JOIN.LEFT_OUTER)
.where(Second.id.is_null()))

(Main.delete() .where(Main.id.in_(subq)) .execute())