tortoise / tortoise-orm

Familiar asyncio ORM for python, built with relations in mind
https://tortoise.github.io
Apache License 2.0
4.37k stars 355 forks

refactor: optimize m2m add logic #1620

Closed: waketzheng closed this 1 month ago

waketzheng commented 1 month ago

Description

Fixes a TODO item and makes the code clearer.

Motivation and Context

The pk_formatting_func function only needs to be called once.
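As a rough illustration of the motivation above, the sketch below contrasts formatting an owner key on every loop iteration with formatting it once before the loop. It is not the code from this pull request: `rows_formatting_each_time`, `rows_formatting_once`, and `CountingFormatter` are hypothetical names invented for the example, and the formatter is a stand-in, not tortoise-orm's actual pk_formatting_func.

```python
from typing import Callable, List, Tuple


def rows_formatting_each_time(
    owner_pk: int, related_pks: List[int], fmt: Callable[[int], str]
) -> List[Tuple[str, str]]:
    # The owner's key is re-formatted for every related key.
    return [(fmt(owner_pk), fmt(pk)) for pk in related_pks]


def rows_formatting_once(
    owner_pk: int, related_pks: List[int], fmt: Callable[[int], str]
) -> List[Tuple[str, str]]:
    # The owner's key is formatted a single time, outside the loop.
    owner_value = fmt(owner_pk)
    return [(owner_value, fmt(pk)) for pk in related_pks]


class CountingFormatter:
    """Hypothetical stand-in for a pk formatter that records how often it runs."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, pk: int) -> str:
        self.calls += 1
        return str(pk)


if __name__ == "__main__":
    for build in (rows_formatting_each_time, rows_formatting_once):
        fmt = CountingFormatter()
        build(1, [10, 20, 30], fmt)
        print(build.__name__, "formatter calls:", fmt.calls)
```

Both helpers return the same rows; for n related keys the first calls the formatter 2n times and the second n + 1 times.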

How Has This Been Tested?

make ci


abondar commented 1 month ago

Although this refactoring seems good to me and I am okay with accepting it as is, I am not sure whether anything changed regarding the concern in the TODO.

As I understand it, the TODO was about needing indexes on the m2m through table; after the refactoring the table is still the same, and there is no unique index on it.
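To make the index concern concrete, here is a minimal sketch using Python's standard sqlite3 module rather than tortoise-orm. The table, column, and index names (`event_team`, `event_id`, `team_id`, `uidx_event_team`) are assumptions for illustration and are not claimed to match the schema tortoise-orm generates. It shows how a unique index on the pair of foreign keys lets the database itself reject duplicate links.

```python
import sqlite3


def make_through_table(conn: sqlite3.Connection) -> None:
    # Hypothetical through table; the names are illustrative only.
    conn.execute(
        "CREATE TABLE event_team ("
        "event_id INTEGER NOT NULL, team_id INTEGER NOT NULL)"
    )
    # The unique index makes each (event_id, team_id) pair appear at most once.
    conn.execute(
        "CREATE UNIQUE INDEX uidx_event_team ON event_team (event_id, team_id)"
    )


def add_link(conn: sqlite3.Connection, event_id: int, team_id: int) -> None:
    # With the unique index in place, a repeated pair is skipped by SQLite.
    conn.execute(
        "INSERT OR IGNORE INTO event_team (event_id, team_id) VALUES (?, ?)",
        (event_id, team_id),
    )


def count_links(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM event_team").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    make_through_table(conn)
    add_link(conn, 1, 1)
    add_link(conn, 1, 1)  # duplicate pair, ignored thanks to the unique index
    add_link(conn, 1, 2)
    print(count_links(conn))  # prints 2
```

Without such an index, avoiding duplicates would require checking for existing rows before inserting, which is presumably the kind of overhead the TODO was pointing at.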

coveralls commented 1 month ago

Pull Request Test Coverage Report for Build 9236186430

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details


| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| tortoise/fields/data.py | 15 | 94.69% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 9227277964 | 0.001% |
| Covered Lines | 5772 |
| Relevant Lines | 6471 |

waketzheng commented 1 month ago

> Although this refactoring seems good to me and I am okay with accepting it as is, I am not sure whether anything changed regarding the concern in the TODO.
>
> As I understand it, the TODO was about needing indexes on the m2m through table; after the refactoring the table is still the same, and there is no unique index on it.

@abondar done.

```python
import time
from typing import Callable

from tortoise import Tortoise, fields, models, run_async


def timeit(func) -> Callable:
    # Print the wall-clock time (in seconds) taken by the wrapped coroutine.
    async def runner(*args, **kw) -> None:
        start = time.time()
        await func(*args, **kw)
        end = time.time()
        print(func.__name__, "Cost:", round(end - start, 1))

    return runner


class Team(models.Model):
    id = fields.IntField(pk=True)
    name = fields.CharField(20)


class Event(models.Model):
    id = fields.IntField(pk=True)
    name = fields.CharField(20)
    teams = fields.ManyToManyField("models.Team")


async def main() -> None:
    await Tortoise.init(db_url="sqlite://:memory:", modules={"models": ["__main__"]})
    await Tortoise.generate_schemas()
    await _test()


@timeit
async def _test() -> None:
    n = 100_000
    instances = []
    # Bulk-create n teams and n events, then read them back.
    for m in (Team, Event):
        objs = [m(id=i + 1, name=i) for i in range(n)]
        await m.bulk_create(objs)
        instances.append(await m.all())
    # Link each event to one team through the m2m relation.
    for t, e in zip(*instances):
        await e.teams.add(t)
        await e.save()


if __name__ == "__main__":
    run_async(main())
```

The script creates 100,000 records each for team, event, and event_teams, and prints the elapsed time in seconds.

Output of branch develop:
_test Cost: 287.6

Output of branch refactor-m2m-add:
_test Cost: 66.7
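For a quick sense of scale, the two timings reported above can be compared directly. This small sketch only does arithmetic on those two numbers (taken as seconds, since they come from `time.time()` differences); the variable names are made up for the example.

```python
# Timings reported above, in seconds.
develop_seconds = 287.6
refactor_seconds = 66.7

speedup = develop_seconds / refactor_seconds
reduction = 1 - refactor_seconds / develop_seconds

print(f"speedup: {speedup:.1f}x, time saved: {reduction:.0%}")
# prints "speedup: 4.3x, time saved: 77%"
```

These figures come from a single run on one machine with an in-memory SQLite database, so they are best read as indicative rather than as a general benchmark.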

To reproduce:
```bash
git clone git@github.com:waketzheng/tortoise-orm
cd tortoise-orm
poetry shell
make deps
python main.py
git checkout refactor-m2m-add
make deps
python main.py