Closed (jerch closed this 2 years ago)
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| computedfields/resolver.py | 36 | 40 | 90.0% |
| computedfields/helper.py | 15 | 20 | 75.0% |
| Total: | 70 | 79 | 88.61% |

| Totals | |
|---|---|
| Change from base Build 2169488544: | -0.9% |
| Covered Lines: | 1109 |
| Relevant Lines: | 1153 |
The querysize is a rather ugly issue and needs to be applied in a bigger scope - basically any `update_dependent` call might pull a really big queryset deep down in the deps tree, consuming GBs of RAM.

Example: This can already be provoked on the dummy models in exampleapp, with 1000 records of `Foo` linking to 100 records of `Bar` each (100k total), again linking to 10 `Baz` each (1M total). Doing an `update_dependent(Foo.objects.all())` with all fields out of sync needs ~2GB RAM in Python with sqlite. Note that the huge memory usage mainly comes from the `select_related` joins, but without these the performance drops badly.
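For reference, a rough way to provoke this on the exampleapp models - a hedged sketch, the concrete field names and the FK chain `Baz` -> `Bar` -> `Foo` are assumptions for illustration, only `update_dependent` is the real public API:

```python
# Sketch only: populate the exampleapp models as described above and
# trigger the expensive dependency update. Field names are made up.
from exampleapp.models import Foo, Bar, Baz
from computedfields.models import update_dependent

for i in range(1000):
    foo = Foo.objects.create(name=f'foo{i}')
    for j in range(100):
        bar = Bar.objects.create(foo=foo, name=f'bar{j}')
        Baz.objects.bulk_create(
            Baz(bar=bar, name=f'baz{k}') for k in range(10)
        )

# with all computed fields out of sync this pulls ~1M rows through
# select_related joins during the descent --> ~2GB RAM with sqlite
update_dependent(Foo.objects.all())
```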
Solution:

- batch on the `update_dependent` level with `qs.iterator(querysize)` (see the sketch after this list)
- change `bulk_updater` to a multi-side call digesting the batches, do the tree descent in `update_dependent` instead
- in `update_dependent`, apply the lowest querysize setting from the used cfs

Downside - this will sacrifice some performance in highly intersecting updates, now leading to repeated update attempts on depending records. Well, it's adjustable, so it's more an issue of lousy defaults.
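A minimal sketch of the batching idea (not the library code): stream the queryset and hand fixed-size batches to the updater instead of materializing everything at once. `handle_batch` and the hard-coded size are placeholders:

```python
def batched_update(qs, handle_batch, querysize=2000):
    """Feed fixed-size batches from qs to handle_batch (placeholder for
    the bulk_updater side), keeping at most `querysize` rows in memory."""
    batch = []
    # iterator() streams rows from the DB cursor instead of caching them
    for instance in qs.iterator(chunk_size=querysize):
        batch.append(instance)
        if len(batch) >= querysize:
            handle_batch(batch)
            batch = []
    if batch:
        handle_batch(batch)
```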
Hmm, `iterator()` is tricky to use, as it bypasses the queryset caching, which performs worse in deps chains - we have to evaluate the queryset multiple times, at least once more for pk extraction. This either needs a rewrite of the tree descent lookup in `_querysets_for_update`, or a cheaper route would use queryset slices instead (which also should preserve `prefetch` rules, not tested yet...).
Edit: The solution is to use both `iterator` and slice notation, the latter being used in case we have found a prefetch optimization. With the new `slice_iterator` and a reasonable `COMPUTEDFIELDS_QUERYSIZE` setting, RAM usage can now be controlled pretty nicely. There is still a chance for higher consumption in deeper nested dependencies (we have to hold the symbols during the descent).
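A hedged sketch of what such a `slice_iterator` helper could look like - the real one in computedfields/helper.py may differ, and `_prefetch_related_lookups` is a private queryset attribute used here only to detect a prefetch optimization:

```python
def slice_iterator(qs, size):
    """Yield instances in chunks of `size`, using iterator() when possible
    and slice notation when prefetch rules must be preserved."""
    if qs._prefetch_related_lookups:
        # slice notation keeps prefetch_related intact
        pos = 0
        while True:
            chunk = list(qs[pos:pos + size])
            if not chunk:
                break
            yield from chunk
            pos += size
    else:
        # plain querysets can stream directly from the cursor
        yield from qs.iterator(chunk_size=size)
```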
For the example above with `COMPUTEDFIELDS_QUERYSIZE=2000` (the default value), usage never goes higher than 80MB. On a side note - the runtime of the loop variant benefits most from the smaller queryset footprints, with all fields desync:
There is one more catch in the current loop variant - it saves the instance no matter whether the fields were in sync or not. This could be optimized the same way as `bulk_updater` does internally. The difference is rather huge when running `updatedata` with all fields in sync:

Comparing these numbers with the update numbers above, it seems that for `fast` the update writing took only ~20s, while most of the time was spent querying the database for old values. In this light `bulk` looks pretty bad (writing takes ~4 min); for `loop` we don't know the difference yet.
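The optimization could look roughly like this - a sketch under the assumption that the resolver exposes a `compute(instance, fieldname)` helper; the actual internal API may differ:

```python
from computedfields.resolver import active_resolver

def loop_update(qs, fields, querysize=2000):
    """Recompute cfs per instance, but only write when something changed."""
    for instance in qs.iterator(chunk_size=querysize):
        changed = []
        for field in fields:
            new_value = active_resolver.compute(instance, field)
            if getattr(instance, field) != new_value:
                setattr(instance, field, new_value)
                changed.append(field)
        if changed:
            # skip the write entirely if the record was already in sync
            instance.save(update_fields=changed)
```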
Performance-wise the holy grail for `updatedata` would be if it could figure out a BFS-flavored topsort on tables/model fields. For the example above, in fact all it would have to do is an `update_dependent(Foo.objects.all())`, which finishes in 00:38 (all desync) resp. 00:24 (all sync) with `fast`. But that's very tricky to get right (not even possible for all cases) and needs fundamental changes in the graph handling, thus it is left for a later exercise.
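For illustration, a BFS-flavored (Kahn style) topsort over a hypothetical model-field dependency mapping - the real graph handling in computedfields is more involved, this only shows the ordering idea:

```python
from collections import deque

def bfs_topsort(deps):
    """deps maps a node to the set of nodes it depends on."""
    indegree = {node: len(parents) for node, parents in deps.items()}
    dependents = {}
    for node, parents in deps.items():
        for parent in parents:
            dependents.setdefault(parent, set()).add(node)
            indegree.setdefault(parent, 0)
    # start with all nodes that depend on nothing, then peel layer by layer
    queue = deque(node for node, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in dependents.get(node, ()):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(indegree):
        raise ValueError('cyclic dependencies, no topsort possible')
    return order

# bfs_topsort({'Baz.cf': {'Bar.cf'}, 'Bar.cf': {'Foo.name'}, 'Foo.name': set()})
# --> ['Foo.name', 'Bar.cf', 'Baz.cf']
```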
The numbers above are for updating/cascading CFs only on selected models (equal to calling `for model in models: update_dependent(model.objects.all(), update_fields=cfs_on_model)`), which is probably surprising/unwanted behavior. The old `updatedata` command simply did a full `for inst in qs: inst.save()` roundtrip without any field narrowing. The last commit restores that behavior, as it is much easier to comprehend (but also much, much more expensive for the loop variant).
At this point I think the semantics of `updatedata` should stick to the "trigger updates FROM all fields" idea, as it is directly compatible with `for model in models: update_dependent(model.objects.all())`.
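Spelled out, the two behaviors compare like this (both loops are taken verbatim from above; `models` stands for whatever set of models the command collects):

```python
from computedfields.models import update_dependent

# old updatedata behavior: plain save() roundtrip, no field narrowing
def updatedata_old(models):
    for model in models:
        for inst in model.objects.all():
            inst.save()

# intended semantics: "trigger updates FROM all fields"
def updatedata_new(models):
    for model in models:
        update_dependent(model.objects.all())
```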
A third variant would do full "horizontal" updates on models without any descent (in conjunction with a proper topsort from above, probably also the fastest one). Big limitation - it would only work for non-cycling deps, and only for full project-wide updates. Anything less would introduce desyncs from `updatedata` itself.
WIP for a better `updatedata` command.

Planned (a short usage sketch follows the list):

- `bulk_update` (most of the time significantly faster, but not always)
- `fast_update` (speedy as hell)
- changes in `bulk_updater` needed
- `checkdata` command to get an idea of the sync state of cfs
- `showdependencies` command
- `updatedata`
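Usage sketch - these are regular Django management commands, so besides `manage.py` they can be driven via `call_command`; only the existing `updatedata` is invoked here, the planned ones are just noted and no flags are assumed:

```python
from django.core.management import call_command

call_command('updatedata')  # resync computed fields project-wide
# planned in this PR: call_command('checkdata'), call_command('showdependencies')
```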
Fixes #32.