Closed alixinne closed 7 years ago
How did you do the deletion? If using admin interface, 7c103164371e8d6e9898d04e8e671d0ef4aabdc6 might have helped in this case.
Anyway deleting indeed shouldn't be that slow, these queries seem to come from string representation of the translation object...
I did the deletion through the admin interface, the measured times and requests only take into account actual deletion (when confirming after the preview stage).
My guess is that it has something to do with how Django handles model deletion. It looks like it loads objects and their references before deleting them, which ends up loading, for each unit and associated change, the parent translation, sub-project and project.
It looks like it's more of a Django design issue not to rely on the database for cascade deletions to ensure compatibility among backends, which requires python-side processing of model relations.
This seems to be the case: the Collector class responsible for collecting Django models for deletion can only delete objects by relying on the database backend
[...] if there are no cascades, no parents and no signal listeners for the object class. [...]
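To make the quoted rule concrete, here is a minimal sketch of the decision it describes. It is not Django's Collector code; `ModelInfo`, its fields and `can_fast_delete` as written here are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModelInfo:
    """Hypothetical summary of what matters for deleting one model class."""
    name: str
    has_cascades: bool          # other models reference it with cascading foreign keys
    has_parents: bool           # multi-table inheritance parents
    has_signal_listeners: bool  # pre_delete/post_delete receivers are connected


def can_fast_delete(model: ModelInfo) -> bool:
    """True when one DELETE statement is enough and no objects must be loaded."""
    return not (
        model.has_cascades
        or model.has_parents
        or model.has_signal_listeners
    )


# Two illustrative models (names are assumptions, not Weblate models)
leaf = ModelInfo("Leaf", has_cascades=False, has_parents=False,
                 has_signal_listeners=False)
watched = ModelInfo("Watched", has_cascades=False, has_parents=False,
                    has_signal_listeners=True)
```

In this sketch a single connected listener is enough to force the slow path, where every object is loaded in Python before being deleted.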
There should be no listeners for Translation object (there are on above models, to handle filesystem cleanup, but that should not matter in this case).
I think it has something to do with dependencies between models, I'll try to reproduce the issue on a simple Django app.
There actually is a post_delete listener that is run for every deleted Unit: https://github.com/nijel/weblate/blob/master/weblate/trans/models/__init__.py#L157
The code in this listener seems to match the queries from the (without cache-machine) stats, which explains the slow deletion process. It would be interesting to see if the database-heavy part of this code could be converted to an "AFTER DELETE" trigger. This would require hand-crafting some SQL code, but should drastically improve the performance of this post_delete listener.
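A rough way to see why a per-unit listener hurts is to count queries. The sketch below is a simulation only: `FakeDatabase`, both delete functions and the figure of 3 queries per listener call are assumptions for illustration, not measurements from Weblate.

```python
class FakeDatabase:
    """Counts queries instead of executing them."""

    def __init__(self):
        self.queries = 0

    def execute(self, sql):
        self.queries += 1


def delete_with_listener(db, unit_ids, queries_per_listener=3):
    """One DELETE per unit, followed by the listener's own lookups."""
    for unit_id in unit_ids:
        db.execute("DELETE unit %d" % unit_id)
        for _ in range(queries_per_listener):
            db.execute("listener lookup for unit %d" % unit_id)


def delete_in_bulk(db, unit_ids, cleanup_queries=3):
    """One DELETE for all units, then set-based cleanup run once."""
    db.execute("DELETE %d units in one statement" % len(unit_ids))
    for _ in range(cleanup_queries):
        db.execute("set-based cleanup")


per_unit_db = FakeDatabase()
delete_with_listener(per_unit_db, range(1000))

bulk_db = FakeDatabase()
delete_in_bulk(bulk_db, range(1000))
```

Under these assumptions the query count grows linearly with the number of units in the first case and stays constant in the second, which is the effect a trigger or a deferred cleanup would aim for.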
This would actually not remove the need for query-level caching, as import performance suffers from fetching the Source objects by their (checksum, subproject). As a source's (checksum, subproject) is not its primary key, cache-machine does not cache such requests, so this must be done manually.
diff --git a/weblate/trans/models/source.py b/weblate/trans/models/source.py
index b8a6737..e480e65 100644
--- a/weblate/trans/models/source.py
+++ b/weblate/trans/models/source.py
@@ -23,6 +23,8 @@ from django.utils.encoding import python_2_unicode_compatible
from django.utils.translation import ugettext_lazy as _
from weblate.trans.validators import validate_check_flags
+from caching.base import CachingManager, CachingMixin, cache, invalidator, DEFAULT_TIMEOUT
+
PRIORITY_CHOICES = (
(60, _('Very high')),
(80, _('High')),
@@ -31,9 +33,39 @@ PRIORITY_CHOICES = (
(140, _('Very low')),
)
+class SourceManager(CachingManager):
+    def get_by_checksum(self, checksum, subproject, create=True):
+        created = False
+        key = "src:%s:%s" % (checksum, subproject.id)
+
+        # Try to get the source from cache
+        val = cache.get(key)
+        if val is None:
+            # Get or create the value
+            if create:
+                val, created = self.get_or_create(
+                    checksum=checksum,
+                    subproject_id=subproject.id
+                )
+            else:
+                val = self.get(
+                    checksum=checksum,
+                    subproject_id=subproject.id
+                )
+            # Add to the cache
+            cache.set(key, val, DEFAULT_TIMEOUT)
+            # Setup flush list for source key
+            invalidator.add_to_flush_list(
+                {val.flush_key(): [key]}
+            )
+
+        if create:
+            return val, created
+        return val
+
@python_2_unicode_compatible
-class Source(models.Model):
+class Source(models.Model, CachingMixin):
checksum = models.CharField(max_length=40)
subproject = models.ForeignKey('SubProject')
timestamp = models.DateTimeField(auto_now_add=True)
@@ -47,6 +79,8 @@ class Source(models.Model):
blank=True,
)
+ objects = SourceManager()
+
class Meta(object):
permissions = (
('edit_priority', "Can edit priority"),
diff --git a/weblate/trans/models/unit.py b/weblate/trans/models/unit.py
index 2fe6c16..91c4746 100644
--- a/weblate/trans/models/unit.py
+++ b/weblate/trans/models/unit.py
@@ -68,7 +68,6 @@ def more_like_queue(pk, source, top, queue):
result = more_like(pk, source, top)
queue.put(result)
-
class UnitManager(models.Manager):
# pylint: disable=W0232
@@ -461,10 +460,8 @@ class Unit(models.Model, LoggerMixin):
return
# Ensure we track source string
- source_info, source_created = Source.objects.get_or_create(
- checksum=self.checksum,
- subproject=self.translation.subproject
- )
+ source_info, source_created = \
+ Source.objects.get_by_checksum(self.checksum, self.translation.subproject)
contentsum_changed = self.contentsum != contentsum
# Store updated values
@@ -1049,10 +1046,10 @@ class Unit(models.Model, LoggerMixin):
Returns related source string object.
"""
if self._source_info is None:
- self._source_info = Source.objects.get(
- checksum=self.checksum,
- subproject=self.translation.subproject
- )
+ self._source_info = Source.objects.get_by_checksum(
+ self.checksum,
+ self.translation.subproject,
+ create=False)
return self._source_info
def get_secondary_units(self, user):
Using cache-machine provides support for flush-lists, so cached entries are properly invalidated when their content changes or when they are deleted from the database.
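For readers unfamiliar with flush-lists, the idea can be sketched with a plain in-memory cache. This is a simplified model, not cache-machine's implementation; `FlushListCache` and the key strings are hypothetical.

```python
class FlushListCache:
    """In-memory cache where each object key lists the cache keys that depend on it."""

    def __init__(self):
        self._values = {}
        self._flush_lists = {}  # object key -> set of dependent cache keys

    def get(self, key):
        return self._values.get(key)

    def set(self, key, value):
        self._values[key] = value

    def add_to_flush_list(self, mapping):
        """Register cache keys to drop when the given object keys change."""
        for object_key, cache_keys in mapping.items():
            self._flush_lists.setdefault(object_key, set()).update(cache_keys)

    def invalidate(self, object_key):
        """Called when the object is saved or deleted: drop every dependent entry."""
        for cache_key in self._flush_lists.pop(object_key, set()):
            self._values.pop(cache_key, None)


flush_cache = FlushListCache()
# Cache a lookup by (checksum, subproject) and tie it to the object's own key
flush_cache.set("src:abc123:1", {"id": 7, "checksum": "abc123"})
flush_cache.add_to_flush_list({"object:source:7": ["src:abc123:1"]})
cached_before = flush_cache.get("src:abc123:1")

# Simulate the object changing: the manually cached lookup disappears too
flush_cache.invalidate("object:source:7")
cached_after = flush_cache.get("src:abc123:1")
```

The point is that a lookup cached under a custom key is still dropped when the underlying object changes, because the custom key was registered on the object's flush list.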
Can you please submit the change as pull request?
Also, you're right, cleanup_deleted is the bottleneck in removal. It is really not needed in most cases, so maybe it would be better to keep these objects in database and do the cleanup job in background.
As noted in the PR #1173 django-cache-machine has issues with recent versions of Django (1.7+), which requires the mentioned fix.
I guess part of the work done by the cleanup_deleted method can be replaced by running updatechecks and cleanuptrans after bulk operations such as component import or delete.
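One way this deferral could look is sketched below: the per-unit handler only records that cleanup is needed while a bulk operation is in progress, and the cleanup runs once at the end. `CleanupTracker` and its methods are hypothetical and are not part of Weblate; the real cleanup would be whatever updatechecks and cleanuptrans do.

```python
from contextlib import contextmanager


class CleanupTracker:
    """Runs cleanup per deletion normally, but only once per bulk operation."""

    def __init__(self):
        self.bulk = False
        self.pending = False
        self.cleanup_runs = 0

    def on_unit_deleted(self):
        """Stand-in for a post_delete handler."""
        if self.bulk:
            # Defer: remember that cleanup is needed, do nothing now
            self.pending = True
        else:
            self.run_cleanup()

    def run_cleanup(self):
        """Stand-in for the expensive cleanup work."""
        self.cleanup_runs += 1
        self.pending = False

    @contextmanager
    def bulk_operation(self):
        self.bulk = True
        try:
            yield self
        finally:
            self.bulk = False
            if self.pending:
                self.run_cleanup()


tracker = CleanupTracker()
with tracker.bulk_operation():
    for _ in range(1000):
        tracker.on_unit_deleted()
runs_after_bulk = tracker.cleanup_runs
```

Outside the context manager the handler behaves as before, so single deletions through the interface would still be cleaned up immediately.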
Context
I have encountered performance issues when dealing with a project containing 35k strings in 13 translations. The project is using monolingual PO files.
Actual behaviour
The import process took around 25 minutes (2min per translation) to complete. Deleting the newly imported component was in the 70-75 minutes range.
Performance analysis
During the import process, the python process was using more than 95% CPU time, while the postgres process used the remaining 5%. Latency of disk I/O operations did not exceed 10 ms, so storage didn't seem to be the bottleneck. The same behavior was observed when deleting the component.
Running PostgreSQL with the pg_stat_statements extension enabled revealed that when deleting the project, the same 3 queries to fetch project, language and component properties were run multiple times per translation unit being deleted. See project_delete_request_stats.csv.txt
We can also guess that the 4th request to fetch translation info is called for every deleted unit, which should not be necessary.
Possible fix
Adding query-level caching through django-cache-machine improved performance by 50% when deleting a project (between 30 and 35 minutes). Although django-cache-machine provides automatic cache invalidation when updating models, more in-depth testing is required to ensure this doesn't break anything.
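As a quick check of the "50%" figure, the reduction can be computed from the two measured ranges (70-75 minutes before, 30-35 minutes after). The variable names are only for illustration.

```python
before_minutes = (70, 75)  # deletion time without query-level caching
after_minutes = (30, 35)   # deletion time with django-cache-machine

# Worst case: slowest cached run compared with fastest uncached run
smallest_reduction = 1 - max(after_minutes) / min(before_minutes)
# Best case: fastest cached run compared with slowest uncached run
largest_reduction = 1 - min(after_minutes) / max(before_minutes)

print("reduction between %.0f%% and %.0f%%"
      % (smallest_reduction * 100, largest_reduction * 100))
```

So the measured ranges correspond to a reduction in deletion time of roughly 50% to 60%.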
Adding django-cache-machine to Weblate can be done according to the docs:
Note that some pages that use QuerySet's "values", which is not yet supported by django-cache-machine, fail to load. A temporary fix to cache-machine is as follows:
Server configuration
The testing server has 4 CPUs at 2.6 GHz and 6 GB of RAM, set up with:
Output of ./manage.py list_versions: