Open bugzilla-to-github opened 3 years ago
Comment Author: @flodolo
Looks like we're hitting this bug again, with sync stuck for Firefox.
May 23 13:35:47 mozilla-pontoon app/worker.1 psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "entity_locale_active"
May 23 13:35:47 mozilla-pontoon app/worker.1 DETAIL: Key (entity_id, locale_id, active)=(69308, 317, t) already exists.
Comment Author: @flodolo
It doesn't look like the string was touched in the past 4 years: https://pontoon.mozilla.org/ia/firefox/dom/chrome/accessibility/AccessFu.properties/?search=toolbar&string=69308
It's worth noting that this happened twice so far, and always with Interlingua (locale id 317). We had issues with Interlingua a while ago, when we migrated from BitBucket to hg.m.o, so maybe we're still seeing issues caused by that https://bugzilla.mozilla.org/show_bug.cgi?id=1409962
Comment Author: @mathjazz
There were dozens of duplicate translations in the ia Firefox localization, which I've identified with the help of the code below and removed.
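from django.db.models import Count

from pontoon.base.models import Translation

# All translations in the Interlingua (ia) Firefox localization
translations = Translation.objects.filter(
    entity__resource__project__slug="firefox",
    locale__code="ia",
)
# Group by (entity, string) and count the rows in each group
unique_entity_string_combination = (
    translations
    .values("entity", "string")
    .annotate(count=Count("entity"))
)
for t in unique_entity_string_combination:
    # duplicate translations
    if t["count"] > 1:
        print(t)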
Comment Author: @mathjazz
Let's keep the bug open to investigate whether we have further cases of duplicate translations, where they originate, and how we can prevent sync from crashing.
The issue has become more common recently with the introduction of Pretranslation: https://pontoon.mozilla.org/es-AR/mozillaorg/all-resources/?string=282842
We have more errors
Nov 09 08:26:27 mozilla-pontoon app/web.1 django.db.utils.IntegrityError: duplicate key value violates unique constraint "entity_locale_active"
Nov 13 15:03:13 mozilla-pontoon app/web.3 django.db.utils.IntegrityError: duplicate key value violates unique constraint "entity_locale_active"
Nov 18 05:21:24 mozilla-pontoon app/worker.1 django.db.utils.IntegrityError: duplicate key value violates unique constraint "entity_locale_active"
(13 more at this point)
Nov 18 07:41:09 mozilla-pontoon app/worker.1 django.db.utils.IntegrityError: duplicate key value violates unique constraint "entity_locale_active"
In the last batch of new strings added to Mozilla.org, this type of error only appeared in es-AR, and only in pretranslations that contain Fluent message references and variables. It turns out that placeables in these pretranslations were not stored in a canonical form, i.e. {example} instead of { example } (mind the whitespace).
In the exported files in the repository, the strings appeared as they should, which resulted in sync trying to import the pretranslations, because they weren't the same as in the Pontoon DB. That also triggered this issue. In the UI (editor, string list) the strings were also rendered properly, which only made the problem more difficult to spot.
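To illustrate the mismatch described above, here is a minimal sketch of how a raw string comparison flags {$name} and { $name } as different, while a comparison after normalizing placeable whitespace does not. The function names are hypothetical and the regular expression is only an illustration: it handles simple, non-nested placeables and is not a substitute for re-serializing through a real Fluent parser.

```python
import re

# Matches a simple, non-nested placeable such as {$name} or {  -brand-name  }.
# Assumption: no nested braces and no string literals containing braces.
_PLACEABLE = re.compile(r"\{\s*([^{}]*?)\s*\}")


def normalize_placeables(text: str) -> str:
    """Rewrite every simple placeable as '{ content }' (hypothetical helper)."""
    return _PLACEABLE.sub(r"{ \1 }", text)


def differs_raw(db_string: str, repo_string: str) -> bool:
    """Naive check: any difference in whitespace counts as a change."""
    return db_string != repo_string


def differs_normalized(db_string: str, repo_string: str) -> bool:
    """Compare only after bringing both sides to the same placeable form."""
    return normalize_placeables(db_string) != normalize_placeables(repo_string)


if __name__ == "__main__":
    stored = "Hola, {$name}"      # non-canonical form, as stored in the DB
    exported = "Hola, { $name }"  # canonical form, as exported to the repo
    print(differs_raw(stored, exported))
    print(differs_normalized(stored, exported))
```

With the naive check, the stored and exported strings look like two different translations, which matches the situation where sync attempts to import a translation that is semantically already there.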
As the first step, we should make sure that re-serializing pretranslations always stores them in the canonical form. For the record, this is the code in question, which (when used from the command line) actually solved the immediate sync issue: https://github.com/mozilla/pontoon/blob/73c05e4777bf2e78058743df86a02b0e30f54e57/pontoon/pretranslation/pretranslate.py#L59
The error could also be prevented by fixing #2120.
Finally, since the reason the problem appears only in es-AR could be that we have used TM entries with bad placeables formatting to train the MT engine, we should also look in that direction.
As the general fix for this very issue, we need to make sure translations actually get deactivated before new active translations are set: https://github.com/mozilla/pontoon/blob/73c05e4777bf2e78058743df86a02b0e30f54e57/pontoon/base/models.py#L2613-L2619
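The ordering requirement can be sketched outside of Pontoon with SQLite from the Python standard library. Everything here is an assumption made for illustration: the table layout is simplified, the index only mimics the "entity_locale_active" constraint named in the error message (the `WHERE active = 1` condition is a guess at how it is defined), and `set_active_translation` is a hypothetical helper, not Pontoon code.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with a partial unique index (assumed shape)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE translation ("
        "id INTEGER PRIMARY KEY, entity_id INTEGER, locale_id INTEGER, "
        "string TEXT, active INTEGER NOT NULL DEFAULT 0)"
    )
    # At most one active row per (entity_id, locale_id).
    conn.execute(
        "CREATE UNIQUE INDEX entity_locale_active "
        "ON translation (entity_id, locale_id, active) WHERE active = 1"
    )
    return conn


def set_active_translation(conn, entity_id, locale_id, string):
    """Deactivate the current active row first, then insert the new one."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "UPDATE translation SET active = 0 "
            "WHERE entity_id = ? AND locale_id = ? AND active = 1",
            (entity_id, locale_id),
        )
        conn.execute(
            "INSERT INTO translation (entity_id, locale_id, string, active) "
            "VALUES (?, ?, ?, 1)",
            (entity_id, locale_id, string),
        )


if __name__ == "__main__":
    conn = make_db()
    set_active_translation(conn, 69308, 317, "first")
    set_active_translation(conn, 69308, 317, "second")
    try:
        # Skipping the deactivation step reproduces the constraint violation.
        conn.execute(
            "INSERT INTO translation (entity_id, locale_id, string, active) "
            "VALUES (69308, 317, 'third', 1)"
        )
    except sqlite3.IntegrityError as error:
        print("IntegrityError:", error)
```

Running both statements in one transaction means a failure in the insert also undoes the deactivation, so the entity is never left without an active translation.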
My suspicion is this: on_commit(), which is also a wild guess at this point.
"the reason the problem appears only in es-AR could be that we have used TM entries with bad placeables formatting to train the MT engine"
Given the number of linters we have in place, I don't think these would exist in the "active" translations (strings currently in repositories). But, from my experience, translation memory has a lot more that doesn't show up in Pontoon, so it's still a possibility.
EDIT: just realized that, based on your comment on {$foo} vs { $foo }, that wouldn't be an error in any of the linters, since for Fluent they're both valid and equivalent.
This issue resurfaced in Hungarian Mozilla VPN Client after https://github.com/mozilla-l10n/mozilla-vpn-client-l10n/commit/2ab98c299c10eda1d32a041acfd28f76fc08b7dd landed.
We hit:
IntegrityError('duplicate key value violates unique constraint "entity_locale_active"
DETAIL: Key (entity_id, locale_id, active)=(304357, 100, t) already exists.\n')
...
IntegrityError('duplicate key value violates unique constraint "entity_locale_active"
DETAIL: Key (entity_id, locale_id, active)=(304326, 100, t) already exists.\n')
...
IntegrityError('duplicate key value violates unique constraint "entity_locale_active"
DETAIL: Key (entity_id, locale_id, active)=(304322, 100, t) already exists.\n')
Approving failing Pretranslations didn't help. I had to delete all of them, sync again, and then run pretranslation.
This issue was created automatically by a script.
Bug 1703666
Bug Reporter: @mathjazz CC: @flodolo
An IntegrityError can appear during sync, which prevents the sync job from completing.
The only way to resolve the issue is to manually remove the conflicting active translation from Pontoon (by either deleting, rejecting or unapproving it).
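Finding the conflicting translation starts with the DETAIL line of the error, which carries the entity and locale ids. Below is a small sketch for pulling those ids out of a log line, assuming the message format shown in the logs in this issue; `Conflict` and `parse_conflict` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Conflict(NamedTuple):
    entity_id: int
    locale_id: int
    active: bool


# Format assumed from the DETAIL lines quoted in this issue.
_DETAIL = re.compile(
    r"Key \(entity_id, locale_id, active\)=\((\d+), (\d+), ([tf])\) already exists"
)


def parse_conflict(line: str) -> Optional[Conflict]:
    """Return the ids from a DETAIL line, or None if the line does not match."""
    match = _DETAIL.search(line)
    if match is None:
        return None
    return Conflict(
        entity_id=int(match.group(1)),
        locale_id=int(match.group(2)),
        active=match.group(3) == "t",
    )


if __name__ == "__main__":
    line = (
        "May 23 13:35:47 mozilla-pontoon app/worker.1 DETAIL: "
        "Key (entity_id, locale_id, active)=(69308, 317, t) already exists."
    )
    print(parse_conflict(line))
```

The resulting ids can then be used to look up the active translation for that entity and locale before deleting, rejecting or unapproving it.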