wagtail / wagtail-localize-git

Translate Wagtail content using a git repository and PO files
Other
6 stars 3 forks source link

sync_git crashes when a commit is pushed to the repo right after it gets pulled by wagtail-localize #19

Closed TheoChevalier closed 2 years ago

TheoChevalier commented 3 years ago

When Pontoon commits to the repository while sync_git is running (after the initial pull, and before the push), it prevents wagtail-localize from committing changes to the repository because the head commit is now different. The next time sync_git runs, this happens:

INFO - Pulling repository
Traceback (most recent call last):
  File "network-api/manage.py", line 34, in <module>
    execute_from_command_line(sys.argv)
  File "/app/.heroku/python/lib/python3.7/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
    utility.execute()
  File "/app/.heroku/python/lib/python3.7/site-packages/django/core/management/__init__.py", line 395, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/app/.heroku/python/lib/python3.7/site-packages/django/core/management/base.py", line 330, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/app/.heroku/python/lib/python3.7/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/app/.heroku/python/lib/python3.7/site-packages/wagtail_localize_git/management/commands/sync_git.py", line 19, in handle
    SyncManager(logger=logger).sync()
  File "/app/.heroku/python/lib/python3.7/site-packages/wagtail_localize_git/sync.py", line 165, in sync
    _pull(repo, self.logger)
  File "/app/.heroku/python/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/app/.heroku/python/lib/python3.7/site-packages/wagtail_localize_git/sync.py", line 34, in _pull
    last_commit_id, current_commit_id
  File "/app/.heroku/python/lib/python3.7/site-packages/wagtail_localize_git/git.py", line 50, in get_changed_files
    new_commit, old_commit
_pygit2.GitError: object not found - no match for id (b2c0f73ae4e6dd4ec4d247225bb243b1d7b2a159)
Sentry is attempting to send 0 pending error messages
Waiting up to 2 seconds
Press Ctrl-C to quit

I’ve hit the issue 3 times, so it seems to be reproducible

The current workaround is to run DELETE FROM wagtail_localize_git_synclogresource; DELETE FROM wagtail_localize_git_synclog;

phildexter commented 3 years ago

Notes:

Solution: try to find common commit that can be used Solution: put the whole thing in a transaction

zerolab commented 2 years ago

A couple of notes during dev

  1. _pull/_push already happen in a transaction
  2. the exception is thrown from https://github.com/wagtail/wagtail-localize-git/blob/v0.10.0/wagtail_localize_git/git.py#L49-L51 (using the version closes to the time of the report to match the line numbers in the stacktrace) which checks the old commit (taken from the log) vs the current commit (the repository HEAD commit revision)
  3. we do have a git pull happening before _pull/_push - https://github.com/wagtail/wagtail-localize-git/blob/main/wagtail_localize_git/sync.py#L179-L181 which is what updates the HEAD reference. we cannot wrap the repository pull in a transaction as that is happening on disk rather than db

I dug into how Pontoon works with VCS and git. It does the expected commit and push, rather than anything more complex like a rebase and force push (which would rewrite references). Given that https://github.com/wagtail/wagtail-localize-git/blob/main/wagtail_localize_git/git.py#L54-L57 should, at most, raise a value error that the commits are different.

@TheoChevalier is it at all possible there was a forced push to the Pontoo repository (always worth checking)

zerolab commented 2 years ago

The path I was going previously turned out to be erroneous. The exception is a side effect.

Scenario (easiest is to set a breakpoint on https://github.com/wagtail/wagtail-localize-git/blob/main/wagtail_localize_git/sync.py#L141)

I think the best way to account for this is to stop the push (but log the error) then wait for the next sync (either via cron or forced) to pick up the changes from the remote, then push new changes

How does that sound @TheoChevalier ?

TheoChevalier commented 2 years ago

Thanks a lot for digging that nasty one, @zerolab! Really not the easiest to reproduce/track down

And yup, your approach seems good to me. We could also probably re-trigger the sync right away (I think that’s what Pontoon does when it detects a change in the repo after it pulled) but just stopping it would work

zerolab commented 2 years ago

@TheoChevalier this was code reviewed and got it merged to main.

Will tweak some of the setup options and make a new release. Note that when upgrading to it, the default branch will be main, but that can be changed to using the WAGTAILLOCALIZE_GIT_DEFAULT_BRANCH setting