tsudoko / anki-sync-server

Self-hosted Anki sync server
GNU Affero General Public License v3.0
491 stars 81 forks source link

AssertionError in _adopt_media_changes_from_zip #49

Open rsanden opened 4 years ago

rsanden commented 4 years ago

I am operating on 3b8fe9e7f346cfb749bb7b045212d1e4ff2410ed

I encountered an AssertionError while attempting to push some media changes from Anki (2.1.15 and 2.1.19) on Ubuntu. As part of that process, ankisyncd is being requested to delete some media files which do not currently exist in the collection.media directory. This triggers the AssertionError on assert self.col.media.lastUsn() == oldUsn + processed_count

To help debug, I added some print statements to the immediately-preceding code:

# We count all files we are to remove, even if we don't have them in
# our media directory and our db doesn't know about them.
processed_count = len(media_to_remove) + len(media_to_add)

assert len(meta) == processed_count  # sanity check

print(f'==== POINT A :: usn={usn}, oldUsn={oldUsn}, len(meta)==processed_count=={processed_count}', file=sys.stderr)
if media_to_remove:
    print(f'==== Removing media: {repr(media_to_remove)}', file=sys.stderr)
    self._remove_media_files(media_to_remove)

if media_to_add:
    print(f'==== Adding media: {repr(media_to_add)}', file=sys.stderr)
    self.col.media.db.executemany(
        "INSERT OR REPLACE INTO media VALUES (?,?,?)", media_to_add)
    self.col.media.db.commit()

print(f'==== POINT B :: lastUsn()={self.col.media.lastUsn()}, oldUsn={oldUsn}, processed_count={processed_count}', file=sys.stderr)
assert self.col.media.lastUsn() == oldUsn + processed_count  # TODO: move to some unit test
return processed_count

The output is as follows:

[2020-01-22 16:48:18,298]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Starting...
[2020-01-22 16:48:18,299]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running meta(*[], **{'v': 9, 'cv': 'ankidesktop,2.1.15 (442df9d6),lin:ubuntu:18.04'})
[2020-01-22 16:48:18,312]:INFO:ankisyncd.http:127.0.0.1 "POST /sync/meta HTTP/1.0" 200 112
[2020-01-22 16:48:20,560]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running operation_upload(*[b'SQLite format 3\x00\x10\x00\x01\x01\x00@  \x00\x00\x00\xc4\x00\x00\x00\x81\x00\x00\x00\x0f\x00\x00\x00\x01\x00\x00\x00\x18\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00...', <__main__.SyncUserSession object at 0x7ff9be98deb8>], **{})
[2020-01-22 16:48:20,573]:INFO:ankisyncd.http:127.0.0.1 "POST /sync/upload HTTP/1.0" 200 2
[2020-01-22 16:48:20,630]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running begin(*[], **{'skey': '7ca4d999'})
[2020-01-22 16:48:20,631]:INFO:ankisyncd.http:127.0.0.1 "POST /msync/begin HTTP/1.0" 200 52
[2020-01-22 16:48:20,682]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running mediaChanges(*[], **{'lastUsn': 1347})
[2020-01-22 16:48:20,683]:INFO:ankisyncd.http:127.0.0.1 "POST /msync/mediaChanges HTTP/1.0" 200 23
[2020-01-22 16:48:20,742]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running uploadChanges(*[], **{'data': b'PK\x03\x04\x14\x00\x00\x00\x08\x00\n\x866P\xec[G\xf9\r\x01\x00\x00R\x03\x00\x00\x05\x00\x00\x00_metauR\xcbj\xc3@\x0c\xfc\x95\xe0s\x08\x92f\xa4]}K\x08:\x06\n\x85\xfc\xff\xa9\xb2K\xa1\xe0\xf5Y\xc3<\xf5|n&:T\xc5U\x14\x82R...'})
==== POINT A :: usn=1347, oldUsn=1347, len(meta)==processed_count==25
==== Removing media: ['20171105101303_14010.jpg', '20180302112534_42967.mp3', '20180301160604_88997.mp3', '20180302133019_27990.mp3', '20180416092356_40344.jpg', '20171030214910_26553.jpg', '20180302092239_46576.mp3', '20180227113121_59562.mp3', '20180302101241_19948.mp3', '20180227115011_64478.mp3', '20171110165954_33939.jpg', '20180604115213_14556.mp3', '20180227141910_86300.mp3', '20180302112109_54634.mp3', '20171104220557_42406.jpg', '20180227165103_57636.mp3', '20180301150503_63690.mp3', '20180301155325_64802.mp3', '20180227094943_28410.mp3', '20171121221705_63044.jpg', '20180227113240_97547.mp3', '20180227145848_50959.mp3', '20180227133215_10335.mp3', '20171105113357_96684.jpg', '20180301142836_52352.mp3']
==== POINT B :: lastUsn()=1347, oldUsn=1347, processed_count=25
[2020-01-22 16:48:20,746]:ERROR:ankisyncd.CollectionThread[yq-class-Z1]:Unable to uploadChanges(*[], **{'data': b"PK\x03\x04\x14\x00\x00\x00\x08\x00\n\x866P\xec[G\xf9\r\x01\x00\x00R\x03\x00\x00\x05\x00\x00\x00_metauR\xcbj\xc3@\x0c\xfc\x95\xe0s\x08\x92f\xa4]}K\x08:\x06\n\x85\xfc\xff\xa9\xb2K\xa1\xe0\xf5Y\xc3<\xf5|n&:T\xc5U\x14\x82R\x8a\xca\xe3\xeb\xf3\xde\xee\xb7m{\xddo\x07b\n\xc4T\xcd\xc1\xa2e\x8c\xc7\xf7\x07'\x84jH\x08k\xce\xcc5\xc2\x14\r\xcb\xb2\x91)\x0b\x04\x9b!\r\x1eE\x01y\xf21\xf4 a\xaa\x94\x85;\xd6N\x9b\xc3\x90\xc5\xf0\x11\x0b\x15\xb3N\x0c5-O\x0f\xbbp*j\xd4\xd2L\xce+\x0e\xef\xc8\x15\xe48#\xfa\xda\x0c\xd1\x02, \x91\x0b\xa7\xddUst'\xdd\xba\xfb\x95S\xea\x9ev\x06d\xd5\xd8\xef.*Y\xce\x00W>\x84f\xe2>z9J,|\xec*\xd1\x0f\x80\xea\xbe\xb0\xf2\xb1o\xeb\xe2\x8d\xe8\xfbr\xb9\x03\xe10\xef>\x9aq\x9dE\x92I\x94M\xea\x99c\xec9\xcctHs@\x16\xeb\xff-\xd71*\x87s\xf5cGc>9\xcb%=/\x10\x80\xa9W'\x86/\x1b\xf3VA7\x96\x11s\xe5cOK\x9b\x88\xf2\xfe\xd6\x7fi_?PK\x01\x02\x14\x03\x14\x00\x00\x00\x08\x00\n\x866P\xec[G\xf9\r\x01\x00\x00R\x03\x00\x00\x05\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x01\x00\x00\x00\x00_metaPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x003\x00\x00\x000\x01\x00\x00\x00\x00"}): 
Traceback (most recent call last):
  File "/home/anki/src/anki-sync-server/ankisyncd/thread.py", line 98, in _run
    ret = self.wrapper.execute(func, args, kw, return_queue)
  File "/home/anki/src/anki-sync-server/ankisyncd/collection.py", line 46, in execute
    ret = func(*args, **kw)
  File "ankisyncd/sync_app.py", line 622, in run_func
    res = handler_method(**keyword_args)
  File "ankisyncd/sync_app.py", line 197, in uploadChanges
    processed_count = self._adopt_media_changes_from_zip(z)
  File "ankisyncd/sync_app.py", line 272, in _adopt_media_changes_from_zip
    assert self.col.media.lastUsn() == oldUsn + processed_count  # TODO: move to some unit test
AssertionError
Traceback (most recent call last):
  File "/usr/lib/python3.6/wsgiref/handlers.py", line 137, in run
    self.result = application(self.environ, self.start_response)
  File "/home/anki/.local/lib/python3.6/site-packages/webob/dec.py", line 129, in __call__
    resp = self.call_func(req, *args, **kw)
  File "/home/anki/.local/lib/python3.6/site-packages/webob/dec.py", line 193, in call_func
    return self.func(req, *args, **kwargs)
  File "ankisyncd/sync_app.py", line 599, in __call__
    result = self._execute_handler_method_in_thread(url, data, session)
  File "ankisyncd/sync_app.py", line 631, in _execute_handler_method_in_thread
    result = thread.execute(run_func, kw=keyword_args)
  File "/home/anki/src/anki-sync-server/ankisyncd/thread.py", line 79, in execute
    raise ret
  File "/home/anki/src/anki-sync-server/ankisyncd/thread.py", line 98, in _run
    ret = self.wrapper.execute(func, args, kw, return_queue)
  File "/home/anki/src/anki-sync-server/ankisyncd/collection.py", line 46, in execute
    ret = func(*args, **kw)
  File "ankisyncd/sync_app.py", line 622, in run_func
    res = handler_method(**keyword_args)
  File "ankisyncd/sync_app.py", line 197, in uploadChanges
    processed_count = self._adopt_media_changes_from_zip(z)
  File "ankisyncd/sync_app.py", line 272, in _adopt_media_changes_from_zip
    assert self.col.media.lastUsn() == oldUsn + processed_count  # TODO: move to some unit test
AssertionError
[2020-01-22 16:48:20,752]:INFO:ankisyncd.http:127.0.0.1 "POST /msync/uploadChanges HTTP/1.0" 500 59

The actual collection.media directory contains exactly the 1347 files which it would be expected to contain after a successful sync. The media_to_remove (such as 20180301160604_88997.mp3) do exist under other users but not this user (yq-class-Z1).

Note that I've reproduced this every time that I try to sync, but the first time it failed there were not yet any debug print statements, so I don't have debug output from the first time it occurred.

Interestingly, I also see the same error when attempting to pull (force a download sync) from the server.

I am happy to help debug this if there's anything I can do to assist. anki-sync-server is a core piece of our infrastructure and I really appreciate your work on it.

rsanden commented 4 years ago

I reverted to c07fe0e65ccd0e13e6ebcacae8eb0521c9a16209 and the sync succeeds. I see many errors like this:

[2020-01-22 17:22:51,538]:ERROR:ankisyncd:Error when removing file '20180227134711_22634.mp3' from media dir: [Errno 2] No such file or directory: '/home/anki/src/anki-sync-server/collections/yq-class-Z1/collection.media/20180227134711_22634.mp3'
[2020-01-22 17:22:51,538]:ERROR:ankisyncd:Error when removing file '20180227150048_13153.mp3' from media dir: [Errno 2] No such file or directory: '/home/anki/src/anki-sync-server/collections/yq-class-Z1/collection.media/20180227150048_13153.mp3'
[2020-01-22 17:22:51,538]:ERROR:ankisyncd:Error when removing file '20180302093823_10153.mp3' from media dir: [Errno 2] No such file or directory: '/home/anki/src/anki-sync-server/collections/yq-class-Z1/collection.media/20180302093823_10153.mp3'

but everything works afterward.

(I would normally scrub log data, but there's nothing confidential here.)

rsanden commented 4 years ago

The successful sync on c07fe0e65ccd0e13e6ebcacae8eb0521c9a16209 causes the sync on 3b8fe9e7f346cfb749bb7b045212d1e4ff2410ed to succeed afterward:

[2020-01-22 17:45:51,816]:INFO:ankisyncd.http:127.0.0.1 "POST /msync/mediaChanges HTTP/1.0" 200 349
[2020-01-22 17:45:51,887]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running mediaChanges(*[], **{'lastUsn': 1343})
[2020-01-22 17:45:51,888]:INFO:ankisyncd.http:127.0.0.1 "POST /msync/mediaChanges HTTP/1.0" 200 284
[2020-01-22 17:45:51,956]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running mediaChanges(*[], **{'lastUsn': 1344})
[2020-01-22 17:45:51,957]:INFO:ankisyncd.http:127.0.0.1 "POST /msync/mediaChanges HTTP/1.0" 200 218
[2020-01-22 17:45:52,034]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running mediaChanges(*[], **{'lastUsn': 1345})
[2020-01-22 17:45:52,035]:INFO:ankisyncd.http:127.0.0.1 "POST /msync/mediaChanges HTTP/1.0" 200 152
[2020-01-22 17:45:52,107]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running mediaChanges(*[], **{'lastUsn': 1346})
[2020-01-22 17:45:52,108]:INFO:ankisyncd.http:127.0.0.1 "POST /msync/mediaChanges HTTP/1.0" 200 88
[2020-01-22 17:45:52,171]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running mediaChanges(*[], **{'lastUsn': 1347})
[2020-01-22 17:45:52,172]:INFO:ankisyncd.http:127.0.0.1 "POST /msync/mediaChanges HTTP/1.0" 200 23
[2020-01-22 17:45:52,221]:INFO:ankisyncd.CollectionThread[yq-class-Z1]:Running mediaSanity(*[], **{'local': 1347})
[2020-01-22 17:45:52,222]:INFO:ankisyncd.http:127.0.0.1 "POST /msync/mediaSanity HTTP/1.0" 200 25
rsanden commented 4 years ago

I found out how to reproduce this.

  1. I added a Deck that uses 1347 media files
  2. I linked a superset of media files to this user's collections.media (ln /path/to/media/* $HOME/.local/share/Anki2/yq-class-Z1/collection.media/).
  3. I deleted unused media files with Tools -> Check Media -> Delete unused Files
  4. I performed a push sync to to the anki server ("Upload to AnkiWeb")

Those deleted files don't exist on the server, but the Anki client doesn't know that, so it requests that they be removed. I think the issue is that the server isn't ignoring those nonexistent files and instead treats their nonexistence as a problem.

Because I am able to reproduce the problem, I was able to step through and confirm that the problem was introduced in 10f47611bf829d9e081a1213f89bbf4b25b8d59a:

10f47611bf829d9e081a1213f89bbf4b25b8d59a : AssertionError ac1920d0786725b3e6a3f23b52cd986ca6b13c7b : OK

This issue is not urgent for me. I am running on c07fe0e65ccd0e13e6ebcacae8eb0521c9a16209 (2.1.0) in the meantime -- I just hope the bug report is useful.

genedan commented 4 years ago

@rsanden I just ran into this error as well. I tried to upload a collection to an AWS Lightsail instance. What's odd though is that I haven't removed extra media for a long time, and it wasn't until I uploaded my collection to a new server when the assertion error happened.

The move to https is making my AnkiDroid break so I'm setting up an https capable server (another writeup on that to come).

I checked my media and it seems to be uploading okay now, though.