uktrade / mobius3

Continuously sync folder to S3, using inotify under the hood
MIT License

Creating a new file raises a KeyError #35

Closed: yamrzou closed this issue 4 years ago

yamrzou commented 4 years ago

If I run mobius3 with an existing bucket, then go to the synced folder and run touch foo, the file is uploaded correctly, but the following error is raised:

[s3sync:event,f7997d07] Exception during <function Syncer.<locals>.schedule_upload_meta.<locals>.function at 0x7ff8750350e0>
Traceback (most recent call last):
  File "/home/yamrzou/.pyenv/versions/mainenv/lib64/python3.7/site-packages/mobius3.py", line 755, in process_jobs
    await job()
  File "/home/yamrzou/.pyenv/versions/mainenv/lib64/python3.7/site-packages/mobius3.py", line 678, in function
    await upload_meta(logger, path, version_current, version_original)
  File "/home/yamrzou/.pyenv/versions/mainenv/lib64/python3.7/site-packages/mobius3.py", line 884, in upload_meta
    on_done=set_meta,
  File "/home/yamrzou/.pyenv/versions/mainenv/lib64/python3.7/site-packages/mobius3.py", line 943, in locked_request
    if not cont():
  File "/home/yamrzou/.pyenv/versions/mainenv/lib64/python3.7/site-packages/mobius3.py", line 878, in <lambda>
    cont=lambda: meta[path] != data,
KeyError: PurePosixPath('/local/folder/foo')

I suppose it's because set_meta is only called as the on_done callback, but I can't fix it, as it's not clear to me what cont does.
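The failure mode in the traceback can be reproduced in isolation. The sketch below is not mobius3 code: meta here is a hypothetical stand-in for the syncer's path-to-metadata mapping, data is made-up metadata, and cont_tolerant is only an illustration of a lookup that does not raise, not the fix adopted by the project.

```python
from pathlib import PurePosixPath

# Hypothetical stand-ins: an empty metadata cache and the metadata
# that would be uploaded for a freshly created file.
meta = {}
path = PurePosixPath('/local/folder/foo')
data = {'mtime': '1586697177.824258', 'mode': '33188'}


def cont_strict():
    # Same shape as the lambda in the traceback: indexing a dict with a
    # missing key raises KeyError before the comparison is evaluated.
    return meta[path] != data


def cont_tolerant():
    # dict.get returns None for a missing key, so the comparison runs
    # and a path with no recorded metadata counts as "different".
    return meta.get(path) != data


try:
    cont_strict()
except KeyError as exc:
    strict_error = exc
else:
    strict_error = None

should_continue = cont_tolerant()
```

Once meta[path] has been set, both versions behave the same, so the difference only shows for paths with no recorded metadata yet.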

michalc commented 4 years ago

Hi @yamrzou!

Thanks for the issue report + PR.

However, I'm unable to reproduce the issue. For example, if I create a test:

@async_test
async def test_touch(self):
    delete_dir = create_directory('/s3-home-folder')
    self.add_async_cleanup(delete_dir)
    delete_bucket_dir = create_directory('/test-data/my-bucket')
    self.add_async_cleanup(delete_bucket_dir)

    import logging

    stdout_handler = logging.StreamHandler(sys.stdout)
    stdout_handler.setLevel('DEBUG')
    logger = logging.getLogger('mobius3')
    logger.setLevel('DEBUG')
    logger.addHandler(stdout_handler)

    start, stop = syncer_for('/s3-home-folder')
    self.add_async_cleanup(stop)
    await start()

    filename = str(uuid.uuid4())
    os.system(f'touch /s3-home-folder/{filename}')

    await await_upload()

    request, close = get_docker_link_and_minio_compatible_http_pool()
    self.add_async_cleanup(close)

    self.assertEqual(await object_body(request, filename), b'')

In the log output I don't see the exception.

[s3sync:start] Excluding: re.compile('^$')
[s3sync:start] Listing keys
[s3sync:start] Finished starting
[s3sync:download] Listing keys
[s3sync:event,37a78538] Path: /s3-home-folder/046ff075-432d-4bfa-b082-0c1fd592bfda
[s3sync:event,37a78538] Handler: handle__file__IN_CREATE
[s3sync:event,d25538e0] Path: /s3-home-folder/046ff075-432d-4bfa-b082-0c1fd592bfda
[s3sync:event,d25538e0] Handler: handle__file__IN_CLOSE_WRITE
[s3sync:event,d25538e0] Uploading /s3-home-folder/046ff075-432d-4bfa-b082-0c1fd592bfda
[s3sync:event,d25538e0] Creating flush file: /s3-home-folder/.__mobius3_flush__0864bdd34ec7456998ede773f2845363
[s3sync:event,2b7c1075] Path: /s3-home-folder/.__mobius3_flush__0864bdd34ec7456998ede773f2845363
[s3sync:event,2b7c1075] Handler: handle__flush__IN_CREATE
[s3sync:event,2b7c1075] Flushing: /s3-home-folder/.__mobius3_flush__0864bdd34ec7456998ede773f2845363
[s3sync:event,d25538e0] PUT https://minio:9000/my-bucket/046ff075-432d-4bfa-b082-0c1fd592bfda ((b'content-length', b'0'), (b'x-amz-meta-mtime', b'1586697177.824258'), (b'x-amz-meta-mode', b'33188'))
[s3sync:event,d25538e0] b'200' ((b'accept-ranges', b'bytes'), (b'content-length', b'0'), (b'content-security-policy', b'block-all-mixed-content'), (b'etag', b'"8bc23741b0a857c9c8c9e2361f67f812-1"'), (b'server', b'MinIO/RELEASE.2019-12-30T05-45-39Z'), (b'vary', b'Origin'), (b'x-amz-bucket-region', b'us-east-1'), (b'x-amz-request-id', b'160514AD336973A4'), (b'x-xss-protection', b'1; mode=block'), (b'date', b'Sun, 12 Apr 2020 13:12:57 GMT'))
[s3sync:stop] Stopping
[s3sync:stop] Finished stopping

I'm surprised that touch on a brand new file would cause a metadata upload. Metadata is uploaded on the IN_ATTRIB event, which I didn't think was triggered by touch on a new file, so I suspect something is going on that I don't quite understand or haven't accounted for.
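One way to check which inotify events a given operation produces on a particular system is to watch a directory directly. The following is a minimal, Linux-only sketch that calls inotify through ctypes. The names events_for, create_only and create_then_set_times are hypothetical helpers for illustration and are not part of mobius3.

```python
import ctypes
import os
import struct
import tempfile

# Mask bits from <sys/inotify.h> (Linux only).
IN_ATTRIB = 0x00000004
IN_CLOSE_WRITE = 0x00000008
IN_CREATE = 0x00000100
NAMES = {
    IN_ATTRIB: 'IN_ATTRIB',
    IN_CLOSE_WRITE: 'IN_CLOSE_WRITE',
    IN_CREATE: 'IN_CREATE',
}


def events_for(action):
    """Run action(directory) in a watched temporary directory and
    return the names of the observed events, in order."""
    libc = ctypes.CDLL(None, use_errno=True)
    # IN_NONBLOCK has the same value as O_NONBLOCK, so reads never hang.
    fd = libc.inotify_init1(os.O_NONBLOCK)
    if fd < 0:
        raise OSError(ctypes.get_errno(), 'inotify_init1 failed')
    try:
        with tempfile.TemporaryDirectory() as directory:
            mask = IN_ATTRIB | IN_CLOSE_WRITE | IN_CREATE
            if libc.inotify_add_watch(fd, os.fsencode(directory), mask) < 0:
                raise OSError(ctypes.get_errno(), 'inotify_add_watch failed')
            action(directory)
            try:
                data = os.read(fd, 65536)
            except BlockingIOError:
                data = b''  # no events were queued
    finally:
        os.close(fd)

    seen = []
    offset = 0
    while offset < len(data):
        # struct inotify_event: int wd; uint32 mask, cookie, len; char name[len]
        _wd, event_mask, _cookie, name_len = struct.unpack_from('iIII', data, offset)
        offset += 16 + name_len
        seen.extend(name for bit, name in NAMES.items() if event_mask & bit)
    return seen


def create_only(directory):
    # Create an empty file and close it, without touching its timestamps.
    open(os.path.join(directory, 'foo'), 'w').close()


def create_then_set_times(directory):
    # Create an empty file, then explicitly update its timestamps,
    # which is an attribute change.
    path = os.path.join(directory, 'foo')
    open(path, 'w').close()
    os.utime(path)


print(events_for(create_only))
print(events_for(create_then_set_times))
```

To see what the local touch binary does, pass an action such as lambda d: os.system(f'touch {d}/foo'). The result may differ between touch implementations, which is one possible reason the behaviour is not the same on every system.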

Do you have any details of the system you're running this on?

> I suppose it's due to the fact that set_meta is only called [on_done], but I can't fix it as it's not clear to me what cont does.

Also, cont is short for continue. It should be a function that returns a boolean controlling whether or not the upload should continue.
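The pattern described here can be sketched roughly as follows. This is a simplified illustration based only on the names visible in the traceback (cont, on_done); guarded_request, send and the lock handling are assumptions and do not reproduce the actual locked_request in mobius3.

```python
import asyncio


async def guarded_request(lock, send, cont, on_done):
    """Hypothetical sketch: hold a lock, ask cont() whether the request
    is still wanted, and call on_done() only after a completed request."""
    async with lock:
        if not cont():
            # cont returned False: skip the request entirely.
            return False
        await send()
        on_done()
        return True


async def main():
    lock = asyncio.Lock()
    sent = []
    done = []

    async def send():
        # Stand-in for the real HTTP request.
        sent.append('PUT')

    ran = await guarded_request(
        lock, send, cont=lambda: True, on_done=lambda: done.append('done'))
    skipped = await guarded_request(
        lock, send, cont=lambda: False, on_done=lambda: done.append('done'))
    return ran, skipped, sent, done


result = asyncio.run(main())
```

Because cont is evaluated while the lock is held, it sees the state at the moment the request is about to be made rather than at the moment it was scheduled. That also means any exception raised inside cont, such as a KeyError, surfaces at this point.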

michalc commented 4 years ago

Will probably close this issue due to inactivity soon.