matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

`ValueError: seek of closed file` when client aborts media upload at exactly the right time #11154

Closed · HarHarLinks closed this issue 2 years ago

HarHarLinks commented 2 years ago

Describe the bug

I upgraded synapse-s3-storage-provider together with Synapse 1.45.1 using pip install --force-reinstall, but had to downgrade back to 1.43 (https://github.com/matrix-org/synapse/issues/11049#issuecomment-948545163). Now I can't upload media; error log:

2021-10-21 12:17:51,904 - synapse.http.server - 93 - ERROR - POST-7488- Failed handle request via 'UploadResource': <XForwardedForRequest at 0x7f3bfbc62e20 method='POST' uri='/_matrix/media/r0/upload' clientproto='HTTP/1.0' site='8008'>
Traceback (most recent call last):
  File "site-packages/synapse/http/server.py", line 258, in _async_render_wrapper
    callback_return = await self._async_render(request)
  File "site-packages/synapse/http/server.py", line 286, in _async_render
    callback_return = await raw_callback_return
  File "site-packages/synapse/rest/media/v1/upload_resource.py", line 93, in _async_render_POST
    content_uri = await self.media_repo.create_content(
  File "site-packages/synapse/rest/media/v1/media_repository.py", line 169, in create_content
    fname = await self.media_storage.store_file(content, file_info)
  File "site-packages/synapse/rest/media/v1/media_storage.py", line 81, in store_file
    await self.write_to_file(source, f)
  File "site-packages/synapse/rest/media/v1/media_storage.py", line 88, in write_to_file
    await defer_to_thread(self.reactor, _write_file_synchronously, source, output)
  File "site-packages/twisted/python/threadpool.py", line 238, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "site-packages/twisted/python/threadpool.py", line 254, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "site-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "site-packages/twisted/python/context.py", line 83, in callWithContext
    return func(*args, **kw)
  File "site-packages/synapse/logging/context.py", line 902, in g
    return f(*args, **kwargs)
  File "site-packages/synapse/rest/media/v1/media_storage.py", line 298, in _write_file_synchronously
    source.seek(0)  # Ensure we read from the start of the file
ValueError: seek of closed file
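
For reference, the error itself just means seek() was called on a file object that had already been closed. A minimal sketch of that failure mode, using a plain temporary file as a stand-in for the buffered upload (illustrative only, not Synapse's actual code):

```python
# Standalone illustration: seeking a closed file object raises this exact error.
import tempfile

source = tempfile.TemporaryFile()
source.write(b"uploaded bytes")
source.close()   # e.g. the request body being torn down after a client disconnect
source.seek(0)   # ValueError: seek of closed file
```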

Is matrix-org/synapse-s3-storage-provider#62 not backwards compatible after all?

Additional context

pip freeze:

attrs==21.2.0
Automat==20.2.0
awscli==1.19.53
awscli-plugin-endpoint==0.4
bcrypt==3.2.0
bleach==3.3.0
boto3==1.17.53
botocore==1.20.53
canonicaljson==1.4.0
certifi==2020.12.5
cffi==1.14.5
chardet==4.0.0
colorama==0.4.3
constantly==15.1.0
cryptography==3.4.7
docutils==0.15.2
frozendict==1.2
humanize==3.12.0
hyperlink==21.0.0
idna==2.10
ijson==3.1.4
importlib-metadata==3.10.1
incremental==21.3.0
Jinja2==2.11.3
jmespath==0.10.0
jsonschema==3.2.0
lxml==4.6.3
MarkupSafe==1.1.1
matrix-synapse==1.43.0
msgpack==1.0.2
netaddr==0.8.0
packaging==20.9
phonenumbers==8.12.21
Pillow==8.2.0
prometheus-client==0.10.1
psycopg2==2.9.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
pymacaroons==0.13.0
PyNaCl==1.4.0
pyOpenSSL==20.0.1
pyparsing==2.4.7
pyrsistent==0.17.3
python-dateutil==2.8.2
PyYAML==5.4.1
requests==2.25.1
rsa==4.7.2
s3transfer==0.3.7
service-identity==18.1.0
signedjson==1.1.1
simplejson==3.17.2
six==1.16.0
sortedcontainers==2.3.0
synapse-s3-storage-provider==1.0
tqdm==4.62.3
treq==21.1.0
Twisted==21.7.0
typing-extensions==3.10.0.2
unpaddedbase64==2.1.0
urllib3==1.26.7
webencodings==0.5.1
zipp==3.4.1
zope.interface==5.4.

(maybe synapse-s3-storage-provider could use some versioning beyond 1.0?)

homeserver.yaml

media_storage_providers:
- module: s3_storage_provider.S3StorageProviderBackend
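  # store_local / store_remote: whether local uploads and cached remote media
  # are also written to the provider; store_synchronous: whether the upload
  # waits for the provider write to finish before completing.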
  store_local: True
  store_remote: False
  store_synchronous: False
  config:
    bucket: synapse
    # All of the below options are optional, for use with non-AWS S3-like
    # services, or to specify access tokens here instead of some external method.
    region_name: xx-xxx
    endpoint_url: https://s3.some.provider
    access_key_id: REDACTED
    secret_access_key: REDACTED

HarHarLinks commented 2 years ago

Okay, so while trying to reproduce the error message Element gives, the problem seems to have disappeared? Maybe the server was just overloaded?

richvdh commented 2 years ago

sounds like a dup of https://github.com/matrix-org/synapse/issues/6192

HarHarLinks commented 2 years ago

Except that, if I correlate these correctly, it happened while uploading a 56.4 KiB JPG to a server with max_upload_size: "50M".

squahtx commented 2 years ago

It's good to hear that things are working for you again! I suspect this is a bug in Synapse rather than the S3 storage provider. Perhaps the error is triggered when a client aborts a file upload midway through? I'm not sure how we'd reproduce that scenario with curl, though.
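
One way to simulate a client aborting an upload without curl might be to speak HTTP over a raw socket and hang up before the body is complete. A rough sketch (the host, port and token below are placeholders, and it assumes a plain-HTTP Synapse listener):

```python
# Rough sketch: start a media upload, send only part of the promised body,
# then close the connection so the upload is aborted mid-request.
import socket

HOST, PORT = "localhost", 8008          # placeholder: plain-HTTP Synapse listener
TOKEN = "syt_your_access_token_here"    # placeholder access token

body = b"x" * 1024
headers = (
    "POST /_matrix/media/r0/upload HTTP/1.1\r\n"
    f"Host: {HOST}\r\n"
    f"Authorization: Bearer {TOKEN}\r\n"
    "Content-Type: application/octet-stream\r\n"
    f"Content-Length: {len(body) * 10}\r\n"   # promise far more data than we send
    "\r\n"
)

with socket.create_connection((HOST, PORT)) as sock:
    sock.sendall(headers.encode("ascii") + body)
    # Leaving the with-block closes the socket while the server is still
    # waiting for the rest of the body, i.e. an aborted upload.
```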

squahtx commented 2 years ago

Finally reproduced it. The client needs to disconnect in the narrow time window in between Synapse receiving the end of the uploaded data and writing the file to disk.

I had nginx in front of Synapse, but telling it to avoid buffering uploads didn't make a difference. It looks like Twisted does some buffering too(?).

Steps to reproduce the error

  1. Put a breakpoint in UploadResource._async_render_POST
  2. curl -v -X POST -d some_test_data https://your.homeserver.org/_matrix/media/r0/upload -H "Authorization: Bearer syt_your_access_token_here"
  3. Wait for the breakpoint to be hit
  4. Abort curl with Ctrl+C
  5. Continue execution from the breakpoint
  6. See ValueError: I/O operation on closed file.

I can't see how this would result in what you saw though, since in this scenario it's the client intentionally aborting the upload. It's also a Synapse bug, unrelated to the upgrade and to the storage provider.
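
To make the race concrete: the copy to disk happens on a worker thread, and by the time that thread runs, the buffered request body can already have been closed by the disconnect. A hypothetical, self-contained illustration of that window (not Synapse's actual code):

```python
# Hypothetical sketch of the race: the file is handed to a worker thread, but
# the "disconnect" closes it before the thread gets around to seeking.
import tempfile
import threading
import time

def write_file_synchronously(source, dest_path):
    time.sleep(0.1)                  # the worker runs a little later...
    source.seek(0)                   # ...and by now the source is closed
    with open(dest_path, "wb") as dest:
        dest.write(source.read())

source = tempfile.TemporaryFile()
source.write(b"uploaded bytes")

worker = threading.Thread(target=write_file_synchronously, args=(source, "/tmp/out"))
worker.start()
source.close()                       # stands in for the client disconnect
worker.join()                        # the worker thread hits ValueError: seek of closed file
```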

richvdh commented 2 years ago

I think we should treat this as a dup of #6192. An overlarge file (causing the reverse-proxy to drop the connection) is one potential cause, but clearly not the only cause, as evidenced by the comments on that issue.

squahtx commented 2 years ago

Closed as duplicate of #6192

squahtx commented 2 years ago

since in this scenario it's the client intentionally aborting the upload

Or Synapse is taking too long to process the upload and clients are timing out the request.