matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0

Media upload fails with `ValueError: seek of closed file` #6192

Open jonaharagon opened 5 years ago

jonaharagon commented 5 years ago

Description

Media uploads are failing.

Steps to reproduce

Uploading any file in Riot (any platform) results in the upload failing.

2019-10-10 14:57:47,521 - synapse.http.server - 109 - ERROR - POST-329537- Failed handle request via 'UploadResource': <XForwardedForRequest at 0x7f1fb281c240 method='POST' uri='/_matrix/media/v1/upload?filename=ima_424b122.png' clientproto='HTTP/1.0' site=8008>
Traceback (most recent call last):
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/synapse/http/server.py", line 77, in wrapped_request_handler
    await h(self, request)
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/synapse/rest/media/v1/upload_resource.py", line 82, in _async_render_POST
    media_type, upload_name, request.content, content_length, requester.user
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/synapse/rest/media/v1/media_repository.py", line 160, in create_content
    fname = yield self.media_storage.store_file(content, file_info)
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/synapse/rest/media/v1/media_storage.py", line 68, in store_file
    self.hs.get_reactor(), _write_file_synchronously, source, f
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/twisted/python/threadpool.py", line 250, in inContext
    result = inContext.theWork()
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/twisted/python/threadpool.py", line 266, in <lambda>
    inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/synapse/logging/context.py", line 716, in g
    return f(*args, **kwargs)
  File "/opt/venvs/matrix-synapse/lib/python3.6/site-packages/synapse/rest/media/v1/media_storage.py", line 244, in _write_file_synchronously
    source.seek(0)  # Ensure we read from the start of the file
ValueError: seek of closed file
2019-10-10 14:57:47,578 - synapse.http.server - 458 - WARNING - POST-329537- Not sending response to request <XForwardedForRequest at 0x7f1fb281c240 method='POST' uri='/_matrix/media/v1/upload?filename=ima_424b122.png' clientproto='HTTP/1.0' site=8008>, already disconnected.

I also see similar errors ending in `ValueError: I/O operation on closed file` on some uploads. I can upload the homeserver.log if needed.
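For context, the exception itself is easy to reproduce outside Synapse. A minimal sketch (not Synapse code) showing that seeking a closed buffered file raises exactly the error in the traceback above:

```python
# Minimal sketch (not Synapse code): once a buffered binary file object is
# closed, seek() on it raises the same ValueError seen in the traceback.
import tempfile

f = tempfile.TemporaryFile()  # stands in for the request body Synapse reads
f.write(b"upload data")
f.close()  # e.g. the request is torn down before the write thread runs

try:
    f.seek(0)  # what _write_file_synchronously does first
except ValueError as e:
    print(e)  # seek of closed file
```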

Version information


Jukong commented 5 years ago

I'm also running into this problem.

Also running Ubuntu 18.04 and behind an nginx reverse proxy inside a VM (Hyper-V).

Has anyone been able to get more information on this?

Jukong commented 5 years ago

I found the fix.

Uncomment the `max_upload_size` line in homeserver.yaml and raise the limit to something like 25M, or whatever you'd like.

Then go to the reverse proxy, open the config file for your site (under /etc/nginx/sites-available), and add a `client_max_body_size 25M;` directive somewhere in the server block. The size can be whatever you'd like.

Lastly, add the same directive to the nginx config on the server that is hosting matrix-synapse.
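For reference, the two changes above look something like this (the 25M value is from this comment; adjust both sides to match, since the smaller of the two limits wins):

```yaml
# homeserver.yaml: uncomment and raise the upload limit
max_upload_size: 25M
```

```nginx
# /etc/nginx/sites-available/<your-site>: allow request bodies up to the same limit
server {
    # ...
    client_max_body_size 25M;
}
```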

richvdh commented 5 years ago

It's expected that an upload will fail if you try to upload a file larger than the limit. But it sounds like the error message could be clearer, and Synapse shouldn't spit exceptions into its logs.

richvdh commented 5 years ago

keywords: max_upload_size

Related: #1610

jonaharagon commented 5 years ago

This doesn't appear to be the issue in my configuration.

aaronraimist commented 5 years ago

@JonahAragon well something is different because I get a nice error message.


What is your max_upload_size set to and how big is the file?

retrotails commented 4 years ago

I'm having the same issue, with very similar errors to the OP's, both in the server logs and in the client. It happens for files of around 400 MB and up, but uploading the exact same file sometimes succeeds and sometimes fails. I'm not using a reverse proxy, files of any type fail, and I get the same "Upload Failed" error as the OP.

squahtx commented 3 years ago

The error seen appears to happen as a result of a client (or proxy) closing an upload connection after sending all the data but before Synapse has responded. It looks like Twisted does some buffering, so turning off request buffering in the reverse proxy doesn't change the timings.

Steps to reproduce the error

  1. Put a breakpoint in UploadResource._async_render_POST
  2. curl -v -X POST -d some_test_data https://your.homeserver.org/_matrix/media/r0/upload -H "Authorization: Bearer syt_your_access_token_here"
  3. Wait for the breakpoint to be hit
  4. Abort curl with Ctrl+C
  5. Continue execution from the breakpoint
  6. See ValueError: I/O operation on closed file.

Perhaps one cause is Synapse taking too long to process the upload and clients timing out?

squahtx commented 3 years ago

Synapse may take a long time to process the upload if all threads in the default thread pool are busy (see https://github.com/matrix-org/synapse/issues/11049). Giving the media repo its own thread pool might help with some of these failures.
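The dedicated-pool idea can be sketched in plain Python (an illustration using concurrent.futures, not Synapse's actual Twisted thread pool code):

```python
# Illustration only (concurrent.futures, not Synapse's Twisted threadpool):
# blocking media writes get their own workers, so they no longer queue
# behind unrelated work that saturates the shared default pool.
from concurrent.futures import ThreadPoolExecutor

default_pool = ThreadPoolExecutor(max_workers=10)  # shared; can be saturated
media_pool = ThreadPoolExecutor(max_workers=4)     # hypothetical dedicated pool

def write_file_synchronously(data: bytes) -> int:
    # stand-in for the blocking disk write Synapse defers to a thread
    return len(data)

future = media_pool.submit(write_file_synchronously, b"upload bytes")
print(future.result())  # 12
```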

Can people confirm whether they were running Synapse in single process or worker mode when they saw these failures?

retrotails commented 3 years ago

@squahtx Single process for me. I didn't know workers were a thing; my homeserver has 3 users. If CPU usage is of any relevance, I've never seen it above 2%, and my load averages are all 0.00.
I'm no longer seeing the same errors as above. I still can't upload large files, but Synapse doesn't log any errors. Element Android just retries the upload over and over, and gomuks prints the following:

Failed to upload media: request error: Post "https://(...)/_matrix/media/r0/upload": context deadline exceeded (Client Timeout exceeded while awaiting headers)

I don't know whether my problem is still related to this issue.

squahtx commented 3 years ago

@retrotails Do you have a reverse proxy in front of Synapse, such as nginx? If you do, could you test the proxy_request_buffering off setting, or the equivalent for your reverse proxy? By default, nginx buffers the entire upload to disk before forwarding it on to Synapse, which buffers the entire upload to disk again. During the latter part, it would look like nothing is happening and clients might time out the upload if Synapse's I/O is slow enough.
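For nginx, that setting goes in the location block that proxies the media endpoints; a sketch (the port and size limit here are assumptions, not from this thread):

```nginx
# Sketch: stream upload bodies straight through to Synapse rather than
# spooling them to disk first. Port and size limit are assumptions.
location /_matrix/media {
    proxy_request_buffering off;
    client_max_body_size 50M;   # should match Synapse's max_upload_size
    proxy_pass http://localhost:8008;
}
```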

squahtx commented 3 years ago

1.46.0rc1 (to be released within the next day) may reduce the frequency of upload failures thanks to the fix for #11049.

The race condition still needs to be fixed.

retrotails commented 3 years ago

@squahtx When I made my comment 20 months ago and had a nearly identical error to the OP's, I was not using a reverse proxy. I am now using nginx, so I set proxy_request_buffering off; I still have the same problem, but Synapse now logs one thing when it fails:

synapse.http.site - 323 - INFO - sentinel - Connection from IPv6Address(type='TCP', host='::1', port=40604, flowInfo=0, scopeID=0) lost before request headers were read

I also discovered that the upload times out after exactly 10 minutes.

squahtx commented 3 years ago

@retrotails The timeout could be originating from anything in between your client and Synapse. From inspecting the code, Synapse ought not to time anything out while it's continually receiving data, and I think the same applies to nginx.

Could you clarify which client(s) you're seeing the issue on, plus their versions? Are there any proxies in between your client and Synapse, apart from nginx?

retrotails commented 3 years ago

@squahtx No proxy aside from nginx. I have the issue with both Element Android (the latest on F-Droid, which reports as 1.3.3 [40103030] (F-651e3478)) and the latest version of gomuks. I also tried from a different client internet connection; same issue. The only thing I haven't ruled out is the internet connection the server is on, which I can't easily change at the moment.

KeenMaron commented 3 years ago

Any updates? I can send JPEGs, but a larger file (>10 MB) fails with the same error message as the OP.

squahtx commented 3 years ago

I'm wholly unable to reproduce the issue. The error in the OP means that the client, reverse proxy, or anything in between aborted the request before Synapse received all the data and saved it. If you're having issues with a self-hosted instance of Synapse, it'd be extremely helpful to provide:

KeenMaron commented 3 years ago

Hey, sorry. I fixed the problem with Jukong's solution (go to the reverse proxy, open the config file for your site under /etc/nginx/ ..., and add `client_max_body_size 25M;`).

patrickulrich commented 3 years ago

I'm running into what I imagine is the same issue as well.

N0AGI commented 2 years ago

Same issue; it fails on uploading images. The image size was 114 KB, so the max_upload_size setting in homeserver.yaml was not the issue. I tried that too.

Synapse is running as a Docker container. This was all working fine for ~3 months.

Any suggestions for a resolution?

N0AGI commented 2 years ago

I gathered some additional log info. It appears to fail with a 'Permission denied' error when creating the media subdirectories.

Here's a snapshot of the log, taken in real time when the upload failed.

2022-03-20 23:30:11,365 - synapse.http.server - 100 - ERROR - POST-465 - Failed handle request via 'UploadResource': <XForwardedForRequest at 0x7fdef7252130 method='POST' uri='/_matrix/media/r0/upload' clientproto='HTTP/1.1' site='8008'>

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/synapse/http/server.py", line 269, in _async_render_wrapper
    callback_return = await self._async_render(request)
  File "/usr/local/lib/python3.9/site-packages/synapse/http/server.py", line 297, in _async_render
    callback_return = await raw_callback_return
  File "/usr/local/lib/python3.9/site-packages/synapse/rest/media/v1/upload_resource.py", line 96, in _async_render_POST
    content_uri = await self.media_repo.create_content(
  File "/usr/local/lib/python3.9/site-packages/synapse/rest/media/v1/media_repository.py", line 178, in create_content
    fname = await self.media_storage.store_file(content, file_info)
  File "/usr/local/lib/python3.9/site-packages/synapse/rest/media/v1/media_storage.py", line 92, in store_file
    with self.store_into_file(file_info) as (f, fname, finish_cb):
  File "/usr/local/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.9/site-packages/synapse/rest/media/v1/media_storage.py", line 135, in store_into_file
    os.makedirs(dirname, exist_ok=True)
  File "/usr/local/lib/python3.9/os.py", line 225, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/data/media_store/local_content/IB/bt'

What are the expected directory-level permissions?

N0AGI commented 2 years ago

I was able to FIX the problem by granting 777 to the local_content and local_thumbnails directories.

That fixed the issue. For some reason these two folders had 755.

Thought I'd share in case this is of any help to others, FWIW.

This is a great project, BTW; thanks to the Matrix/Synapse creators and gatekeepers.

DMRobertson commented 2 years ago

@N0AGI I don't think this is the same issue; no ValueError appears in your logs. Beware that 777 permissions mean that every user on the machine running Synapse can read and write the media directories; generally speaking you want to restrict this to a single user or group.
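A tighter alternative to 777 would be owner-only access for the account Synapse runs as. A sketch, demonstrated on a scratch directory (on a real server you would also `chown -R synapse:synapse /data/media_store` first; the `synapse` user name and path are assumptions about your deployment):

```shell
# Sketch: give the owner full access, the group read/traverse, others nothing.
# Shown on a temp dir; point this at your real media_store when applying it.
media_store="$(mktemp -d)"
mkdir -p "$media_store/local_content" "$media_store/local_thumbnails"
chmod -R u+rwX,g+rX,o-rwx "$media_store"
stat -c '%a' "$media_store/local_content"   # 750 with GNU stat
```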

N0AGI commented 2 years ago

@DMRobertson - Understood the concerns and implications around granting 777. This was only an interim solution for me, definitely not long term.

So, perhaps someone can point me to what exactly the issue is. The original dir permissions were 755 (I think). I set the Synapse container to run under the PUID and PGID of the Docker server's uid/gid. I am not able to upload any media into Synapse without setting the data/media_store/local_content permissions to 777, so I am not sure what is causing the app to need rwx access for 'others'.

In the meantime, I was exploring the log further and found an entry like the one below. Please note the "{None}" id; typically, this would be "@user:". Any thoughts here? ("" are intentionally masked)

synapse.access.http.8008 - 427 - INFO - GET-254 - - {None} Processed request: 0.004sec/-0.000sec (0.002sec, 0.000sec) (0.001sec/0.000sec/1) 149191B 200 "GET /_matrix/media/r0/download/matrix.n0agi.com/jKWHBYEEzGnPIabqRuvkCnSn HTTP/1.1" "Element/1.4.4 (Linux; U; Android 12; SM-T870 Build/SP1A.210812.01; Flavour GooglePlay; MatrixAndroidSdk2 1.4.4)" [0 dbevts]

I plan to rebuild the container from scratch with no historical data, but wanted to explore all options before going that route.

Any pointers would be appreciated. I bet I'm missing something obvious. Thanks!

richvdh commented 2 years ago

(have hidden comments unrelated to this issue. Please note we're unable to provide general user support via the issue tracker.)

00011100 commented 2 years ago

Any updates on this issue?

squahtx commented 2 years ago

For large files (>100MiB), we suspect #13009 to be the cause. We've not yet been able to track down the cause for small files.

Hiers commented 2 years ago

I was having this problem too: no file, no matter how big or small, could be uploaded. I eventually found the problem was the media_store_path setting in homeserver.yaml. It was set to /var/lib/synapse/media_store, and I was running Synapse as another user (not synapse) who didn't have write access to that directory. If this is your case, either change the path to a directory you have write permission for, or run Synapse as the synapse user.
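A quick way to check for this class of failure is to test, as the account that runs Synapse, whether the configured path is writable. A sketch using `os.access` (a temp dir stands in here; on a real server, point it at your configured media_store_path):

```python
# Sketch: os.access shows whether the current user can write to the media
# store. A temp dir stands in for your configured media_store_path
# (e.g. /var/lib/synapse/media_store from the comment above).
import os
import tempfile

media_store_path = tempfile.mkdtemp()  # stand-in for the real path
print(os.access(media_store_path, os.W_OK))  # True; False means uploads will fail
```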