mastodon / mastodon

Your self-hosted, globally interconnected microblogging community
https://joinmastodon.org
GNU Affero General Public License v3.0

Video files are needlessly re-encoded #24481

Open h-2 opened 1 year ago

h-2 commented 1 year ago

Steps to reproduce the problem

  1. Upload a 3.1 MB MP4 video file (H.264 + AAC), i.e. exactly the same codecs that Mastodon currently uses to serve files, and well below the 40 MB limit.
  2. The file is re-encoded to a 3.2MB MP4 🤯
  3. Obviously, lossy re-encoding can only decrease quality.
  4. It also puts needless load on the server.

Expected behaviour

It should just use the file as is.

Actual behaviour

It re-encodes the file.

Detailed description

There are many different problems around videos attached to posts. The quality is often bad, and it is unclear which formats, resolutions, and bitrates lead to a video being re-encoded. It is also unclear why well-supported free formats (WebM) are replaced by patent-encumbered/proprietary formats (MP4).

I would humbly suggest the following:

  1. Clearly specify which codecs/resolutions/bitrates the server uses to serve files.
  2. Accept anything with allowed codecs and within allowed resolutions/bitrates as-is (no re-encoding).
  3. Possibly allow instance-admins to tweak these settings.
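The rule proposed in point 2 could be sketched roughly as follows. This is only an illustration under stated assumptions: `MediaInfo`, `reencode_reasons`, and every limit except the 40 MB size cap mentioned above are hypothetical and are not taken from Mastodon's code or configuration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical limits for illustration; only the 40 MB cap comes from this issue.
ALLOWED_VIDEO_CODECS = {"h264"}
ALLOWED_AUDIO_CODECS = {"aac"}
MAX_WIDTH = 1920                      # assumed
MAX_HEIGHT = 1200                     # assumed
MAX_BITRATE = 2_000_000               # bits per second, assumed
MAX_SIZE_BYTES = 40 * 1024 * 1024     # the 40 MB upload limit


@dataclass
class MediaInfo:
    """Metadata of an upload, e.g. as reported by a probing tool."""
    video_codec: str
    audio_codec: Optional[str]        # None for videos without an audio track
    width: int
    height: int
    bitrate: int                      # bits per second
    size_bytes: int


def reencode_reasons(info: MediaInfo) -> List[str]:
    """Return the reasons a file would need re-encoding; empty means keep as-is."""
    reasons = []
    if info.video_codec not in ALLOWED_VIDEO_CODECS:
        reasons.append("video codec not allowed: " + info.video_codec)
    if info.audio_codec is not None and info.audio_codec not in ALLOWED_AUDIO_CODECS:
        reasons.append("audio codec not allowed: " + info.audio_codec)
    if info.width > MAX_WIDTH or info.height > MAX_HEIGHT:
        reasons.append("resolution above limit")
    if info.bitrate > MAX_BITRATE:
        reasons.append("bitrate above limit")
    if info.size_bytes > MAX_SIZE_BYTES:
        reasons.append("file larger than upload limit")
    return reasons
```

With such a check, a file like the one in the steps above (H.264 + AAC, 3.1 MB) would yield an empty list and could be served unchanged, while a WebM upload would be flagged only because of its codecs. Point 3 would then amount to letting instance admins set the constants at the top.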

Specifications

Mastodon 4.1.2, Firefox 112

jtracey commented 1 year ago

Re-encoding is usually a line of defense against mass-exploiting client-side rendering vulnerabilities. If there's a bug like stagefright in a bunch of clients, and you simply host the original file, a single user (themselves possibly a victim of the vulnerability) could get an exploit into federated timelines and wreak all sorts of havoc. While you could in theory scan for malformed files or known payloads, that's a lot of work to maintain a blocklist that will inevitably allow some bad things through (if they never failed, we would never have these vulnerabilities to begin with). Simply re-encoding on a server you already implicitly trust is a lot easier.
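To make the distinction concrete, here is a minimal sketch of the two ways a server could hand an upload to ffmpeg. The helper `ffmpeg_args` is hypothetical and the chosen options are only an assumption, not necessarily what Mastodon runs. A stream copy (`-c copy`) rewrites the container but passes the encoded packets through undecoded, so it would not give the protection described here; a full transcode decodes the streams and encodes them again.

```python
from typing import List


def ffmpeg_args(src: str, dst: str, reencode: bool) -> List[str]:
    """Build an ffmpeg command line (hypothetical helper, for illustration only)."""
    base = ["ffmpeg", "-y", "-i", src]
    if reencode:
        # Full transcode: decode, then encode again as H.264 video + AAC audio.
        codec = ["-c:v", "libx264", "-c:a", "aac"]
    else:
        # Stream copy: only remux, the encoded packets are passed through as they are.
        codec = ["-c", "copy"]
    # Move the index to the front of the MP4 so playback can start early.
    return base + codec + ["-movflags", "+faststart", dst]
```

The resulting list could be passed to `subprocess.run`. Only the transcode variant costs quality and CPU time, which is the trade-off this issue is about.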

h-2 commented 1 year ago

Thank you for the quick reply. I hadn't considered malicious files at all 😐 But doesn't this also mean that re-encoding exposes an attack surface on the server running the instance? Or are the ffmpeg processes sandboxed in a special way? And are images also "re-encoded"?

Also, it seems that videos hosted on third-party sites and included as links are automatically embedded as well, or aren't they? If so, does this measure really improve security?

Don't get me wrong, security is important. I just think that good quality pictures and videos are as well 😄