pulsejet / memories

Fast, modern and advanced photo management suite. Runs as a Nextcloud app.
https://memories.gallery
GNU Affero General Public License v3.0

Transcoded videos are bigger than original #967

Open xsiviso opened 6 months ago

xsiviso commented 6 months ago

Describe the bug: I noticed that the streamed file is larger than the original file. I streamed a video that is 1.00 GB in size and was shot in 4K. The streamed version was 1440p, and the network traffic stats showed that 1.27 GB was loaded.

To Reproduce: Open a video and monitor the network activity.

Additional context: Quality was set to automatic in the player. The quality factor in the transcode settings is set to the default of 25. I use the external transcoder function with Nextcloud AIO community container and the VA-API accelerator (Which, by the way, runs wonderfully and was easy to set up - thanks for this!) I saw that the transcoded files use H.264 as the codec. H.265 would be better in terms of file size. So maybe the solution is to change the default quality factor or to implement a switch between H.264 and H.265 in the app settings.

pulsejet commented 6 months ago

Yes, transcoded videos are always larger than the originals. Some reasons for this:

  1. Transcoding uses a faster encoding preset so it can happen on the fly. Your original may, for instance, use two-pass encoding, which is impossible for live transcoding.
  2. There are extra keyframes that need to be added for splitting the video into chunks. This adds significant cost.
  3. As you correctly noted, only H.264 can be used. This is by design since H.264 is the only codec supported by all major browsers (see H.265 support). I believe there's also VP9 but don't think the current hardware is fast enough to live transcode (same for AV1)

The bottom line is, reducing the video size is a non-goal for live transcoding; the goals instead are

  1. Adaptive streaming, i.e. reduced bitrate for mobile / remote streams (with lower resolution, e.g. 720p)
  2. Seeking over the network, which is not possible for all formats directly.
  3. Making sure videos play on all devices to begin with (H.265 videos don't, for instance)

Still, if you feel that the output sizes are too large, I suggest playing with the quality factor in the Memories admin panel. The default quality can be higher than needed (I use a factor of ~30); a larger quality factor should result in significantly smaller output sizes.
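To make the three cost factors above concrete, here is a minimal sketch of how a live HLS transcode could be expressed as ffmpeg arguments. It is not the command Memories actually runs: it assumes software encoding with libx264 (not VA-API), treats the quality factor as a CRF value, and the interface and function names are hypothetical. It only shows where the fast preset, the forced keyframes and the quality factor enter the picture.

```typescript
// Hypothetical option names; this is not the Memories configuration schema.
interface LiveTranscodeOptions {
  input: string;          // path to the source video
  outputPlaylist: string; // path of the HLS playlist to write
  height: number;         // target height in pixels, e.g. 720
  qualityFactor: number;  // assumed to map to libx264 CRF; larger means smaller output
  chunkSeconds: number;   // length of each HLS segment in seconds
}

// Builds an ffmpeg argument list for a single-pass, chunked H.264 encode.
function buildFfmpegArgs(o: LiveTranscodeOptions): string[] {
  return [
    "-i", o.input,
    "-c:v", "libx264",
    // A fast preset trades compression efficiency for encoding speed.
    "-preset", "faster",
    "-crf", String(o.qualityFactor),
    // Scale to the target height; -2 keeps the aspect ratio with an even width.
    "-vf", `scale=-2:${o.height}`,
    // Force a keyframe at every chunk boundary so segments can be cut cleanly.
    "-force_key_frames", `expr:gte(t,n_forced*${o.chunkSeconds})`,
    "-c:a", "aac",
    "-f", "hls",
    "-hls_time", String(o.chunkSeconds),
    "-hls_playlist_type", "vod",
    o.outputPlaylist,
  ];
}
```

Calling `buildFfmpegArgs` with `qualityFactor: 30` instead of `25` changes only the `-crf` value; the forced keyframes and the single-pass preset stay, which is why the output can remain larger than a carefully encoded original.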

> I use the external transcoder function with Nextcloud AIO community container and the VA-API accelerator (Which, by the way, runs wonderfully and was easy to set up - thanks for this!)

Thanks go to @szaimen for this one 😄

EDIT: Hopefully one day we'll have an option for non-live transcoding too, where videos are pre-transcoded to different resolutions. This would in turn also be able to utilize slower presets and two-pass encoding to further reduce output size. (Of course, the cost is much higher storage usage, which is why I chose the path of live transcoding for the initial implementation.)

major-mayer commented 6 months ago

> 3. I believe there's also VP9 but don't think the current hardware is fast enough to live transcode (same for AV1)

VP9 and AV1 may be too expensive when doing software transcoding, but all modern GPUs (integrated or external) support HW-accelerated VP9 encoding, which should be fast enough. The newer Intel i/eGPUs even support AV1 HW-accelerated encoding. Jellyfin provides a pretty good overview of what is possible with modern hardware: https://jellyfin.org/docs/general/administration/hardware-acceleration/

pulsejet commented 6 months ago

The ideal solution I can think of is to allow multiple codecs to be served (that the admin decides), and play the best one the client supports. Unfortunately there's no API I know of to ask the browser what codecs are supported, so there needs to be a mechanism to detect that (read: complexity).

This may also require switching to DASH, since HLS only supports H.264 "officially" (HLS w/ mp4 might work as a hack).

There's also a separate consideration of bandwidth vs compute: network resources can often be much cheaper (e.g. since Nextcloud is self hosted, users may be in close proximity / on the same local network) compared to compute. Even if AV1 / VP9 can be encoded in hardware, the number of simultaneous streams might be higher for H.264 (just a guess).

pulsejet commented 6 months ago

Related #784

major-mayer commented 6 months ago

> The ideal solution I can think of is to allow multiple codecs to be served (that the admin decides), and play the best one the client supports. Unfortunately there's no API I know of to ask the browser what codecs are supported, so there needs to be a mechanism to detect that (read: complexity).

How about this API? https://developer.mozilla.org/en-US/docs/Web/API/HTMLMediaElement/canPlayType That said, I'm surprised to read that the browser cannot tell you for certain whether it can play a codec until it actually tries.

> There's also a separate consideration of bandwidth vs compute: network resources can often be much cheaper (e.g. since Nextcloud is self hosted, users may be in close proximity / on the same local network) compared to compute. Even if AV1 / VP9 can be encoded in hardware, the number of simultaneous streams might be higher for H.264 (just a guess).

That makes total sense in an environment where the client is on the same network and computing power is limited. However, if enough computing power is available (or there are not too many simultaneous clients) and the bandwidth is very limited, using a format like AV1 or VP9 would be very beneficial. Nevertheless, I definitely see the complexity that this introduces.

pulsejet commented 6 months ago

Didn't know canPlayType has an optional codecs parameter. That should help; would also simplify #784 a lot. I'll say that'd be the first step towards this.
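As a rough illustration of that first step, the sketch below picks a codec from an admin-enabled list by probing MIME types that carry a `codecs` parameter. The candidate list, the codec strings (example profile/level values), the `pickCodec` name and the fallback to H.264 are all assumptions made for illustration, not part of Memories. The probe is injected so the logic can run outside a browser; in a browser it would wrap `HTMLMediaElement.canPlayType`, which returns `""`, `"maybe"` or `"probably"`.

```typescript
// Result strings returned by HTMLMediaElement.canPlayType().
type CanPlay = "" | "maybe" | "probably";
type Probe = (mimeType: string) => CanPlay;

interface CodecCandidate {
  name: string;     // label used by this sketch only
  mimeType: string; // container plus codecs parameter
}

// Ordered by assumed preference; the codec strings are example values.
const CANDIDATES: CodecCandidate[] = [
  { name: "av1", mimeType: 'video/mp4; codecs="av01.0.05M.08"' },
  { name: "hevc", mimeType: 'video/mp4; codecs="hvc1.1.6.L93.B0"' },
  { name: "vp9", mimeType: 'video/webm; codecs="vp09.00.10.08"' },
  { name: "h264", mimeType: 'video/mp4; codecs="avc1.42E01E"' },
];

// Returns the first admin-enabled codec the probe reports as "probably"
// playable, then the first "maybe", and falls back to H.264 otherwise.
function pickCodec(enabledByAdmin: string[], probe: Probe): string {
  let firstMaybe: string | null = null;
  for (const candidate of CANDIDATES) {
    if (enabledByAdmin.indexOf(candidate.name) === -1) continue;
    const answer = probe(candidate.mimeType);
    if (answer === "probably") return candidate.name;
    if (answer === "maybe" && firstMaybe === null) firstMaybe = candidate.name;
  }
  return firstMaybe !== null ? firstMaybe : "h264";
}

// In a browser, the probe could be built like this:
//   const video = document.createElement("video");
//   const codec = pickCodec(["av1", "h264"], (t) => video.canPlayType(t));
```

Because the answer is only ever "maybe" or "probably", a real implementation would likely still need a fallback path for when playback of the chosen codec fails.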

xsiviso commented 6 months ago

Thanks for the explanation! I'm currently using 30 and think it's a good compromise between quality and file size. For me the network bandwidth is important because when I'm off the network the upload speed is limited to 40 Mbit/s. On the other hand, I am using an Intel N100 and this CPU can easily handle H.265 4k video transcoding. I would not recommend the use of AV1 or VP9 because most chips only have hardware acceleration for decoding, not for encoding. So the server has to use the CPU. In the end, it's always the same with video: A battle between file size, speed and quality.
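For a bandwidth-limited uplink like this, the adaptive-streaming idea comes down to picking the highest rendition whose bitrate fits the link. The sketch below shows that selection under stated assumptions: the bitrate ladder is made up for illustration (real bitrates depend on content, codec and quality factor), and the 80% headroom factor and all names are hypothetical.

```typescript
interface Rendition {
  height: number;      // vertical resolution in pixels
  bitrateMbps: number; // assumed average bitrate in Mbit/s
}

// Hypothetical bitrate ladder, not measured values.
const LADDER: Rendition[] = [
  { height: 2160, bitrateMbps: 45 },
  { height: 1440, bitrateMbps: 24 },
  { height: 1080, bitrateMbps: 12 },
  { height: 720, bitrateMbps: 6 },
];

// Picks the highest-bitrate rendition that fits within a fraction
// (headroom) of the link speed; falls back to the lowest rendition.
// Assumes the ladder is non-empty.
function pickRendition(
  ladder: Rendition[],
  linkMbps: number,
  headroom: number = 0.8
): Rendition {
  const sorted = ladder.slice().sort((a, b) => b.bitrateMbps - a.bitrateMbps);
  const budget = linkMbps * headroom;
  for (const rendition of sorted) {
    if (rendition.bitrateMbps <= budget) return rendition;
  }
  return sorted[sorted.length - 1];
}
```

With this made-up ladder, a 40 Mbit/s link leaves a 32 Mbit/s budget, so `pickRendition(LADDER, 40)` selects the 1440p entry and skips 2160p.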

pulsejet commented 6 months ago

> AV1 or VP9 because most chips only have hardware acceleration for decoding, not for encoding

It will, of course, have to be optional since only the newest chips support this.

> using an Intel N100 and this CPU can easily handle H.265 4k video transcoding

That's great to hear (raises my hopes of having usable AV1 on low-end chips in a few years). H.265 has the worst support of the lot, but it might still be possible to detect browser support and use it when supported. After codec detection is implemented, it would mostly come down to whether HLS can use other codecs in the current implementation (if not, this could require a major rewrite).

craiq commented 6 months ago

One way to reduce the compute load of on-demand encoding would be #232.