Is it true we have to combine the video and audio files using ffmpeg or the python or JS ffmpeg port?

nonopolarity commented 1 year ago

Often we have to download the 1080p video and the audio separately, as two files. Is it true that we just have to use ffmpeg, or a Python or JS port of ffmpeg, to combine the two files into one .mp4 file? ytdl-core probably doesn't have this feature?

(example: https://zulko.github.io/moviepy/ https://github.com/ffmpegwasm/ffmpeg.wasm )

richardabear commented 1 year ago

YouTube separates the audio and video streams for higher-resolution videos.

You will have to use ffmpeg to combine these streams. Thankfully, this repo has an example: https://github.com/fent/node-ytdl-core/blob/master/example/ffmpeg.js

nonopolarity commented 1 year ago

Does ffmpeg combine the video and audio in just a few seconds? I could also use Final Cut to combine them, but that is basically a re-encode and takes a long time. VLC Player can also combine the video and audio, and it takes only a few seconds even if the video is an hour long.

richardabear commented 1 year ago

I think that will depend on your hardware and use case.

I have found ffmpeg's performance quite impressive when it comes to muxing just the two streams. Also, it technically wouldn't be a re-encode as the documentation describes it, because with the "-c:v copy" flag you are just copying the video stream.

nonopolarity commented 1 year ago

What I am more concerned about is this: when doing it this way with ffmpeg,

does it involve re-encoding (which usually takes quite long: for a 10-minute video, 2 to 5 minutes), or
does it only involve putting the two files into one file (usually just copying two data chunks into one file, which is super fast: for a 10-minute video, 2 seconds)?

Which one is it?

christiangenco commented 1 year ago

You only have to combine the video and audio files if you download video-only and audio-only streams.

If you don't care about downloading the absolute highest quality you can just download the highest quality stream that already contains audio and video with something like this:

const info = await ytdl.getInfo(url, {});
const format = ytdl.chooseFormat(info.formats, {
  filter: "audioandvideo",
  quality: "highest",
});
ytdl.downloadFromInfo(info, {
  quality: format.itag
})

nonopolarity commented 1 year ago

> You only have to combine the video and audio files if you download video-only and audio-only streams.
>
> If you don't care about downloading the absolute highest quality you can just download the highest quality stream that already contains audio and video with something like this

Right. In the past that often meant 360p, which is vastly different from 720p or 1080p.

richardabear commented 1 year ago

> What I am more concerned about is this: when doing it this way with ffmpeg,
>
> does it involve re-encoding (which usually takes quite long: for a 10-minute video, 2 to 5 minutes), or
>
> does it only involve putting the two files into one file (usually just copying two data chunks into one file, which is super fast: for a 10-minute video, 2 seconds)?
>
> Which one is it?

That completely depends on your use case.

In my use case I just use the second option (copy); I don't re-encode.

nonopolarity commented 1 year ago

> What I am more concerned about is this: when doing it this way with ffmpeg,
>
> does it involve re-encoding (which usually takes quite long: for a 10-minute video, 2 to 5 minutes), or
>
> does it only involve putting the two files into one file (usually just copying two data chunks into one file, which is super fast: for a 10-minute video, 2 seconds)?
>
> Which one is it?

The question is not about which one it is. The question is about how ffmpeg does it and, naturally, if a job can be done in 2 seconds, I don't want to spend 2 to 5 minutes doing it.

richardabear commented 1 year ago

Pass the -c copy flag to the ffmpeg command and it won't re-encode.
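As a rough illustration of the `-c copy` suggestion, here is a minimal sketch in Node.js. It assumes an `ffmpeg` binary is on the PATH and that the two inputs are already compatible with the output container. The function names and the file names in the test call are hypothetical, not part of ytdl-core.

```javascript
const { execFile } = require("node:child_process");
const { promisify } = require("node:util");

const execFileAsync = promisify(execFile);

// Build ffmpeg arguments that mux one video-only file and one audio-only
// file into a single output without re-encoding either stream.
function buildCopyMuxArgs(videoPath, audioPath, outPath) {
  return [
    "-i", videoPath, // input 0: video-only file
    "-i", audioPath, // input 1: audio-only file
    "-map", "0:v:0", // first video stream of input 0
    "-map", "1:a:0", // first audio stream of input 1
    "-c", "copy",    // copy both streams as-is (no re-encode)
    outPath,
  ];
}

// Run ffmpeg with those arguments. Assumes `ffmpeg` is on the PATH.
// "-y" overwrites an existing output file; "-loglevel error" keeps
// ffmpeg's stderr output small.
async function muxWithoutReencoding(videoPath, audioPath, outPath) {
  const args = ["-y", "-loglevel", "error", ...buildCopyMuxArgs(videoPath, audioPath, outPath)];
  await execFileAsync("ffmpeg", args);
  return outPath;
}
```

Because nothing is decoded or encoded, the run time should be dominated by disk I/O rather than by the length of the video.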

christiangenco commented 1 year ago

-c:v copy and -c:a copy will only work if you're merging two compatible streams (or if you're merging them into an mkv wrapper that basically supports streams of any type).

If your video is encoded with h264 (.mp4) your audio needs to be encoded with aac to copy both streams into a new .mp4 without re-encoding.

If your video is encoded with vp8 or vp9 (.webm) your audio needs to be encoded with either opus or vorbis to copy both streams into a new .webm without re-encoding.
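The compatibility rules above can be written down as a small lookup. This is only a sketch of the rules as stated in this comment, not an exhaustive list of what each container supports. The codec names are assumed to be short lowercase names (the kind `ffprobe` reports as `codec_name`), and `canStreamCopy` is a hypothetical helper.

```javascript
// Codecs each container accepts for a straight stream copy, following the
// rules stated in this thread. `null` means "accept any codec" (mkv).
const COPY_RULES = new Map([
  ["mp4", { video: ["h264"], audio: ["aac"] }],
  ["webm", { video: ["vp8", "vp9"], audio: ["opus", "vorbis"] }],
  ["mkv", null],
]);

// Return true if both streams can be copied into `container` unchanged.
function canStreamCopy(container, videoCodec, audioCodec) {
  if (!COPY_RULES.has(container)) return false; // unknown container
  const rule = COPY_RULES.get(container);
  if (rule === null) return true; // mkv takes anything
  return rule.video.includes(videoCodec) && rule.audio.includes(audioCodec);
}
```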

The technique the example ffmpeg.js script uses to merge audio and video is to always copy the video codec and always re-encode the audio (it includes -c:v copy but doesn't specify the audio encoding, which means ffmpeg will always re-encode the audio to a compatible format).

This isn't a terrible strategy because:

It will produce a playable video every time.
Re-encoding audio takes an order of magnitude less time than re-encoding video.
It's simple. You don't need a first pass of ffprobe to check that the streams are compatible.

You could make sure you never re-encode by selecting compatible video and audio streams at download time.
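One way to do that selection, sketched under some assumptions: the format objects are the ones in `info.formats` from `ytdl.getInfo()`, with ytdl-core's `container`, `hasVideo`, `hasAudio`, `height`, `bitrate`, and `audioBitrate` fields, and two formats in the same container are treated as copy-compatible. `pickCompatiblePair` is a hypothetical helper, not a ytdl-core function.

```javascript
// Pick the tallest video-only format and the highest-bitrate audio-only
// format that share one container, so ffmpeg can copy both streams.
// Returns null if the container lacks either kind of format.
function pickCompatiblePair(formats, container = "mp4") {
  const inContainer = formats.filter((f) => f.container === container);

  const video = inContainer
    .filter((f) => f.hasVideo && !f.hasAudio)
    // Tallest first; break ties by overall bitrate.
    .sort((a, b) => (b.height || 0) - (a.height || 0) || (b.bitrate || 0) - (a.bitrate || 0))[0];

  const audio = inContainer
    .filter((f) => f.hasAudio && !f.hasVideo)
    .sort((a, b) => (b.audioBitrate || 0) - (a.audioBitrate || 0))[0];

  return video && audio ? { video, audio } : null;
}
```

The two chosen formats could then be passed to `ytdl.downloadFromInfo(info, { format: pair.video })` and `ytdl.downloadFromInfo(info, { format: pair.audio })`.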

richardabear commented 1 year ago

To add to @christiangenco's answer: in my experience, or at least as I understand it, YouTube takes your input video (the file you upload) and re-encodes it into those exact formats (h264/h265 for video, aac for audio). So when using the ffmpeg method you can just use copy all the time (at least in my experience).

christiangenco commented 1 year ago

Yup 👆

The trouble is that YouTube also re-encodes your video into webm and opus, so when I ask node-ytdl-core for bestaudio and bestvideo it often gives me two incompatible formats.

Kinuseka commented 1 year ago

I recommend avoiding opus for audio and using mp4a.40.2 (AAC) instead if you are planning to use the .mp4 format.

MP4 players usually do not support the 48 kHz sample rate that opus uses.

luciano-repetti commented 10 months ago

How can I merge video and audio to output an mp4?

My code:

        const audioStream = ytdl(URL as string, {
          filter: 'audioonly',
          quality: 'highestaudio',
        });

        const videoStream = ytdl(URL as string, {
          filter: (format) => format.hasVideo && (format.container === 'mp4' || format.container === 'webm'),
          quality: qualityOption,
        });
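One possible approach, loosely following the repo's example/ffmpeg.js: pipe both ytdl streams into an ffmpeg child process over extra file descriptors and let it write the .mp4. This is a sketch, not a tested solution. It assumes `ffmpeg` is on the PATH and that both streams are already MP4-compatible (for instance, both filters restricted to `format.container === 'mp4'`), since `-c copy` does no conversion. `buildPipeMergeArgs` and `mergeStreams` are hypothetical names.

```javascript
const { spawn } = require("node:child_process");

// ffmpeg arguments: read audio from fd 3 and video from fd 4, copy both
// streams unchanged, and write the result to `outPath`.
function buildPipeMergeArgs(outPath) {
  return [
    "-y",                 // overwrite the output file if it exists
    "-loglevel", "error", // only print errors
    "-i", "pipe:3",       // input 0: audio stream
    "-i", "pipe:4",       // input 1: video stream
    "-map", "0:a",
    "-map", "1:v",
    "-c", "copy",         // no re-encoding
    outPath,
  ];
}

// Pipe two readable streams (such as the ytdl() results) into ffmpeg and
// resolve with the output path once ffmpeg exits successfully.
function mergeStreams(audioStream, videoStream, outPath) {
  return new Promise((resolve, reject) => {
    const ffmpeg = spawn("ffmpeg", buildPipeMergeArgs(outPath), {
      // fds 0-2 are stdin/stdout/stderr; fds 3 and 4 are the extra pipes.
      stdio: ["inherit", "inherit", "inherit", "pipe", "pipe"],
    });
    ffmpeg.on("error", reject);
    ffmpeg.on("close", (code) =>
      code === 0 ? resolve(outPath) : reject(new Error(`ffmpeg exited with code ${code}`))
    );
    audioStream.pipe(ffmpeg.stdio[3]);
    videoStream.pipe(ffmpeg.stdio[4]);
  });
}
```

With the two streams from the snippet above, a call might look like `await mergeStreams(audioStream, videoStream, "out.mp4")`. The output goes to a file rather than to stdout because ffmpeg's MP4 muxer needs a seekable output by default.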

micaelsgarcez commented 5 months ago

> > What I am more concerned about is this: when doing it this way with ffmpeg,
> >
> > does it involve re-encoding (which usually takes quite long: for a 10-minute video, 2 to 5 minutes), or
> >
> > does it only involve putting the two files into one file (usually just copying two data chunks into one file, which is super fast: for a 10-minute video, 2 seconds)?
> >
> > Which one is it?
>
> The question is not about which one it is. The question is about how ffmpeg does it and, naturally, if a job can be done in 2 seconds, I don't want to spend 2 to 5 minutes doing it.

@nonopolarity Did you manage to resolve this issue? I have the same problem: I need the files merged quickly (in less than 5 seconds).

I tried using the files separately with JavaScript in the browser, but Safari limits how much media can run at once, and that ended up breaking my application.

tookender commented 5 days ago

> How can I merge video and audio to output an mp4?
>
> My code:
>
>         const audioStream = ytdl(URL as string, {
>           filter: 'audioonly',
>           quality: 'highestaudio',
>         });
>
>         const videoStream = ytdl(URL as string, {
>           filter: (format) => format.hasVideo && (format.container === 'mp4' || format.container === 'webm'),
>           quality: qualityOption,
>         });

Did you manage to resolve it?

fent / node-ytdl-core

Is it true we have to combine the video and audio files using ffmpeg or the python or JS ffmpeg port? #1220