iffy-pi / ytdl-server

Server which provides a REST API for downloading YouTube videos

Actual Implementation #1

Open iffy-pi opened 2 months ago

iffy-pi commented 2 months ago

This idea came from constantly having to go to my laptop to download sections of a YouTube video using yt-dlp. I found out that yt-dlp provides a Python module that can download YouTube videos into memory rather than to a file on disk.

This means I could build a simple Flask server that handles YouTube download requests and deploy it to Vercel without having to write any data to their server.
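A rough sketch of what "serve the bytes straight from memory" could look like. To stay dependency-free it is written as a plain WSGI callable rather than a Flask app (Flask apps are WSGI apps too), and `fetch_video_bytes` is a hypothetical placeholder for the real yt-dlp download, not something from this repo.

```python
from urllib.parse import parse_qs


def fetch_video_bytes(video_id: str) -> bytes:
    """Hypothetical placeholder: the real version would run yt-dlp into a buffer."""
    return b"fake video bytes for " + video_id.encode()


def app(environ, start_response):
    """Minimal WSGI app: a request with ?id=<video id> returns the bytes from memory."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    video_id = query.get("id", [""])[0]
    if not video_id:
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"missing id parameter"]

    body = fetch_video_bytes(video_id)  # never touches the filesystem
    start_response("200 OK", [
        ("Content-Type", "video/mp4"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

In Flask itself, the equivalent would probably be returning `send_file` with an `io.BytesIO` wrapped around the bytes, but that is not tested here.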

The code for downloading into memory should be the following (sourced from this GitHub issue: https://github.com/yt-dlp/yt-dlp/issues/3298):

from yt_dlp import YoutubeDL
from contextlib import redirect_stdout
from pathlib import Path
import io

youtube_id = "some-video-id"

ctx = {
    "outtmpl": "-",
    'logtostderr': True
}

buffer = io.BytesIO()
with redirect_stdout(buffer), YoutubeDL(ctx) as foo:
    foo.download([youtube_id])

# write out the buffer for demonstration purposes
Path(f"{youtube_id}.mp4").write_bytes(buffer.getvalue())

iffy-pi commented 2 months ago

Python package for yt-dlp: https://pypi.org/project/yt-dlp/

iffy-pi commented 2 months ago

The code from my initial comment does download videos to a buffer, but I run into an issue when I try to download a specific section of a YouTube video, which is my most common use case.

The first thing I did was figure out the dictionary key-value pair to use when defining the video section to download.

I combined the new parameter values with the ones provided in the code (I kept the value of outtmpl from the original because I knew that it redirects the output to stdout).
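For reference, here is a small sketch of why the stdout trick can work at all: `redirect_stdout` swaps `sys.stdout` for the buffer, so any Python code that writes bytes to `sys.stdout` ends up in memory. The `write_like_a_downloader` function is a hypothetical stand-in; how yt-dlp actually writes its output is not shown in this thread.

```python
import io
import sys
from contextlib import redirect_stdout


def write_like_a_downloader(data: bytes) -> None:
    """Hypothetical stand-in for a downloader whose output target is stdout."""
    # A real text-mode stdout exposes a binary layer as .buffer;
    # an io.BytesIO has no .buffer attribute and accepts bytes directly.
    stream = getattr(sys.stdout, "buffer", sys.stdout)
    stream.write(data)


buffer = io.BytesIO()
with redirect_stdout(buffer):
    # Inside the block, sys.stdout is the BytesIO, so the bytes land in memory
    write_like_a_downloader(b"\x00\x01video-bytes")

captured = buffer.getvalue()
```

The key point is that this only covers writes that go through Python's `sys.stdout` object in the same process.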

I passed this dictionary to the YoutubeDL constructor, but an issue occurs: the bytes of the downloaded video seem to be written to the real stdout (the console) rather than to the buffer. This results in an empty buffer and therefore no file content.

With more testing I was able to isolate the error to the simplest case:

from yt_dlp import YoutubeDL
from yt_dlp.utils import download_range_func
from contextlib import redirect_stdout
from pathlib import Path
import io

youtube_id = "GDRyigWvUFg"

# Youtube DL parameters
# Was created using convert_cli_to_embed.py
# See Issue #1 for more details
ctx = {
    'download_ranges': download_range_func([], [[10, 20]]), # Download only the section from 10s to 20s
    "outtmpl": {'default': "-"}, # Download to stdout not to file
    'quiet': True, # Quiet, no logs will be written
    'logtostderr': True, # Write logs to stderr, need this despite quiet because of unrelated error
}

# Redirect stdout to the buffer and download video
buffer = io.BytesIO()
with redirect_stdout(buffer), YoutubeDL(ctx) as foo:
    foo.download([youtube_id])

# Buffer should hold the video contents; write them to a file as a check
Path("output3.mp4").write_bytes(buffer.getvalue())

The error occurs specifically when the downloader output is set to stdout and a video section is being downloaded. More precisely, it happens when the outtmpl key is set to - and the dictionary passed to the constructor also contains a download_ranges key.
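One possible explanation, which is an assumption and not confirmed anywhere in this thread: if section downloads are handed off to an external process (for example ffmpeg), that process writes to the OS-level stdout and never sees Python's `sys.stdout`. The sketch below only demonstrates that general Python behaviour with a throwaway child process; it does not call yt-dlp.

```python
import io
import subprocess
import sys
from contextlib import redirect_stdout

# Throwaway child process that writes a marker to its own stdout
CHILD_CMD = [sys.executable, "-c", "import sys; sys.stdout.write('from child;')"]


def run_with_redirect() -> bytes:
    """redirect_stdout captures Python-level writes, but not the child's output."""
    buffer = io.BytesIO()
    with redirect_stdout(buffer):
        sys.stdout.write(b"from python;")       # goes into the buffer
        subprocess.run(CHILD_CMD, check=True)   # inherits the real stdout (fd 1)
    return buffer.getvalue()


def run_with_pipe() -> bytes:
    """Capturing the child's stdout needs an explicit pipe."""
    result = subprocess.run(CHILD_CMD, check=True, stdout=subprocess.PIPE)
    return result.stdout


if __name__ == "__main__":
    print(run_with_redirect())  # the child's marker shows up on the console instead
    print(run_with_pipe())
```

If this is what is happening here, one workaround to try would be running the yt-dlp command line as a subprocess with `stdout=subprocess.PIPE` and reading the bytes from the pipe.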

I've tried redirecting this output to a binary file, but I haven't been able to get the encoding right for it to be recognized as a proper video file. Either way, this does not solve the issue of the video not being written to the buffer.
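When a redirected file is not recognized as a video, checking its first bytes can help tell whether it is simply a different container than the .mp4 name suggests or whether the bytes were altered along the way. This is only a rough diagnostic sketch based on well-known container signatures; `sniff_container` is a hypothetical helper, not part of yt-dlp.

```python
def sniff_container(data: bytes) -> str:
    """Guess the container format from the leading bytes (rough heuristic)."""
    # MP4 files normally carry an 'ftyp' box type at byte offset 4
    if len(data) >= 8 and data[4:8] == b"ftyp":
        return "mp4"
    # Matroska/WebM files start with the EBML magic number
    if data[:4] == b"\x1a\x45\xdf\xa3":
        return "matroska/webm"
    # MPEG-TS is a sequence of 188-byte packets, each starting with 0x47
    if len(data) > 188 and data[0] == 0x47 and data[188] == 0x47:
        return "mpeg-ts"
    return "unknown"
```

For example, `sniff_container(Path("output3.mp4").read_bytes())` could be run on the redirected file, assuming `Path` is imported from `pathlib`.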

iffy-pi commented 2 months ago

I submitted an issue to yt-dlp because I am not sure what else to do: https://github.com/yt-dlp/yt-dlp/issues/11051. In the meantime, I will wait and see if I can think of something else.