ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

How to use progress_hooks when I use aria2c? #31188

Closed teddy171 closed 2 years ago

teddy171 commented 2 years ago

Question

import youtube_dl

def my_hook(data):
    # Called by youtube-dl with a progress dict during the download
    print(data)

def download_video(content, location):
    ydl_opts = {
        "outtmpl": f"{location}/%(id)s/%(title)s.%(ext)s",
        "progress_hooks": [my_hook],
        # "external_downloader": "aria2c",
        # "external_downloader_args": ["-x 16", "-k 1M", "-c", "-n"]
    }
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download([content])

When I don't use aria2c, the output is

>>> download_video("https://www.bilibili.com/video/BV1v14y1t75o?spm_id_from=333.1007.tianma.1-2-2.click", ".")
[BiliBili] 1v14y1t75o: Downloading webpage
[BiliBili] 1v14y1t75o: Downloading video info page
[download] Destination: ./1v14y1t75o/市面主流固态硬盘品牌,售后大调查!.flv
[download]   0.0% of 48.02MiB at 954.99KiB/s ETA 00:54{'status': 'downloading', 'downloaded_bytes': 1024, 'total_bytes': 50348167, 'tmpfilename': './1v14y1t75o/市面主流固态硬盘品牌,售后大调查!.flv.part', 'filename': './1v14y1t75o/市面主流固态硬盘品牌,售后大调查!.flv', 'eta': 54, 'speed': 977906.9435336976, 'elapsed': 0.39966416358947754, '_eta_str': '00:54', '_percent_str': '  0.0%', '_speed_str': '954.99KiB/s', '_total_bytes_str': '48.02MiB'}
......

But when I use aria2c, the output becomes

>>> download_video("https://www.bilibili.com/video/BV1v14y1t75o?spm_id_from=333.1007.tianma.1-2-2.click", ".")
[BiliBili] 1v14y1t75o: Downloading webpage
[BiliBili] 1v14y1t75o: Downloading video info page
[download] Destination: ./1v14y1t75o/市面主流固态硬盘品牌,售后大调查!.flv

08/20 19:54:48 [NOTICE] Downloading 1 item(s)

08/20 19:54:48 [NOTICE] Allocating disk space. Use --file-allocation=none to disable it. See --file-allocation option in man page for more details.
[#d734cd 47MiB/48MiB(99%) CN:2 DL:1.1MiB]
08/20 19:55:24 [NOTICE] Download complete: ./1v14y1t75o/市面主流固态硬盘品牌,售后大调查!.flv.part

Download Results:
gid   |stat|avg speed  |path/URI
======+====+===========+=======================================================
d734cd|OK  |   1.2MiB/s|./1v14y1t75o/市面主流固态硬盘品牌,售后大调查!.flv.part

Status Legend:
(OK):download completed.
[aria2c] Downloaded 50348167 bytes
[download] 100% of 48.02MiB in 00:36
{'filename': './1v14y1t75o/市面主流固态硬盘品牌,售后大调查!.flv', 'status': 'finished', 'elapsed': 36.419508934020996, 'downloaded_bytes': 50348167, 'total_bytes': 50348167, '_total_bytes_str': '48.02MiB', '_elapsed_str': '00:36'}

The hook is only called once, when the download has completed. How can I get progress updates during the download?

dirkf commented 2 years ago

"Doctor, when I hit myself with the hammer it hurts. What can I do?"

Progress hooks are Python code called during the download. If you use an external downloader you make that responsible for the download until it returns to the calling Python downloader code; meanwhile, your hook ordinarily has no access to the progress of the external downloader.

Maybe aria2c has similar progress display options that you could set in your code (external_downloader_args). Otherwise you'd have to write your own external downloader class to wrap the execution of the downloader, monitoring its output in one thread to create the progress data and reporting it with your hook in another thread.

See also #30444.
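As a rough illustration of the "monitor its output to create the progress data" idea, here is a minimal sketch that turns an aria2c readout line, such as `[#d734cd 47MiB/48MiB(99%) CN:2 DL:1.1MiB]` from the log in the question, into a dict shaped like the ones youtube-dl passes to progress hooks. The function names `parse_aria2c_progress` and `report_progress` are hypothetical, and the line format is inferred only from that log; it is not a stable interface and may differ between aria2c versions or options.

```python
import re

# Byte multipliers for the size suffixes seen in the aria2c readout.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}

# Assumed format, taken from the log in the question:
#   [#d734cd 47MiB/48MiB(99%) CN:2 DL:1.1MiB]
_PROGRESS_RE = re.compile(
    r"\[#[0-9a-f]+ "
    r"(?P<done>[\d.]+)(?P<done_unit>B|KiB|MiB|GiB)/"
    r"(?P<total>[\d.]+)(?P<total_unit>B|KiB|MiB|GiB)"
    r"\((?P<percent>\d+)%\)"
)


def parse_aria2c_progress(line):
    """Return a hook-style progress dict, or None if the line is not a readout."""
    m = _PROGRESS_RE.search(line)
    if m is None:
        return None
    done = float(m.group("done")) * _UNITS[m.group("done_unit")]
    total = float(m.group("total")) * _UNITS[m.group("total_unit")]
    return {
        "status": "downloading",
        "downloaded_bytes": int(done),
        "total_bytes": int(total),
        "_percent_str": m.group("percent") + "%",
    }


def report_progress(lines, hook):
    """Feed each recognised readout line in `lines` to `hook`."""
    for line in lines:
        info = parse_aria2c_progress(line)
        if info is not None:
            hook(info)
```

In a real wrapper, `lines` would come from the aria2c process's standard output, read in a worker thread as described above. The byte counts are only as precise as the rounded values aria2c prints.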

pukkandan commented 2 years ago

It is possible to implement progress hooks for external downloaders, but it's not easy. See https://github.com/yt-dlp/yt-dlp/pull/2475 https://github.com/yt-dlp/yt-dlp/pull/3724

dirkf commented 2 years ago

... it's not easy.

Especially in historic Pythons. However https://github.com/yt-dlp/yt-dlp/pull/3724 doesn't seem an impossible example for OP's problem. I suspect it's way more work than OP would want just to override the aria2c progress output.

teddy171 commented 2 years ago

I tried your suggestion. I used a library called aria2p, which can control aria2c over RPC and lets me monitor the download progress. But youtube-dl raises a KeyError; I think youtube-dl can only use a fixed set of external downloaders.

Here is the code

import youtube_dl

def download_video(content, location):
    ydl_opts = {
        "outtmpl": f"{location}/%(id)s/%(title)s.%(ext)s",
        "external_downloader": "aria2p",
        "external_downloader_args": ["add"]
    }
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download([content])

and it returns

>>> download_video("https://www.bilibili.com/video/BV1nd4y1d7A2?spm_id_from=444.41.list.card_archive.click&vd_source=f8365b6ee9c11d2ce85d1f57e2db1b97&t=11.8", ".")
[BiliBili] 1nd4y1d7A2: Downloading webpage
[BiliBili] 1nd4y1d7A2: Downloading video info page
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in download_video
  File "/Users/teddy/miniconda3/lib/python3.8/site-packages/youtube_dl/YoutubeDL.py", line 2068, in download
    res = self.extract_info(
  File "/Users/teddy/miniconda3/lib/python3.8/site-packages/youtube_dl/YoutubeDL.py", line 808, in extract_info
    return self.__extract_info(url, ie, download, extra_info, process)
  File "/Users/teddy/miniconda3/lib/python3.8/site-packages/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
  File "/Users/teddy/miniconda3/lib/python3.8/site-packages/youtube_dl/YoutubeDL.py", line 847, in __extract_info
    return self.process_ie_result(ie_result, download, extra_info)
  File "/Users/teddy/miniconda3/lib/python3.8/site-packages/youtube_dl/YoutubeDL.py", line 881, in process_ie_result
    return self.process_video_result(ie_result, download=download)
  File "/Users/teddy/miniconda3/lib/python3.8/site-packages/youtube_dl/YoutubeDL.py", line 1692, in process_video_result
    self.process_info(new_info)
  File "/Users/teddy/miniconda3/lib/python3.8/site-packages/youtube_dl/YoutubeDL.py", line 1976, in process_info
    success = dl(filename, info_dict)
  File "/Users/teddy/miniconda3/lib/python3.8/site-packages/youtube_dl/YoutubeDL.py", line 1910, in dl
    fd = get_suitable_downloader(info, self.params)(self, self.params)
  File "/Users/teddy/miniconda3/lib/python3.8/site-packages/youtube_dl/downloader/__init__.py", line 42, in get_suitable_downloader
    ed = get_external_downloader(external_downloader)
  File "/Users/teddy/miniconda3/lib/python3.8/site-packages/youtube_dl/downloader/external.py", line 371, in get_external_downloader
    return _BY_NAME[bn]
KeyError: 'aria2p'

It can't even start downloading.

dirkf commented 2 years ago

True.

Two possible options:

pawamoy commented 2 years ago

aria2p is also a Python library: couldn't you, @teddy171, use it as such rather than as an external process?

In any case, aria2p is just a Python client for the aria2c daemon, so the latter needs to run. The add method will not print progress either.
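Since `add` does not report progress by itself, one option is to poll the download object and call the hook from your own code. The sketch below is written against any object that offers `update()`, `is_complete`, `has_failed`, `completed_length` and `total_length`. Those names are my assumption about what aria2p's download objects expose and should be checked against the aria2p documentation. `poll_download` is a hypothetical helper, not part of youtube-dl or aria2p.

```python
import time


def poll_download(download, hook, interval=1.0):
    """Call `hook` with youtube-dl-style progress dicts until `download` ends.

    `download` is assumed (not verified here) to behave like an aria2p
    download object: update() refreshes its state from the aria2c daemon,
    and the attributes read below reflect that state.
    """
    while True:
        download.update()
        if download.has_failed:
            raise RuntimeError("aria2c reported a failed download")
        finished = download.is_complete
        hook({
            "status": "finished" if finished else "downloading",
            "downloaded_bytes": download.completed_length,
            "total_bytes": download.total_length,
        })
        if finished:
            return
        time.sleep(interval)
```

The media URL would still have to come from youtube-dl, for example by extracting the info without downloading, and be handed to the running aria2c daemon separately.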

dirkf commented 2 years ago

I think we've given a few leads especially https://github.com/ytdl-org/youtube-dl/issues/31188#issuecomment-1221393687, so closing unless there's a specific follow-up.