yt-dlp / yt-dlp

A feature-rich command-line audio/video downloader

[youtube] chapter extraction with ambiguous chapters in description #520

Closed Lesmiscore closed 3 years ago

Lesmiscore commented 3 years ago

Verbose log

$ yt-dlp -N 16 --downloader-args "ffmpeg:-loglevel warning -stats" --downloader-args "aria2c:-q" --external-downloader aria2c --trim-filename 150  --force-ipv4 --sleep-interval 2 --sleep-requests 2 --max-sleep-interval 5 --ignore-errors --no-continue --no-overwrites --download-archive archive.log --add-metadata --write-description --write-info-json --write-annotations --write-thumbnail --embed-thumbnail --all-subs --embed-subs --output "%(playlist)s - (%(uploader)s)/%(title)s [%(id)s].%(ext)s" --merge-output-format "mkv" _8GUNJZwAQ0 -v
[debug] Command-line config: ['-N', '16', '--downloader-args', 'ffmpeg:-loglevel warning -stats', '--downloader-args', 'aria2c:-q', '--external-downloader', 'aria2c', '--trim-filename', '150', '--force-ipv4', '--sleep-interval', '2', '--sleep-requests', '2', '--max-sleep-interval', '5', '--ignore-errors', '--no-continue', '--no-overwrites', '--download-archive', 'archive.log', '--add-metadata', '--write-description', '--write-info-json', '--write-annotations', '--write-thumbnail', '--embed-thumbnail', '--all-subs', '--embed-subs', '--output', '%(playlist)s - (%(uploader)s)/%(title)s [%(id)s].%(ext)s', '--merge-output-format', 'mkv', '_8GUNJZwAQ0', '-v']
[debug] Loading archive file 'archive.log'

[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] yt-dlp version 2021.07.17.1626506853 (zip)
[debug]        from commit d05ae24
[debug] Python version 3.8.10 (CPython 64bit) - Linux-5.8.0-1031-oracle-aarch64-with-glibc2.29
[debug] exe versions: ffmpeg 4.2.4, ffprobe 4.2.4
[debug] Proxy map: {}
[debug] [youtube] Extracting URL: _8GUNJZwAQ0
[youtube] _8GUNJZwAQ0: Downloading webpage
[youtube] [debug] Fetching webpage from https://www.youtube.com/watch?v=_8GUNJZwAQ0&bpctr=9999999999&has_verified=1
[youtube] Sleeping 2.0 seconds ...
[youtube] _8GUNJZwAQ0: Downloading android player API JSON
[youtube] [debug] Fetching webpage from https://www.youtube.com/youtubei/v1/player?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8
[youtube] Sleeping 2.0 seconds ...
[youtube] _8GUNJZwAQ0: Downloading MPD manifest
[youtube] [debug] Fetching webpage from https://manifest.googlevideo.com/api/manifest/dash/expire/1626531318/ei/lpHyYKP_NcfhqQHr1oP4DA/ip/152.70.87.46/id/ffc194349670010d/source/youtube/requiressl/yes/playback_host/r2---sn-3pm7dn7y.googlevideo.com/mh/xh/mm/31%2C29/mn/sn-3pm7dn7y%2Csn-3pm7kn76/ms/au%2Crdu/mv/m/mvi/2/pl/23/tx/24027932/txs/24027931%2C24027932/hfr/1/as/fmp4_audio_clear%2Cfmp4_sd_hd_clear/initcwndbps/8393750/vprv/1/mt/1626509344/fvip/2/itag_bl/376%2C377%2C384%2C385%2C409%2C410%2C411%2C412%2C557%2C558%2C612%2C613%2C617%2C619%2C623%2C628%2C655%2C656%2C660%2C662%2C666%2C671/keepalive/yes/fexp/24001373%2C24007246/itag/0/sparams/expire%2Cei%2Cip%2Cid%2Csource%2Crequiressl%2Ctx%2Ctxs%2Chfr%2Cas%2Cvprv%2Citag/sig/AOq0QJ8wRAIgdotx605EmBlNGB0KAr8OaN9_sxWu5rNwzWAspaN4Uv4CIC1RmEBhBE8cfU2QQvOMNaEwrUPLpFXABf2K_jeF9DZb/lsparams/playback_host%2Cmh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps/lsig/AG3C_xAwRQIgISOzyhMg7R4-R6kQCiQR02AcpWlaur6OUspM4nXt_XICIQDCMuh4gaft9Fi4VbQkGMOPF1WhNnQgTuiM_te8xNgkIg%3D%3D
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, vcodec:vp9.2(10), acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] _8GUNJZwAQ0: Downloading 1 format(s): 137+251
[info] Video description is already present
WARNING: There are no annotations to write.
[info] Video metadata is already present
[youtube] _8GUNJZwAQ0: Thumbnail is already present
[debug] locking youtube__8GUNJZwAQ0.lock
[download] NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].mkv has already been downloaded
[debug] unlocking youtube__8GUNJZwAQ0.lock
[debug] ffprobe command line: ffprobe -hide_banner -show_format -show_streams -print_format json 'file:NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].mkv'
[Metadata] Adding metadata to "NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].mkv"
[debug] ffmpeg command line: ffmpeg -y -loglevel repeat+info -i 'file:NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].mkv' -i 'file:NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].meta' -map 0 -dn -c copy -metadata 'title=Computational Finance: Lecture 2/14 (Stock, Options and Stochastics)' -metadata date=20210217 -metadata 'description=Computational Finance 
Lecture 2- Stock, Options and Stochastics
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
This course is based on the book:
"Mathematical Modeling and Computation in Finance: With Exercises and Python and MATLAB Computer Codes", by C.W. Oosterlee and L.A. Grzelak, World Scientific Publishing, 2019.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
- Codes and the slides can be found at: https://github.com/LechGrzelak/Computational-Finance-Course
- See https://quantfinancebook.com/ for more details and for additional materials.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
0:00 Introduction
3:21 Trading of Options and Hedging
20:13 Commodities
25:59 Currencies and Cryptos
38:22 Value of Call and Put Options and Hedging
1:00:57 Modeling of Asset Prices and Randomness
1:10:20 Stochastic Processes for Stock Prices
1:27:27 Ito’s Lemma for Solving SDEs
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
CONTENT OF THIS COURSE:
Lecture 1- Introduction and Overview of Asset Classes
***** Lecture 2- Stock, Options and Stochastics
Lecture 3- Option Pricing and Simulation in Python
Lecture 4- Implied Volatility
Lecture 5- Jump Processes
Lecture 6- Affine Jump Diffusion Processes
Lecture 7- Stochastic Volatility Models
Lecture 8- Fourier Transformation for Option Pricing
Lecture 9- Monte Carlo Simulation
Lecture 10- Monte Carlo Simulation of the Heston Model
Lecture 11- Hedging and Monte Carlo Greeks
Lecture 12- Forward Start Options and Model of Bates
Lecture 13- Exotic Derivatives
Lecture 14- Summary
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Corrections:
- Around 9:20- 9:40 correct statement is: call option price DECREASES for increasing strike.' -metadata 'synopsis=Computational Finance 
Lecture 2- Stock, Options and Stochastics
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
This course is based on the book:
"Mathematical Modeling and Computation in Finance: With Exercises and Python and MATLAB Computer Codes", by C.W. Oosterlee and L.A. Grzelak, World Scientific Publishing, 2019.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
- Codes and the slides can be found at: https://github.com/LechGrzelak/Computational-Finance-Course
- See https://quantfinancebook.com/ for more details and for additional materials.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
0:00 Introduction
3:21 Trading of Options and Hedging
20:13 Commodities
25:59 Currencies and Cryptos
38:22 Value of Call and Put Options and Hedging
1:00:57 Modeling of Asset Prices and Randomness
1:10:20 Stochastic Processes for Stock Prices
1:27:27 Ito’s Lemma for Solving SDEs
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
CONTENT OF THIS COURSE:
Lecture 1- Introduction and Overview of Asset Classes
***** Lecture 2- Stock, Options and Stochastics
Lecture 3- Option Pricing and Simulation in Python
Lecture 4- Implied Volatility
Lecture 5- Jump Processes
Lecture 6- Affine Jump Diffusion Processes
Lecture 7- Stochastic Volatility Models
Lecture 8- Fourier Transformation for Option Pricing
Lecture 9- Monte Carlo Simulation
Lecture 10- Monte Carlo Simulation of the Heston Model
Lecture 11- Hedging and Monte Carlo Greeks
Lecture 12- Forward Start Options and Model of Bates
Lecture 13- Exotic Derivatives
Lecture 14- Summary
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Corrections:
- Around 9:20- 9:40 correct statement is: call option price DECREASES for increasing strike.' -metadata 'purl=https://www.youtube.com/watch?v=_8GUNJZwAQ0' -metadata 'comment=https://www.youtube.com/watch?v=_8GUNJZwAQ0' -metadata 'artist=Computations in Finance' -map_metadata 1 -attach 'NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].info.json' -metadata:s:2 mimetype=application/json 'file:NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].temp.mkv'
ERROR: ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.3.0-10ubuntu2)
  configuration: --prefix=/usr --extra-version=1ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/aarch64-linux-gnu --incdir=/usr/include/aarch64-linux-gnu --arch=arm64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, matroska,webm, from 'file:NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].mkv':
  Metadata:
    COMPATIBLE_BRANDS: iso6avc1mp41
    MAJOR_BRAND     : dash
    MINOR_VERSION   : 0
    ENCODER         : Lavf58.29.100
  Duration: 01:41:37.82, start: -0.007000, bitrate: 410 kb/s
    Stream #0:0: Video: h264 (High), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 30 fps, 30 tbr, 1k tbn, 60 tbc (default)
    Metadata:
      HANDLER_NAME    : ISO Media file produced by Google Inc.
      DURATION        : 01:41:37.800000000
    Stream #0:1(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      DURATION        : 01:41:37.821000000
[ffmetadata @ 0xaaab0414abe0] Chapter end time 560000 before start 5247000
file:NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].meta: Cannot allocate memory
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/lesmi/yt-dlp/__main__.py", line 24, in <module>
    yt_dlp.main()
  File "/home/lesmi/yt-dlp/yt_dlp/__init__.py", line 737, in main
    _real_main(argv)
  File "/home/lesmi/yt-dlp/yt_dlp/__init__.py", line 727, in _real_main
    retcode = ydl.download(all_urls)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 2822, in download
    res = self.extract_info(
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 1186, in extract_info
    return self.__extract_info(url, ie, download, extra_info, process)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 1193, in wrapper
    return func(self, *args, **kwargs)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 1239, in __extract_info
    return self.process_ie_result(ie_result, download, extra_info)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 1280, in process_ie_result
    ie_result = self.process_video_result(ie_result, download=download)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 2262, in process_video_result
    self.process_info(new_info)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 2792, in process_info
    info_dict = self.post_process(dl_filename, info_dict, files_to_move)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 2932, in post_process
    info = self.run_pp(pp, info)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 2881, in run_pp
    files_to_delete, infodict = pp.run(infodict)
  File "/home/lesmi/yt-dlp/yt_dlp/postprocessor/common.py", line 83, in wrapper
    return func(self, info)
  File "/home/lesmi/yt-dlp/yt_dlp/postprocessor/ffmpeg.py", line 643, in run
    self.run_ffmpeg_multiple_files(
  File "/home/lesmi/yt-dlp/yt_dlp/postprocessor/ffmpeg.py", line 240, in run_ffmpeg_multiple_files
    return self.real_run_ffmpeg(
  File "/home/lesmi/yt-dlp/yt_dlp/postprocessor/ffmpeg.py", line 277, in real_run_ffmpeg
    self.report_error(stderr)
  File "/home/lesmi/yt-dlp/yt_dlp/postprocessor/common.py", line 56, in report_error
    return self._downloader.report_error(text, *args, **kwargs)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 833, in report_error
    self.trouble(error_message, tb)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 789, in trouble
    tb_data = traceback.format_list(traceback.extract_stack())

ERROR: Postprocessing: file:NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].meta: Cannot allocate memory
Traceback (most recent call last):
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 2792, in process_info
    info_dict = self.post_process(dl_filename, info_dict, files_to_move)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 2932, in post_process
    info = self.run_pp(pp, info)
  File "/home/lesmi/yt-dlp/yt_dlp/YoutubeDL.py", line 2881, in run_pp
    files_to_delete, infodict = pp.run(infodict)
  File "/home/lesmi/yt-dlp/yt_dlp/postprocessor/common.py", line 83, in wrapper
    return func(self, info)
  File "/home/lesmi/yt-dlp/yt_dlp/postprocessor/ffmpeg.py", line 643, in run
    self.run_ffmpeg_multiple_files(
  File "/home/lesmi/yt-dlp/yt_dlp/postprocessor/ffmpeg.py", line 240, in run_ffmpeg_multiple_files
    return self.real_run_ffmpeg(
  File "/home/lesmi/yt-dlp/yt_dlp/postprocessor/ffmpeg.py", line 278, in real_run_ffmpeg
    raise FFmpegPostProcessorError(stderr.split('\n')[-1])
yt_dlp.postprocessor.ffmpeg.FFmpegPostProcessorError: file:NA - (Computations in Finance)/Computational Finance - Lecture 2_14 (Stock, Options and Stochastics) [_8GUNJZwAQ0].meta: Cannot allocate memory

Description

https://youtu.be/_8GUNJZwAQ0

Originally reported by @Ashish0804 in Discord.

The error is raised by ffmpeg, but the root cause is YoutubeIE's chapter extraction.

Here's the chapter metadata generated by yt-dlp:

$ cat NA\ -\ \(Computations\ in\ Finance\)/Computational\ Finance\ -\ Lecture\ 2_14\ \(Stock\,\ Options\ and\ Stochastics\)\ \[_8GUNJZwAQ0\].meta 
;FFMETADATA1
[CHAPTER]
TIMEBASE=1/1000
START=560000
END=0
title=Around. correct statement is: call option price DECREASES for increasing strike.
[CHAPTER]
TIMEBASE=1/1000
START=0
END=201000
title=Introduction
[CHAPTER]
TIMEBASE=1/1000
START=201000
END=1213000
title=Trading of Options and Hedging
[CHAPTER]
TIMEBASE=1/1000
START=1213000
END=1559000
title=Commodities
[CHAPTER]
TIMEBASE=1/1000
START=1559000
END=2302000
title=Currencies and Cryptos
[CHAPTER]
TIMEBASE=1/1000
START=2302000
END=3657000
title=Value of Call and Put Options and Hedging
[CHAPTER]
TIMEBASE=1/1000
START=3657000
END=4220000
title=Modeling of Asset Prices and Randomness
[CHAPTER]
TIMEBASE=1/1000
START=4220000
END=5247000
title=Stochastic Processes for Stock Prices
[CHAPTER]
TIMEBASE=1/1000
START=5247000
END=6098000
title=Ito’s Lemma for Solving SDEs

And here are the "real" chapters from the description:

0:00 Introduction
3:21 Trading of Options and Hedging
20:13 Commodities
25:59 Currencies and Cryptos
38:22 Value of Call and Put Options and Hedging
1:00:57 Modeling of Asset Prices and Randomness
1:10:20 Stochastic Processes for Stock Prices
1:27:27 Ito’s Lemma for Solving SDEs

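The chapter starting with "Around. correct ..." is a misdetected one: it comes from the "Around 9:20- 9:40" line under "Corrections:" in the description, not from the chapter list. It yields a chapter whose end time precedes its start time, which ffmpeg rejects.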
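
The broken entry can be spotted mechanically before the file is handed to ffmpeg. The following is only a minimal sketch, not yt-dlp code: `parse_chapters` and `find_invalid_chapters` are hypothetical helpers, and the parser ignores ffmetadata escaping rules for simplicity. It reads the `[CHAPTER]` blocks and reports any chapter whose END is smaller than its START.

```python
def parse_chapters(ffmetadata_text):
    """Collect START/END/title from each [CHAPTER] block (hypothetical helper).

    Simplification: backslash escaping in ffmetadata values is not handled.
    """
    chapters = []
    current = None
    for raw in ffmetadata_text.splitlines():
        line = raw.strip()
        if line == '[CHAPTER]':
            current = {}
            chapters.append(current)
        elif current is not None and '=' in line:
            key, _, value = line.partition('=')
            if key in ('START', 'END'):
                current[key.lower()] = int(value)
            elif key == 'title':
                current['title'] = value
    return chapters


def find_invalid_chapters(chapters):
    """Return the chapters whose end time precedes their start time."""
    return [c for c in chapters if c['end'] < c['start']]


# First two chapters of the .meta file shown above.
SAMPLE = """;FFMETADATA1
[CHAPTER]
TIMEBASE=1/1000
START=560000
END=0
title=Around. correct statement is: call option price DECREASES for increasing strike.
[CHAPTER]
TIMEBASE=1/1000
START=0
END=201000
title=Introduction
"""

bad = find_invalid_chapters(parse_chapters(SAMPLE))
for chapter in bad:
    print(chapter['start'], chapter['end'], chapter['title'])
```

On this excerpt only the "Around. ..." chapter is reported; 560000 ms corresponds to the 9:20 timestamp in the "Corrections:" line.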

Lesmiscore commented 3 years ago

Memo: another misdetected chapter list:

;FFMETADATA1
[CHAPTER]
TIMEBASE=1/1000
START=0
END=201000
title=Introduction
[CHAPTER]
TIMEBASE=1/1000
START=201000
END=1213000
title=Trading of Options and Hedging
[CHAPTER]
TIMEBASE=1/1000
START=1213000
END=1559000
title=Commodities
[CHAPTER]
TIMEBASE=1/1000
START=1559000
END=2302000
title=Currencies and Cryptos
[CHAPTER]
TIMEBASE=1/1000
START=2302000
END=3657000
title=Value of Call and Put Options and Hedging
[CHAPTER]
TIMEBASE=1/1000
START=3657000
END=4220000
title=Modeling of Asset Prices and Randomness
[CHAPTER]
TIMEBASE=1/1000
START=4220000
END=5247000
title=Stochastic Processes for Stock Prices
[CHAPTER]
TIMEBASE=1/1000
START=5247000
END=560000
title=Ito’s Lemma for Solving SDEs
[CHAPTER]
TIMEBASE=1/1000
START=560000
END=6098000
title=Around. correct statement is: call option price DECREASES for increasing strike.
pukkandan commented 3 years ago

The misdetection is caused by YouTube itself, not by yt-dlp. I have added a sanity check on the start_times that fixes this particular case, but it is not possible to catch all misdetections.
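
For illustration, one possible shape of such a sanity check is sketched below. This is an assumption, not the actual yt-dlp change: `parse_timestamp` and `build_chapters` are hypothetical names, and the rule used here (drop an entry whose start time goes backwards or lies beyond the video duration) is just one plausible heuristic. The duration of 6098 seconds is taken from the last END value in the .meta file.

```python
def parse_timestamp(ts):
    """Convert 'M:SS' or 'H:MM:SS' into seconds."""
    seconds = 0
    for part in ts.split(':'):
        seconds = seconds * 60 + int(part)
    return seconds


def build_chapters(entries, duration):
    """Build chapters from (timestamp, title) pairs in detection order.

    Hypothetical heuristic: skip an entry if its start time is past the
    duration or earlier than the previously accepted start time.
    """
    chapters = []
    for ts, title in entries:
        start = parse_timestamp(ts)
        if start > duration:
            continue
        if chapters and start < chapters[-1]['start_time']:
            continue
        chapters.append({'start_time': start, 'title': title})
    # Each chapter ends where the next one starts; the last ends at duration.
    for i, chapter in enumerate(chapters):
        is_last = i + 1 == len(chapters)
        chapter['end_time'] = duration if is_last else chapters[i + 1]['start_time']
    return chapters


# Entries in the order of the second .meta dump: real chapters first,
# then the timestamp picked up from the "Corrections:" line.
ENTRIES = [
    ('0:00', 'Introduction'),
    ('3:21', 'Trading of Options and Hedging'),
    ('20:13', 'Commodities'),
    ('25:59', 'Currencies and Cryptos'),
    ('38:22', 'Value of Call and Put Options and Hedging'),
    ('1:00:57', 'Modeling of Asset Prices and Randomness'),
    ('1:10:20', 'Stochastic Processes for Stock Prices'),
    ('1:27:27', 'Ito’s Lemma for Solving SDEs'),
    ('9:20', 'Around. correct statement is: call option price DECREASES for increasing strike.'),
]

chapters = build_chapters(ENTRIES, duration=6098)
for chapter in chapters:
    print(chapter['start_time'], chapter['end_time'], chapter['title'])
```

This heuristic only helps when the stray timestamp appears after the real chapters, as in the second dump. If it is detected first, as in the first dump, the same rule would keep the stray 9:20 entry and discard the real chapters that start before it, which is in line with the point that not every misdetection can be caught.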