ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense
131.39k stars 9.96k forks source link

Wrong string length for playlist index, when using --playlist-start #24893

Open ljubex opened 4 years ago

ljubex commented 4 years ago

Checklist

Verbose log

➜  tmp youtube-dl -v --extract-audio --audio-format mp3 --yes-playlist --playlist-start 79 -o "%(playlist_index)s - %(title)s.%(ext)s" https://www.youtube.com/watch\?v\=Be6l_Tb2lUA\&list\=PLp-zVrvnok6yOuqRKvZy25Vi9g7FLVNVy
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', '--extract-audio', '--audio-format', 'mp3', '--yes-playlist', '--playlist-start', '79', '-o', '%(playlist_index)s - %(title)s.%(ext)s', 'https://www.youtube.com/watch?v=Be6l_Tb2lUA&list=PLp-zVrvnok6yOuqRKvZy25Vi9g7FLVNVy']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2020.03.24
[debug] Python version 3.5.3 (CPython) - Linux-4.19.66-v7+-armv7l-with-debian-9.11
[debug] exe versions: avconv 3.2.14-1, avprobe 3.2.14-1, ffmpeg 3.2.14-1, ffprobe 3.2.14-1, rtmpdump 2.4
[debug] Proxy map: {}
[youtube:playlist] Downloading playlist PLp-zVrvnok6yOuqRKvZy25Vi9g7FLVNVy - add --no-playlist to just download video Be6l_Tb2lUA
[youtube:playlist] PLp-zVrvnok6yOuqRKvZy25Vi9g7FLVNVy: Downloading webpage
[download] Downloading playlist: Rokeri S Moravu - Spotovi
[youtube:playlist] PLp-zVrvnok6yOuqRKvZy25Vi9g7FLVNVy: Downloading page #1
[youtube:playlist] playlist Rokeri S Moravu - Spotovi: Downloading 69 videos
[download] Downloading video 1 of 69
[youtube] U0SK0ryOszA: Downloading webpage
[youtube] {18} signature length 108, html5 player vflset
[youtube] {135} signature length 108, html5 player vflset
[youtube] {244} signature length 108, html5 player vflset
[youtube] {134} signature length 104, html5 player vflset
[youtube] {243} signature length 104, html5 player vflset
[youtube] {133} signature length 104, html5 player vflset
[youtube] {242} signature length 104, html5 player vflset
[youtube] {160} signature length 104, html5 player vflset
[youtube] {278} signature length 108, html5 player vflset
[youtube] {140} signature length 104, html5 player vflset
[youtube] {249} signature length 104, html5 player vflset
[youtube] {250} signature length 108, html5 player vflset
[youtube] {251} signature length 108, html5 player vflset
[debug] Invoking downloader on 'https://r5---sn-5go7yner.googlevideo.com/videoplayback?expire=1587377136&ei=kB-dXt3LJvrU7QSq3Lj4DQ&ip=185.65.135.181&id=o-APg0To0Qm2GKuB6TEZ2F9YfUDL9pciENoDjeJYMfFCSp&itag=251&source=youtube&requiressl=yes&mh=D2&mm=31%2C29&mn=sn-5go7yner%2Csn-5goeen7r&ms=au%2Crdu&mv=m&mvi=4&pl=24&initcwndbps=1581250&vprv=1&mime=audio%2Fwebm&gir=yes&clen=4114651&dur=245.361&lmt=1541964544505413&mt=1587355473&fvip=5&keepalive=yes&c=WEB&txp=5411222&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cgir%2Cclen%2Cdur%2Clmt&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=ALrAebAwRQIhAMZC4fWu9W4hMgal2GX5zbH-DL9z4vThocWLl_lf6IAHAiBRxGptUN0X9PLKS8cxt1z09Nqg8IjKTKzTHmzX-s2aew%3D%3D&sig=AJpPlLswRQIgQH98dR5lgX7SXdUk3we71aPQ6XEQr4hBirIwJ-ZLUcsCIQCumUyPy_aZ78tBNST4pGMuu2GPSOJuyO8nD65_raTF2w==&ratebypass=yes'
[download] Destination: 79 - Rokeri s Moravu - Jao, Druze Milicioneru - (Official Video).webm
[download] 100% of 3.92MiB in 00:02
[debug] ffmpeg command line: ffprobe -show_streams 'file:79 - Rokeri s Moravu - Jao, Druze Milicioneru - (Official Video).webm'
[ffmpeg] Destination: 79 - Rokeri s Moravu - Jao, Druze Milicioneru - (Official Video).mp3
[debug] ffmpeg command line: ffmpeg -y -loglevel repeat+info -i 'file:79 - Rokeri s Moravu - Jao, Druze Milicioneru - (Official Video).webm' -vn -acodec libmp3lame -q:a 5 'file:79 - Rokeri s Moravu - Jao, Druze Milicioneru - (Official Video).mp3'

Description

When downloading whole playlist, the length of the string that holds the index is calculated based on total number of items in the list.

For example, if I download the list above, the indices will be named: 001, 002, 003, etc (the size of three).

But if I continue from somewhere in the middle (like 79 in the example), the length of the index string will be calculated based on the number of remaining items in the list, not the total number.

So instead of 079, 080, 081, I'm getting 79, 80, 81....

willbeaufoy commented 4 years ago

I looked into this a bit, I believe it may be tricky to fix because when downloading partial playlists we do not know the total length of the playlist because ie_entries is a generator.

Of course we could convert the whole generator into a list, and then slice it after that, in order to get the total playlist length, but I'm not familiar enough with the codebase atm to know whether this would lead to too much unnecessary processing and/or network calls.

ljubex commented 4 years ago

But all the information is there, right? I mean, it's already working when downloading the full list (string length is correctly determined). In this case the list length is calculated as what's remaining. So, when calculating the string length, instead of using just "list length", it should be "list length + index" instead.

Haven't looked at the code yet though :)

willbeaufoy commented 4 years ago

Yes for your specific case (when --playlist-start alone is set) it is possible, as you say you can do something like list length + index. The actual code I came up with to replace template_dict['n_entries'] here was

template_dict['playlist_index'] + template_dict['n_entries'] - self._num_downloads

This would give the full length of the playlist, e.g. for video 79 it would be 79 + 69 - 1 = 147.

However this would not work in other cases, e.g. --playlist-start 79 --playlist-end 82. In this case template_dict['n_entries' is only 4, so while we have the correct playlist_index, we have no way of knowing the total number of entries in the playlist.

That's why I think in order to fix this properly it would have to be done in the bit of code I linked to in my previous comment.

ljubex commented 4 years ago

Well, it really depends what you want to dictate the string length. To me it makes perfect sense to have string of a length of 3, even if you're picking indices from 79 to 82 (because the whole list has 147 items).

For example, file names associated with indices 79 - 82 will be named the same, no matter whether I download the whole list, or I just pick indices 79 - 82. I'd personally prefer that behavior.

Use case: I download the whole list first. Then tomorrow I just re-download few items. I'd like that new files are formatted the same.

ljubex commented 4 years ago

But again, that's just a personal preference :)

willbeaufoy commented 4 years ago

Yes exactly, I agree that in the 79-82 case we want a string length of three because the total number of videos is 147. But there is no way for the code to know the total number of videos in the playlist at the point it calculates the string length, because all it knows is that it is downloading 4 videos with indices 79-82. Therefore for all it knows the total number of videos in the playlist could be 99, 999 or 9999 etc.