coursera-dl / edx-dl

A simple tool to download video lectures from edx.org (and other openedx sites)
GNU Lesser General Public License v3.0

ERROR: YouTube said: This video does not exist. #267

Closed yceron closed 9 years ago

yceron commented 9 years ago

I get the following error when trying to download MITx: 15.071x The Analytics Edge, and then the script crashes: ERROR: YouTube said: This video does not exist.

It downloaded the first three videos and the first PDF of the first unit, and then it threw the error.

OS Version: Windows 10, 64bit
Python version: 3.4
youtube-dl version: 2015.7.21

Course url: https://courses.edx.org/courses/course-v1:MITx+15.071x_2a+2T2015/info

Error output:

[download] https://www.youtube.com/watch?v=iTpOrOvxx7o => Downloaded\The_Analytics_Edge\01-Unit_1-_An_Introduction_to_Analytics\03-%(title)s-%(id)s.%(ext)s
Downloading video with URL https://www.youtube.com/watch?v=iTpOrOvxx7o from YouTube.
[youtube] iTpOrOvxx7o: Downloading webpage
[youtube] iTpOrOvxx7o: Downloading video info webpage
ERROR: iTpOrOvxx7o: YouTube said: This video does not exist.
Traceback (most recent call last):
  File "edx-dl.py", line 6, in <module>
    edx_dl.main()
  File "C:\Temp\edx\edx_dl\edx_dl.py", line 903, in main
    download(args, selections, all_units, headers)
  File "C:\Temp\edx\edx_dl\edx_dl.py", line 734, in download
    headers)
  File "C:\Temp\edx\edx_dl\edx_dl.py", line 706, in download_unit
    skip_or_download(res_downloads, headers, args)
  File "C:\Temp\edx\edx_dl\edx_dl.py", line 665, in skip_or_download
    f(url, filename, headers, args)
  File "C:\Temp\edx\edx_dl\edx_dl.py", line 600, in download_url
    download_youtube_url(url, filename, headers, args)
  File "C:\Temp\edx\edx_dl\edx_dl.py", line 638, in download_youtube_url
    execute_command(cmd, args)
  File "C:\Temp\edx\edx_dl\utils.py", line 42, in execute_command
    raise e
  File "C:\Temp\edx\edx_dl\utils.py", line 37, in execute_command
    subprocess.check_call(cmd)
  File "C:\Python\Python34\lib\subprocess.py", line 561, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['youtube-dl', '--ignore-config', '-o', 'Downloaded\The_Analytics_Edge\01-Unit_1-_An_Introduction_to_Analytics\03-%(title)s-%(id)s.%(ext)s', '-f', 'mp4', 'https://www.youtube.com/watch?v=iTpOrOvxx7o']' returned non-zero exit status 1

iemejia commented 9 years ago

If you open the webpage for the video, it seems that it is no longer available: https://www.youtube.com/watch?v=iTpOrOvxx7o

You could try downloading the course with the --prefer-cdn-videos option to see if the video is available there.

Extra note: We have to fix the script to deal with non-zero return values from youtube-dl.

yceron commented 9 years ago

I forgot to mention that I had already tried the CDN option. Unfortunately, I got the same result.

balta2ar commented 9 years ago

> We have to fix the script to deal with non-zero return values from youtube-dl.

@iemejia What's the strategy to deal with them? Simply run youtube-dl up to N times?

iemejia commented 9 years ago

@balta2ar For the moment we are not considering retries (since I think youtube-dl already does that). I was referring more to the fact that if a video is not available (as in this case), the script should continue downloading the rest of the available videos instead of aborting.
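The behavior described above could look roughly like the following. This is only a sketch, assuming each download is an external command run with subprocess.check_call as in the traceback; run_downloads and its ignore_errors parameter are hypothetical names, not edx-dl's actual code.

```python
import subprocess
import sys


def run_downloads(commands, ignore_errors=True):
    """Run each command in order and return the list of commands that failed.

    Hypothetical helper for illustration; not taken from edx-dl.
    """
    failed = []
    for cmd in commands:
        try:
            # check_call raises CalledProcessError on a non-zero exit status.
            subprocess.check_call(cmd)
        except subprocess.CalledProcessError as e:
            if not ignore_errors:
                raise
            # Report the failure and move on to the next download.
            print("Skipping failed download (exit status %d): %s"
                  % (e.returncode, cmd[-1]), file=sys.stderr)
            failed.append(cmd)
    return failed
```

With ignore_errors=True, a youtube-dl invocation that exits with status 1 because the video no longer exists would be reported and skipped, and the remaining commands would still run.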

rbrito commented 9 years ago

Hi.

On Aug 03 2015, yceron wrote:

> I tried with the CDN option, I forgot to add that, unfortunately got the same result.

Please use the --ignore-errors option when invoking the script.

I'm still unsure what should be a sane (read: "default") behavior here. :(

Some errors are mostly benign (like the one above, where youtube-dl says that the video doesn't exist anymore), but others, such as SSL errors, usually indicate serious problems, and I'm not sure we want to expose our users to those by ignoring them.

I have not used the script for a few weeks now, but as I introduced these changes (i.e., the --ignore-errors option), I saw how inconvenient our internal downloader is, especially for large files. I think (but I am not sure) that the Python function we use may hold the entire downloaded content in memory, which is bad on many counts.
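For illustration, one way to avoid holding a whole file in memory is to copy the response to disk in fixed-size chunks. The sketch below uses only the standard library; copy_in_chunks and download_to_file are hypothetical names, and this is not a description of what edx-dl's internal downloader does.

```python
import urllib.request


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy from file-like src to file-like dst; return the bytes copied.

    At most chunk_size bytes are held in memory at any time.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def download_to_file(url, filename, headers=None):
    """Stream url into filename without buffering the whole response."""
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request) as response, \
            open(filename, "wb") as out:
        return copy_in_chunks(response, out)
```

This does not handle resuming interrupted downloads, which is one of the reasons given here for preferring an external downloader.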

If our internal downloader fails, then we have little control over it (especially for resuming downloads etc.). That's the reason why I pushed a super dirty change to the branch use-external-downloader:

https://github.com/coursera-dl/edx-dl/tree/use-external-downloader

As I like aria2c, that's what I use here, but I am thinking of merging the download code from coursera-dl into edx-dl.

And the internal downloader from coursera-dl could use some extra features, since we can't really expect users to have an external downloader at their disposal, especially if they are using Windows.
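As a rough sketch of that trade-off, the script could use aria2c when it is installed and fall back to the internal downloader otherwise. The function names below are hypothetical and are not taken from the use-external-downloader branch; the aria2c options used (--continue, --dir, --out) are standard aria2c command-line options.

```python
import shutil
import subprocess


def build_aria2c_command(url, directory, filename):
    """Build an aria2c command line that resumes partial downloads."""
    return [
        "aria2c",
        "--continue=true",       # resume a partially downloaded file
        "--dir=" + directory,    # target directory
        "--out=" + filename,     # file name relative to the directory
        url,
    ]


def pick_downloader(which=shutil.which):
    """Return "aria2c" if it is found on PATH, otherwise "internal"."""
    return "aria2c" if which("aria2c") else "internal"


def download_with_aria2c(url, directory, filename):
    """Run aria2c; raises CalledProcessError on a non-zero exit status."""
    subprocess.check_call(build_aria2c_command(url, directory, filename))
```

Checking with shutil.which first means that users without aria2c, for example on a default Windows installation, would still get a working download path.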

Anyway, I hope that this gives some of my perspective on the problem to other members, collaborators and users of the project.

Regards,

Rogério Brito

yceron commented 9 years ago

Using the --ignore-errors option worked without any problems. Thanks!

iemejia commented 9 years ago

I am closing this. Rogério, maybe it would be a good idea to mention the --ignore-errors option somewhere in the troubleshooting doc. [Speaking of getting old and amnesic, I forgot this option too ;)]