dgorissen / coursera-dl

A script for downloading course material (videos, PDFs, quizzes, etc.) from coursera.org
http://dirkgorissen.com/2012/09/07/coursera-dl-a-coursera-download-script/
GNU General Public License v3.0

Does not handle invalid URL well #120

Closed by bretmckee 10 years ago

bretmckee commented 10 years ago

I am taking introastro-002, and it appears to reference a non-existent URL, which causes a crash with a stack backtrace (see below). Since there is only one such failure, I believe I have gotten everything else by running both forward and reverse (-r) downloads.


introastro-002\06_Week6-_Stars\08_Week6-_8_Astrometry.doc already downloaded
Downloading: introastro-002\06_Week6-_Stars\08_Week6-_8_Astrometry.htm
Downloading http://www.astroscience.org/abdul-ahad/astrometry.htm -> introastro-002\06_Week6-_Stars\08_Week6-_8_Astrometry.htm
Starting new HTTP connection (1): www.astroscience.org
Traceback (most recent call last):
  File "C:\Users\bretm\Coursera\coursera-dl\coursera-dl", line 6, in <module>
    coursera_dl.main()
  File "C:\Users\bretm\Coursera\coursera-dl\coursera\coursera_dl.py", line 769, in main
    if download_class(args, class_name):
  File "C:\Users\bretm\Coursera\coursera-dl\coursera\coursera_dl.py", line 749, in download_class
    args.intact_fnames)
  File "C:\Users\bretm\Coursera\coursera-dl\coursera\coursera_dl.py", line 393, in download_lectures
    downloader.download(url, lecfn)
  File "C:\Users\bretm\Coursera\coursera-dl\coursera\downloaders.py", line 43, in download
    self._start_download(url, filename)
  File "C:\Users\bretm\Coursera\coursera-dl\coursera\downloaders.py", line 273, in _start_download
    r = self.session.get(url, stream=True)
  File "c:\Python27-32bit\lib\site-packages\requests\sessions.py", line 347, in get
    return self.request('GET', url, **kwargs)
  File "c:\Python27-32bit\lib\site-packages\requests\sessions.py", line 335, in request
    resp = self.send(prep, **send_kwargs)
  File "c:\Python27-32bit\lib\site-packages\requests\sessions.py", line 438, in send
    r = adapter.send(request, **kwargs)
  File "c:\Python27-32bit\lib\site-packages\requests\adapters.py", line 327, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.astroscience.org', port=80): Max retries exceeded with url: /abdul-ahad/astrometry.htm (Caused by <class 'socket.gaierror'>: [Errno 11001] getaddrinfo failed)
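
To illustrate the behaviour I would expect, here is a minimal sketch of a download loop that logs a failed resource and moves on instead of aborting. It is not coursera-dl's code: `fetch_one` and `download_all` are hypothetical names, and it uses the standard library's urllib rather than requests so that it stays dependency-free. With requests, the analogous exception to catch would be the `requests.exceptions.ConnectionError` shown in the traceback.

```python
import logging
import urllib.error
import urllib.request


def fetch_one(url, filename, timeout=30):
    """Download a single URL to a local file (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        data = resp.read()
    with open(filename, "wb") as out:
        out.write(data)


def download_all(items, fetch=fetch_one):
    """Try every (url, filename) pair and return the pairs that failed.

    A network failure on one resource is logged and skipped rather than
    propagated, so the remaining resources are still attempted.
    """
    failed = []
    for url, filename in items:
        try:
            fetch(url, filename)
        except OSError as exc:
            # urllib.error.URLError and socket.gaierror (DNS lookup
            # failure) are both subclasses of OSError.
            logging.warning("Skipping %s: %s", url, exc)
            failed.append((url, filename))
    return failed
```

Returning the failed pairs lets the caller print a summary at the end, so a single dead link costs one file rather than the whole run.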

dgorissen commented 10 years ago

Can you check with the current master branch and see if that resolves it for you?

bretmckee commented 10 years ago

I believe that is what I'm running -- commit c9ff3dee3cc3a97df3942e633144872e8ab7f3c2

dgorissen commented 10 years ago

Sorry for the delay. I just tested this with the master branch and it downloads fine for me.

bretmckee commented 10 years ago

I did a fresh git pull today and everything seems to be working now. Thanks.