dgorissen / coursera-dl

A script for downloading course material (videos, PDFs, quizzes, etc.) from coursera.org
http://dirkgorissen.com/2012/09/07/coursera-dl-a-coursera-download-script/
GNU General Public License v3.0
1.74k stars, 300 forks

Unable to download archived course #38

Closed agroferia closed 11 years ago

agroferia commented 11 years ago

Hi! Thanks for the program! I tried to download an archived course, specifically: Computer Architecture by David Wentzlaff. Code name: comparch-2012-001.

[image: screenshot of the error]

That's the error it gave me (course 12 of 12). Sorry, I don't know how to copy the text log!

dgorissen commented 11 years ago

I did not get this error but did find another subtle problem. Please upgrade and try again.

dgorissen commented 11 years ago

No reply so assuming ok, closing.

rmchiriac commented 11 years ago

Same issue here, but on Linux. After a bit of debugging, it turned out that lxml wasn't parsing the page content properly. Could it be related to the fact that I viewed some lessons on Coursera's site?

Anyhow, I switched to html5lib and it worked.

Cheers, Radu

dgorissen commented 11 years ago

Mmm strange, I couldn't reproduce it, reopening.

rmchiriac commented 11 years ago

It's not a coursera-dl issue: I checked the HTML downloaded from Coursera and it seemed OK (I even validated it with validator.w3.org). It's lxml... With coursera-dl patched to use html5lib, I've downloaded ~30 courses for later viewing without any issue.
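
For anyone hitting the same thing, here is a rough sketch of what "switching the parser" can look like. It is not the actual patch. It assumes the page is parsed through BeautifulSoup (bs4), which takes the parser name as its second argument. The names `extract_links`, `PARSER_ORDER` and `SAMPLE_HTML` are made up for illustration. The sketch tries html5lib first, then lxml, then the built-in `html.parser`, and falls back to the standard library alone if bs4 is not installed.

```python
from html.parser import HTMLParser

try:
    # Third-party and optional in this sketch.
    from bs4 import BeautifulSoup, FeatureNotFound
except ImportError:
    BeautifulSoup = None
    FeatureNotFound = Exception

# Hypothetical preference order: html5lib first, since it worked in this thread.
PARSER_ORDER = ("html5lib", "lxml", "html.parser")

# Made-up sample page with an unclosed <p>, just to have something to parse.
SAMPLE_HTML = (
    '<html><body><a href="lecture1.mp4">L1</a>'
    '<p>unclosed<a href="slides.pdf">S</a></body></html>'
)


class _LinkCollector(HTMLParser):
    """Standard-library fallback that collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html, parsers=PARSER_ORDER):
    """Return (links, parser_name) using the first parser that is available."""
    if BeautifulSoup is not None:
        for name in parsers:
            try:
                soup = BeautifulSoup(html, name)
            except FeatureNotFound:
                # This parser is not installed; try the next one.
                continue
            links = [a["href"] for a in soup.find_all("a", href=True)]
            return links, name
    # bs4 missing, or none of the requested parsers installed.
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links, "stdlib HTMLParser"


if __name__ == "__main__":
    links, used = extract_links(SAMPLE_HTML)
    print("parser used:", used)
    print("links found:", links)
```

Printing which parser was used makes it easy to confirm that html5lib is really the one doing the work, and to compare its result against lxml on a page that fails.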

dgorissen commented 11 years ago

Had another look. I have not run into these issues, but if switching to html5lib works (you can use the -q option) then that's fine and I will close this.