coursera-dl / edx-dl

A simple tool to download video lectures from edx.org (and other openedx sites)
GNU Lesser General Public License v3.0

'charmap' codec can't encode character u'\u2013' in position 28: character maps to <undefined> #81

Closed maxdd closed 9 years ago

maxdd commented 10 years ago

You can access 19 courses
1 - CS-184.1x Foundations of Computer Graphics -> Started
2 - CS.169.2x Software as a Service, Part 2 (rev Fall 2013) -> Started
3 - CS188.1x Artificial Intelligence -> Started
4 - CS169.1x Engineering Software as a Service -> Not yet
5 - CS169.2x Engineering Software as a Service, Part 2 -> Not yet
6 - AE1110x Introduction to Aeronautical Engineering -> Not yet
7 - BIO465x Neuronal Dynamics -> Started
8 - AMRx Autonomous Mobile Robots -> Not yet
9 - CS50x Introduction to Computer Science -> Started
10 - 16.101x Introduction to Aerodynamics -> Started
11 - 16.110x Flight Vehicle Aerodynamics -> Not yet
12 - 2.03x Dynamics -> Started
13 - 6.00.1x Introduction to Computer Science and Programming -> Started
14 - 6.002x Circuits and Electronics -> Started
15 - ELEC301x Discrete Time Signals and Systems -> Not yet
16 - 20220332X Principles of Electric Circuits: Part 1 -> Started
17 - 20220332_2x Principles of Electric Circuits: Part 2 -> Not yet
Traceback (most recent call last):
  File "C:\Python27\edx-downloader\edx-dl.py", line 403, in <module>
    main()
  File "C:\Python27\edx-downloader\edx-dl.py", line 271, in main
    print('%d - %s -> %s' % (c, course[0], course[2]))
  File "C:\Python27\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2013' in position 28: character maps to <undefined>

maxdd commented 10 years ago

The 18th course (where it probably crashes) is

HYPERS301x Hypersonics – from Shock Waves to Scramjets

the 19th course is

UTAustinX: UT.6.01x Embedded Systems - Shape the World
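The failure can be reproduced outside edx-dl. The sketch below is written for Python 3 (the thread itself uses Python 2.7) and rebuilds the printed line for the 18th course with the same format string as in the traceback. The helper `can_encode` and the "Started" status are assumptions made for illustration, not part of edx-dl.

```python
# Course title from the report above; \u2013 is the en dash after "Hypersonics".
title = "HYPERS301x Hypersonics \u2013 from Shock Waves to Scramjets"

# Same format string as the failing print; the status value is a placeholder.
line = "%d - %s -> %s" % (18, title, "Started")


def can_encode(text, encoding):
    """Return True if every character of text can be encoded with the codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


dash_position = line.index("\u2013")
print(dash_position)               # index of the en dash in the printed line
print(can_encode(line, "cp850"))   # cp850 is the console code page in the traceback
print(can_encode(line, "utf-8"))
```

The en dash lands at index 28 of the formatted line, which matches the "position 28" in the error message, so the 18th course title does appear to be the trigger.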

maxdd commented 10 years ago

https://stackoverflow.com/questions/5387895/unicodeencodeerror-ascii-codec-cant-encode-character-u-u2013-in-position-3

maybe this could help you

crypdick commented 10 years ago

I am experiencing a similar error except with the 'ascii' codec

maxdd commented 10 years ago

If you know the number of the course you want, you can just open the *.py and comment out that print in order to move on with the program... it's not a solution, but it should work.

crypdick commented 10 years ago

@maxdd I tried commenting out the print statement giving the error (line 371), but then the script doesn't run. ^Never mind, that was a network issue.

The script runs, but after week selection and processing it hangs on "[info] Output directory: Downloaded"

maxdd commented 10 years ago

Hmm, can't help you then :( If only shk3 could fix this...

shk3 commented 10 years ago

Hi @maxdd and @isomerase, thanks for your feedback. Are you both using Windows with the Windows command line?

If so, this might be the known issue in #55, which is a bug in the Windows console, and it seems that Python 3.3 supports a solution.

If you are running Python 3.3, please try running the command from the following answer before running edx-dl: http://stackoverflow.com/questions/388490/unicode-characters-in-windows-command-line-how I am not sure whether it will work, since I am not working in a Python 3.3 environment, so I would appreciate feedback if you are using that environment.

maxdd commented 10 years ago

I'm on Python 2.7. I tried the method there, but now it stops even earlier with

Traceback (most recent call last):
  File "C:\Python27\edx-downloader\edx-dl.py", line 403, in <module>
    main()
  File "C:\Python27\edx-downloader\edx-dl.py", line 265, in main
    print('Welcome %s' % USERNAME)
LookupError: unknown encoding: cp65001

shk3 commented 10 years ago

@maxdd, that's right. Python 2.7 doesn't support the cp65001 encoding. I believe Python has supported it since Python 3.3.
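Whether a given interpreter knows an encoding name can be checked before switching the console code page. This is a minimal sketch using the standard `codecs.lookup` call; the helper name `encoding_known` is made up for illustration, and the result for `cp65001` depends on the Python version and platform, so none is shown here.

```python
import codecs


def encoding_known(name):
    """Return True if this interpreter can find a codec with the given name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        # Same exception type as the "unknown encoding: cp65001" traceback.
        return False


for name in ("utf-8", "cp850", "cp65001"):
    print(name, encoding_known(name))
```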

crypdick commented 10 years ago

@shk3 Sorry for the late response, Internet is scarce in Myanmar...

I am on OS X 10.6.8 and using Python 2.7.3.

crypdick commented 10 years ago

Is there any way that we can make it so that if it runs into this error it can skip that specific video and continue with the script?
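One way to get that behaviour, sketched here for Python 3 under the assumption that each item is handled by a single function call, is to catch `UnicodeEncodeError` per item and keep going. The names `process_all` and `show_ascii_only` are hypothetical and do not come from edx-dl.

```python
def process_all(items, handler):
    """Call handler on each item; return the items skipped on encoding errors."""
    skipped = []
    for item in items:
        try:
            handler(item)
        except UnicodeEncodeError:
            # Remember the item and continue instead of aborting the whole run.
            skipped.append(item)
    return skipped


def show_ascii_only(text):
    """Stand-in for a print to a console that can only display ASCII."""
    print(text.encode("ascii").decode("ascii"))


titles = [
    "UT.6.01x Embedded Systems - Shape the World",
    "HYPERS301x Hypersonics \u2013 from Shock Waves to Scramjets",
]
skipped = process_all(titles, show_ascii_only)
print("skipped %d item(s)" % len(skipped))
```

This only hides the symptom: the skipped items are still not processed, so fixing the output encoding is the better long-term answer.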

maxdd commented 10 years ago

Well, in my case it doesn't even start to download. Could you post the error it gives you?

crypdick commented 10 years ago

I can't actually make it that far in the script anymore due to a new issue (issue #92), but I posted my error here: https://github.com/shk3/edx-downloader/pull/55#issuecomment-35151417

shk3 commented 10 years ago

@maxdd and @isomerase, could you take a look at #99 to see if it solves the problem?

snoww0lf commented 9 years ago

Try using .encode('ascii', 'ignore'), but it just throws away the non-ASCII characters and won't show them.
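For comparison, the standard error handlers behave differently on the course title from this issue. The sketch below is for Python 3 and only uses the built-in `str.encode`; the variable names are chosen for illustration.

```python
title = "HYPERS301x Hypersonics \u2013 from Shock Waves to Scramjets"

# 'ignore' silently drops the en dash, leaving two adjacent spaces.
ignored = title.encode("ascii", "ignore").decode("ascii")

# 'replace' substitutes a question mark, so the loss stays visible.
replaced = title.encode("ascii", "replace").decode("ascii")

# 'backslashreplace' keeps the code point as an escape sequence.
escaped = title.encode("ascii", "backslashreplace").decode("ascii")

for variant in (ignored, replaced, escaped):
    print(variant)
```

Which handler is appropriate depends on whether the output is only meant for display or is later reused, for example as a file name.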

iemejia commented 9 years ago

I'm closing this one since it seems to be the same as issue #126, and we are continuing the discussion there. I also just pushed a PR that seems to solve the problem. Please report back if you still have issues.