iambus / youku-lixian

A Youku download script that also happens to support other sites...

UnicodeEncodeError: 'charmap' codec can't encode characters #7

Open pengkui opened 12 years ago

pengkui commented 12 years ago

R:\DOWNLOADS\iambus>python bilibili.py http://www.bilibili.tv/video/av204442/index_3.html
Traceback (most recent call last):
  File "bilibili.py", line 73, in <module>
    main()
  File "bilibili.py", line 70, in main
    script_main('bilibili', bilibili_download)
  File "R:\DOWNLOADS\iambus\common.py", line 232, in script_main
    download(url)
  File "bilibili.py", line 57, in bilibili_download
    iask_download_by_id(id, title)
  File "R:\DOWNLOADS\iambus\iask.py", line 16, in iask_download_by_id
    download_urls(urls, title, 'flv', total_size=None)
  File "R:\DOWNLOADS\iambus\common.py", line 188, in download_urls
    print 'Downloading %s.%s ...' % (title, ext)
  File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 12-21: character maps to <undefined>

OS: Windows 7, 64-bit

pengkui commented 12 years ago

I added this line at the top of each file:

# -*- coding: utf-8 -*-

and then commented out that print statement. The result:

R:\DOWNLOADS>python .\iambus\bilibili.py http://www.bilibili.tv/video/av204442/index_2.html
45%[=================== ] 3/4
Traceback (most recent call last):
  File ".\iambus\bilibili.py", line 74, in <module>
    main()
  File ".\iambus\bilibili.py", line 71, in main
    script_main('bilibili', bilibili_download)
  File "R:\DOWNLOADS\iambus\common.py", line 234, in script_main
    download(url)
  File ".\iambus\bilibili.py", line 58, in bilibili_download
    iask_download_by_id(id, title)
  File "R:\DOWNLOADS\iambus\iask.py", line 17, in iask_download_by_id
    download_urls(urls, title, 'flv', total_size=None)
  File "R:\DOWNLOADS\iambus\common.py", line 197, in download_urls
    url_save(url, filepath, bar)
  File "R:\DOWNLOADS\iambus\common.py", line 80, in url_save
    assert received == file_size == os.path.getsize(filepath)
AssertionError

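It looks like the download of the third segment went wrong.

Quite a few people notice something similar when watching on bilibili online: if you start watching before the whole progress bar has filled, the video may jump from 6 minutes to 12 minutes, or from 12 minutes to 18...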

pengkui commented 12 years ago

Damn... when I rerun it, even the error is random.

R:\DOWNLOADS>python .\iambus\bilibili.py http://www.bilibili.tv/video/av204442/index_2.html
24%[========== ] 2/4
Traceback (most recent call last):
  File ".\iambus\bilibili.py", line 74, in <module>
    main()
  File ".\iambus\bilibili.py", line 71, in main
    script_main('bilibili', bilibili_download)
  File "R:\DOWNLOADS\iambus\common.py", line 234, in script_main
    download(url)
  File ".\iambus\bilibili.py", line 58, in bilibili_download
    iask_download_by_id(id, title)
  File "R:\DOWNLOADS\iambus\iask.py", line 17, in iask_download_by_id
    download_urls(urls, title, 'flv', total_size=None)
  File "R:\DOWNLOADS\iambus\common.py", line 197, in download_urls
    url_save(url, filepath, bar)
  File "R:\DOWNLOADS\iambus\common.py", line 73, in url_save
    buffer = response.read(1024*256)
  File "C:\Python27\lib\socket.py", line 380, in read
    data = self._sock.recv(left)
  File "C:\Python27\lib\httplib.py", line 561, in read
    s = self.fp.read(amt)
  File "C:\Python27\lib\socket.py", line 380, in read
    data = self._sock.recv(left)
socket.error: [Errno 10054] An existing connection was forcibly closed by the remote host

iambus commented 12 years ago

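I'm not sure about the first problem; I've never run into it. What language is your system? Why would the encoding be cp437? The other two are probably both network problems. There is currently no retry or resume support. I may improve that when I have time.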

pengkui commented 12 years ago

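English edition of Windows 7, 64-bit.

I took a quick look: you handle title as UTF-8, but you never declare the encoding of the string literals in the source code. Adding a "# -*- coding: utf-8 -*-" line at the top of every source file should take care of it.

My guess is that in print 'Downloading %s.%s ...' % (title, ext), title is already UTF-8 but ext == 'flv' is not necessarily, so a print written this way can raise an error (it also depends on the OS's default encoding setting).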

pengkui commented 12 years ago

There's really no need to bother with resuming. For timeouts, catching the exception and retrying in a loop should be enough.
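
A minimal sketch of the "catch the exception and loop" idea, written for Python 3 (the thread itself runs Python 2.7). The helper name retry and its parameters attempts, delay and exceptions are hypothetical and are not part of youku-lixian. It assumes the failing call can simply be run again from scratch.

```python
import time


def retry(operation, attempts=3, delay=1.0, exceptions=(IOError,)):
    """Call operation() until it succeeds or the attempts run out.

    Hypothetical helper, not part of youku-lixian. In Python 3,
    socket errors are subclasses of OSError, which IOError aliases.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            time.sleep(delay)  # brief pause before trying again
```

Assuming url_save from the tracebacks above can safely be called again, usage might look like retry(lambda: url_save(url, filepath, bar)). Each retry restarts that segment from the beginning, and the AssertionError from the size check is only retried if it is added to exceptions.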

iambus commented 12 years ago

# -*- coding: utf-8 -*- probably has no effect on this problem, though...
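
One way to sidestep the first traceback, sketched for Python 3 under the assumption that the failure comes from the console encoding (cp437 in the report) being unable to represent the title: replace the unencodable characters before printing. The helper name console_safe is hypothetical and is not part of youku-lixian.

```python
import sys


def console_safe(text, encoding=None):
    """Return text with characters the target encoding cannot represent
    replaced by '?'. Hypothetical helper, not part of youku-lixian."""
    # Fall back to the console's own encoding, then to ASCII.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, errors="replace").decode(encoding)


title = u"\u4e2d\u6587\u6807\u9898"  # a sample Chinese title
# cp437 is passed explicitly here to mimic the console in the report.
print("Downloading %s.%s ..." % (console_safe(title, "cp437"), "flv"))
```

The title is shown as question marks on such a console, but the message is printed instead of raising UnicodeEncodeError. The unmodified title can still be used for the file name.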