iawia002 / Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾
MIT License
817 stars, 140 forks

Downloading Bilibili videos still fails with HTTP Error 466 #46

Closed oseansdeep closed 6 years ago

oseansdeep commented 6 years ago

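Locally, lulu can download Bilibili videos, and so can youtube-dl and you-get. On the production server, however, I still get 466, and youtube-dl, you-get, and lulu all hit the same problem. The server is hosted on Alibaba Cloud (Aliyun). Could it be that Aliyun has been blocked?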

iawia002 commented 6 years ago

Do you have the --debug output? Please paste it here.

oseansdeep commented 6 years ago

    [DEBUG] url_locations: http://www.bilibili.com/video/av19052070/
    [DEBUG] get_content: https://www.bilibili.com/video/av19052070/
    [DEBUG] get_content: http://interface.bilibili.com/playurl?cid=31079734&player=1&quality=4&ts=1517817636&sign=9e6eda4cf9f89d59215690dd5aa0455d
    [DEBUG] get_content: http://interface.bilibili.com/playurl?cid=31079734&player=1&quality=3&ts=1517817636&sign=6e78821fb5acec17cc1141b99ddcd356
    [DEBUG] get_content: http://interface.bilibili.com/playurl?cid=31079734&player=1&quality=2&ts=1517817636&sign=9b6f943ba0fa749a2943651072d00f0c
    [DEBUG] get_content: http://interface.bilibili.com/playurl?cid=31079734&player=1&quality=1&ts=1517817636&sign=e90cfe83b61bf199a38616b8cb47aa32
    [DEBUG] get_content: http://interface.bilibili.com/playurl?cid=31079734&player=1&quality=0&ts=1517817636&sign=a5d3a0933fd18367042732f38e4a617b
    [DEBUG] get_content: http://comment.bilibili.com/31079734.xml
    It seems that your ffmpeg is a nightly build.
    Please switch to the latest stable if merging failed.
    site: Bilibili
    title: 【波澜哥】本草纲目
    stream:

    Downloading 【波澜哥】本草纲目.mp4 ... 0.0% ( 0.0/ 7.2MB) ├──────────────────────────────────────────────────────────────────────────────────────────┤[1/1]
    [DEBUG] HTTP Error with code466
    [DEBUG] HTTP Error with code466
    [DEBUG] HTTP Error with code466
    lulu: version 0.3.0, a tiny downloader that scrapes the web.
    lulu: Namespace(URL=['www.bilibili.com/video/av19052070/'], cookies=None, debug=True, extractor_proxy=None, force=False, format='mp4', help=False, http_proxy=None, info=False, input_file=None, itag=None, json=False, no_caption=False, no_merge=False, no_proxy=False, output_dir='.', output_filename=None, password=None, player=None, playlist=False, socks_proxy=None, stream=None, timeout=600, url=False, version=False)
    Traceback (most recent call last):
      File "/usr/bin/lulu", line 11, in <module>
        sys.exit(main())
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/main.py", line 92, in main
        main(kwargs)
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/common.py", line 1483, in main
        script_main(any_download, any_download_playlist, kwargs)
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/common.py", line 1373, in script_main
        extra
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/common.py", line 1119, in download_main
        download(url, kwargs)
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/common.py", line 1474, in any_download
        m.download(url, kwargs)
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/extractor.py", line 85, in download_by_url
        self.download(kwargs)
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/extractor.py", line 290, in download
        av=stream_id in self.dash_streams
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/common.py", line 796, in download_urls
        headers=headers, *kwargs
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/common.py", line 499, in url_save
        file_size = url_size(url, faker=faker, headers=tmp_headers)
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/common.py", line 384, in url_size
        response = urlopen_with_retry(request.Request(url, headers=headers))
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/common.py", line 289, in urlopen_with_retry
        raise http_error
      File "/usr/local/python3/lib/python3.5/site-packages/lulu/common.py", line 280, in urlopen_with_retry
        return request.urlopen(args, *kwargs)
      File "/usr/local/python3/lib/python3.5/urllib/request.py", line 163, in urlopen
        return opener.open(url, data, timeout)
      File "/usr/local/python3/lib/python3.5/urllib/request.py", line 472, in open
        response = meth(req, response)
      File "/usr/local/python3/lib/python3.5/urllib/request.py", line 582, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/local/python3/lib/python3.5/urllib/request.py", line 510, in error
        return self._call_chain(args)
      File "/usr/local/python3/lib/python3.5/urllib/request.py", line 444, in _call_chain
        result = func(*args)
      File "/usr/local/python3/lib/python3.5/urllib/request.py", line 590, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 466:
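
The traceback above ends in `urlopen_with_retry`, which re-raises the `HTTPError` after its attempts fail, and the three `[DEBUG] HTTP Error with code466` lines suggest three attempts. A minimal sketch of that kind of wrapper follows. It is not lulu's actual code: the function name `open_with_retry`, the `opener` parameter, the default of three retries, and the choice to retry on every HTTP error are assumptions made for illustration.

```python
from urllib import error, request


def open_with_retry(req, opener=request.urlopen, retries=3):
    """Call opener(req) up to `retries` times; re-raise the last HTTPError."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for _ in range(retries):
        try:
            return opener(req)
        except error.HTTPError as http_error:
            # The status code is chosen by the server, so repeating the
            # same request from the same IP may well fail the same way.
            print("[DEBUG] HTTP Error with code{}".format(http_error.code))
            last_error = http_error
    raise last_error
```

Passing the opener in as a parameter keeps the retry logic testable without touching the network.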

iawia002 commented 6 years ago

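I don't think the IP has been banned. More likely the headers are missing a User-Agent, but that should not happen on the normal code path, and I can't reproduce it locally, so it is hard to debug.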

If you know how to debug, you could set a breakpoint just before the download_urls call at line 292 of lulu/extractor.py and check what headers actually contains.

oseansdeep commented 6 years ago

I added print("headers:%s"%(headers)) right before download_urls, and the result was:

headers:{'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0', 'Accept-Encoding': 'gzip,deflate,sdch', 'Accept-Language': 'en-US,en;q=0.8', 'Referer': 'https://www.bilibili.com/video/av19052070/', 'Accept-Charset': 'UTF-8,*;q=0.5'}

So the User-Agent does appear to be there.
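
The manual check on the printed headers can also be sketched as a small helper. This is only an illustration: `missing_headers` is a hypothetical name, and treating `User-Agent` and `Referer` as the headers worth checking is an assumption based on this thread, not something Bilibili documents.

```python
def missing_headers(headers, required=("User-Agent", "Referer")):
    """Return the required header names absent from `headers`.

    Header names are compared case-insensitively, as in HTTP.
    """
    present = {name.lower() for name in headers}
    return [name for name in required if name.lower() not in present]


# A subset of the headers printed in the comment above.
printed = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:51.0) "
                  "Gecko/20100101 Firefox/51.0",
    "Referer": "https://www.bilibili.com/video/av19052070/",
}
print(missing_headers(printed))
```

An empty list here would support the conclusion that a missing User-Agent is not the cause.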

iawia002 commented 6 years ago

Then it may really be an IP ban or something similar. I can't think of any other cause.

oseansdeep commented 6 years ago

So the IP was indeed blocked. I switched to a proxy and that solved it.

iawia002 commented 6 years ago

😂 Why did Bilibili block your server? I wonder what rules it uses to decide.

mashirozx commented 6 years ago

Same error 😂 This is a Vultr host, and Bilibili has finally blocked it too. Before this, you-get was showing a 403 instead.

ClericPy commented 6 years ago

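This problem has been around for a few years. Cloud host IPs are basically all banned, and Alibaba Cloud is blocked across the board. I don't use you-get or lulu myself, though. Has anyone found a way around it? The mp4 files fetched from a banned IP can't be played at all, which is quite annoying.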

iawia002 commented 6 years ago

If the IP were blocked, no data should come back at all. Use a proxy.
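
For anyone scripting this with the standard library, routing requests through a proxy can be sketched as below. The proxy address is a hypothetical placeholder and `build_proxy_opener` is an illustrative name. lulu itself exposes proxy options (the Namespace in the debug output lists `http_proxy`, `socks_proxy`, and `extractor_proxy`), which this sketch does not use.

```python
from urllib import request


def build_proxy_opener(proxy_address):
    """Return an opener that sends HTTP and HTTPS requests via the proxy."""
    proxy_handler = request.ProxyHandler({
        "http": "http://" + proxy_address,
        "https": "http://" + proxy_address,
    })
    return request.build_opener(proxy_handler)


# Hypothetical proxy address; replace it with a proxy you control.
opener = build_proxy_opener("127.0.0.1:8080")
# opener.open(url) would then fetch `url` through that proxy.
```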

xiiatuuo commented 6 years ago

@oseansdeep Which proxy did you switch to?

ClericPy commented 6 years ago

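Their IP-blocking rules are rather interesting. The main site is not blocked, but the CDN behind the resolved mp4 address is unreachable. (I count an address as reachable if a streaming request made with requests and a mobile User-Agent can download it.) Tested in a browser, the page finishes rendering but video playback fails, even though an mp4 address is present.

Recently, the unblocked CDN URLs I got through a proxy look like this:

http://ws.acgvideo.com/5/8f/preview_30667327-1.mp4?wsTime=1518185271&platform=html5&wsSecret2=039704b84c4e63c057640fc3c47d7435&oi=3722931496&stime=0&etime=360&rate=65

whereas the mp4 addresses lulu gets mostly look like this:

http://cn-nmghhht-cu-v-07.acgvideo.com/vg3/upgcxcode/11/47/29864711/29864711-1-16.mp4?expires=1518115200&platform=pc&ssig=HTroigKypra49iudaIipgw&oi=659282845&nfa=6t3I9sK0FQyMdqgkJKymGg==&dynamic=1&hfa=2018467011&hfb=Yjk5ZmZjM2M1YzY4ZjAwYTMzMTIzYmIyNWY4ODJkNWI=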
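
The two URL shapes quoted above can be compared programmatically. The sketch below only pulls out the host, the `platform` value, and whichever of `wsTime` or `expires` is present. That these two parameters are Unix-style expiry timestamps is an assumption based on their values, and `summarize_cdn_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlsplit


def summarize_cdn_url(url):
    """Return the host and a few query parameters of a media URL."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    summary = {
        "host": parts.hostname,
        "platform": query.get("platform", [None])[0],
    }
    # The two URL shapes in the thread carry different time parameters.
    for key in ("wsTime", "expires"):
        if key in query:
            summary[key] = int(query[key][0])
    return summary


print(summarize_cdn_url(
    "http://ws.acgvideo.com/5/8f/preview_30667327-1.mp4"
    "?wsTime=1518185271&platform=html5"
))
```

Comparing summaries from a blocked and an unblocked machine would show whether the difference lies in the host, the platform, or both.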

huochaitny commented 6 years ago

On March 7 the HTTP Error 466 appeared. There was no problem last week; it started suddenly today. you-get fails with the same error.

ClericPy commented 6 years ago

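Getting an mp4 URL that plays directly on a phone is basically hopeless now; the mp4 address that lulu resolves does not work on a phone. Also, downloading an MP4 with lulu actually succeeded today, then I couldn't resist upgrading lulu, and it stopped working again...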

Justsoos commented 6 years ago

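Try this method: https://github.com/iawia002/Lulu/issues/54#issuecomment-370230021 Streams scraped from Bilibili have long since stopped being playable in an arbitrary player; they get rejected with all kinds of 4xx and 5xx HTTP errors. Bilibili foolishly failed to block Zhou Hongyi and ended up blocking its own users instead. Ridiculous.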

Justsoos commented 6 years ago

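I just tested it, and the method above no longer works either. It looks like Bilibili changes its blocking method every day. Their engineers must have gone mad...

Also, this address, which appears to be served by a CDN in mainland China, plays in a browser without any problem:

https://cn-nmghhht-cu-v-10.acgvideo.com/vg1/upgcxcode/34/97/31079734/31079734-1-80.flv?expires=1520375100&platform=pc&ssig=TnLE3YB0v_jWSSOzm1knYg&oi=3708391536&nfa=UEav3bCY+XfNyw5l/+27kw==&dynamic=1&hfa=2024659050&hfb=Yjk5ZmZjM2M1YzY4ZjAwYTMzMTIzYmIyNWY4ODJkNWI=

From outside China you get addresses that start with this, probably Tencent Cloud's overseas service: http://tx.acgvideo.com/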

Justsoos commented 6 years ago

The good news is that downloading from this mainland China CDN with IDM works without any problem.

Justsoos commented 6 years ago

Everyone above can give annie a try now: https://github.com/iawia002/annie It works, and it works well.

I also modified annie's code to test it, printing the headers at this point: https://github.com/iawia002/annie/blob/41691ca0f4ec3d9845ce3e3e5424a5837a251e70/downloader/downloader.go#L81

    defer file.Close()
    fmt.Println(headers, urlData.URL)
    res := request.Request("GET", urlData.URL, nil, headers)

Here is the output:

    [Referer:https://www.bilibili.com/video/av20383055/] http://cn-nmghhht-cu-v-08.acgvideo.com/vg2/upgcxcode/11/07/33310711/33310711-2-64.flv?expires=1520382000&platform=pc&ssig=-mDclQagDcjiptbU2OKdQw&oi=3708391536&nfa=0IC9+c6W2KQLwirHaGDLFA==&dynamic=1&hfb=Yjk5ZmZjM2M1YzY4ZjAwYTMzMTIzYmIyNWY4ODJkNWI=

I don't see anything unusual there, just a Referer. So why does annie download from Bilibili quickly and without errors, while the Python-based lulu keeps getting rejected by Bilibili with 4xx errors?

  File "/usr/local/lib/python3.5/dist-packages/lulu/common.py", line 235, in urlopen_with_retry
    raise http_error
  File "/usr/local/lib/python3.5/dist-packages/lulu/common.py", line 226, in urlopen_with_retry
    return request.urlopen(*args, context=context, **kwargs)
……
  File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 466:

iawia002 commented 6 years ago

@Justsoos annie uses a different API from lulu, so the CDN addresses it resolves may be different.

Justsoos commented 6 years ago

Bilibili caused all this chaos, and it turns out they simply changed the API; adding v2 fixes it. you-get has just been updated and works: https://github.com/soimort/you-get/commit/1900f7608cc2756d5460c99eb792c8e0eb42e7f4
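
As a rough illustration of "adding v2", the sketch below inserts a `/v2` prefix into the path of a playurl address like the ones in the debug log. Where exactly the `v2` segment belongs is an assumption here, and `to_v2_playurl` is a hypothetical helper. The linked you-get commit is the authority on the actual change, which may involve more than the path (for example how `sign` is computed), so this rewrite alone is not expected to be a working fix.

```python
from urllib.parse import urlsplit, urlunsplit


def to_v2_playurl(url):
    """Insert a /v2 prefix into the path of a playurl API address."""
    parts = urlsplit(url)
    if parts.path.startswith("/v2/"):
        return url  # already points at the assumed v2 path
    return urlunsplit(parts._replace(path="/v2" + parts.path))


print(to_v2_playurl(
    "http://interface.bilibili.com/playurl?cid=31079734&player=1&quality=4"
))
```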