Open 2837linlinlin opened 3 years ago
node_vm2.VMError: setInterval is not defined. Did the site change its rules?
I'm now getting a 403 error (2021/8/18). It looks like the request is being blocked by Cloudflare.
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://www.cocomanhua.com/15554/
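One way to check whether a 403 like this comes from Cloudflare rather than from the site itself is to look at the response headers. The sketch below is only a heuristic: `looks_like_cloudflare_block` is a hypothetical helper, and it assumes the block page carries the `Server: cloudflare` or `CF-RAY` headers that Cloudflare normally adds.

```python
def looks_like_cloudflare_block(status_code, headers):
    """Heuristic: True if a response looks like a Cloudflare block page."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    served_by_cloudflare = (
        normalized.get("server", "").lower() == "cloudflare"
        or "cf-ray" in normalized
    )
    # Assumption: block and challenge pages are sent with 403 or 503.
    return status_code in (403, 503) and served_by_cloudflare
```

With requests, this could be called as `looks_like_cloudflare_block(r.status_code, r.headers)` on the response `r` before calling `r.raise_for_status()`.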
setInterval
Reference fix for the error: https://github.com/eight04/ComicCrawler/issues/294#issuecomment-813003020
I haven't found a way to get past Cloudflare with requests.
With BeautifulSoup + selenium, Python can access cocomanga and fetch the HTML of the comic's table-of-contents page, e.g. https://www.cocomanhua.com/15335/

However, the HTML has changed, so get_episodes in oh.py needs to be adjusted.

get_images still fails at imgs = eval(code).

Even when the image URL is assembled successfully, accessing it directly is still blocked. I don't know yet how to access the image URLs directly, e.g.:

https://img.cocomanga.com/comic/15335/RnpUVS9ucGdsRlhkN3ZXbklDeWhBbE5kVERsaFo5TUJ5M3JQdFhXTVQxMD0=/0001.jpg
https://img.cocomanga.com/comic/15335/RnpUVS9ucGdsRlhkN3ZXbklDeWhBbE5kVERsaFo5TUJ5M3JQdFhXTVQxMD0=/0003.jpg
https://img.cocomanga.com/comic/15335/RnpUVS9ucGdsRlhkN3ZXbklDeWhBbE5kVERsaFo5TUJ5M3JQdFhXTVQxMD0=/0005.jpg

The RnpUVS9ucGdsRlhkN3ZXbklDeWhBbE5kVERsaFo5TUJ5M3JQdFhXTVQxMD0 segment appears to be generated dynamically.
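To make the URL layout explicit, here is a small sketch that splits the example image URLs into their parts. `split_image_url` is a hypothetical helper, and the `/comic/<comic_id>/<token>/<page>.jpg` layout is inferred only from the three URLs quoted in this comment.

```python
from urllib.parse import urlparse


def split_image_url(url):
    """Split an image URL into (comic_id, token, page_number)."""
    # Assumed path layout: /comic/<comic_id>/<token>/<page>.jpg
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "comic":
        raise ValueError(f"unexpected image URL layout: {url}")
    _, comic_id, token, filename = parts
    # "0003.jpg" -> 3
    page = int(filename.rsplit(".", 1)[0])
    return comic_id, token, page
```

For the three URLs above, only the page number differs. The comic ID and the token stay the same across pages of this episode, so the token is the one piece that still has to be recovered from the page script.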
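from bs4 import BeautifulSoup
from selenium import webdriver

url = 'https://www.cocomanhua.com/15335/'  # table-of-contents page from the example above
browser = webdriver.Chrome()  # assumes chromedriver can be found on PATH
browser.get(url)
# Parse the HTML rendered by the real browser
soup = BeautifulSoup(browser.page_source, 'html.parser')
html_content = str(soup)
browser.close()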
for match in re.finditer(r'href="([^"]+)" title="([^"]+)', html):
    ep_url, title = match.groups()
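As a quick check of what this pattern captures, the sketch below runs it against a made-up fragment. The markup and link paths in `sample_html` are hypothetical and are not copied from the site, and `find_episodes` is an illustrative wrapper, not the actual get_episodes in oh.py.

```python
import re

# Hypothetical markup for illustration only; the real page may differ.
sample_html = '''
<a href="/15335/1/1.html" title="第1话">第1话</a>
<a href="/15335/1/2.html" title="第2话">第2话</a>
'''


def find_episodes(html):
    """Return (url, title) pairs for links whose href is followed by title."""
    pattern = r'href="([^"]+)" title="([^"]+)'
    return [match.groups() for match in re.finditer(pattern, html)]


for ep_url, title in find_episodes(sample_html):
    print(ep_url, title)
```

Note that the pattern only matches when `title` comes immediately after `href`, separated by a single space, so it will break again if the site reorders or adds attributes.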
Start analyzing https://www.cocomanhua.com/15554/
Analyzing success!
Start downloading 最强的魔导士,膝盖中了一箭之后成为乡下的卫兵
total 12 episode.
Downloading ep 第1话 最强的魔导士隐居起来
Traceback (most recent call last):
  File "C:\Users\gao\PycharmProjects\mh_jiaoben\venv\lib\site-packages\comiccrawler\crawler.py", line 338, in error_loop
    process()
  File "C:\Users\gao\PycharmProjects\mh_jiaoben\venv\lib\site-packages\comiccrawler\crawler.py", line 289, in download
    crawler.init()
  File "C:\Users\gao\PycharmProjects\mh_jiaoben\venv\lib\site-packages\comiccrawler\crawler.py", line 51, in init
    self.init_images(self.ep.current_page - 1)
  File "C:\Users\gao\PycharmProjects\mh_jiaoben\venv\lib\site-packages\comiccrawler\crawler.py", line 58, in init_images
    self.get_images()
  File "C:\Users\gao\PycharmProjects\mh_jiaoben\venv\lib\site-packages\comiccrawler\crawler.py", line 188, in get_images
    self.ep.current_url
  File "C:\Users\gao\PycharmProjects\mh_jiaoben\venv\lib\site-packages\comiccrawler\mods\oh.py", line 115, in get_images
    imgs = eval(code)
  File "C:\Users\gao\PycharmProjects\mh_jiaoben\venv\lib\site-packages\node_vm2\__init__.py", line 28, in eval
    return vm.run(code)
  File "C:\Users\gao\PycharmProjects\mh_jiaoben\venv\lib\site-packages\node_vm2\__init__.py", line 131, in run
    return self.communicate({"action": "run", "code": code})
  File "C:\Users\gao\PycharmProjects\mh_jiaoben\venv\lib\site-packages\node_vm2\__init__.py", line 101, in communicate
    raise VMError(data["error"])
node_vm2.VMError: setInterval is not defined
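The traceback shows the page script calling `setInterval`, which does not exist inside the node_vm2 sandbox. One possible workaround, offered here as an untested assumption and not necessarily what the linked fix does, is to define a no-op stand-in before the script runs. `with_timer_stub` is a hypothetical helper.

```python
# Assumption: the page script only needs the name to exist, and the
# timer callback never has to fire for the image list to be produced.
TIMER_STUB = "var setInterval = function () { return 0; };\n"


def with_timer_stub(js_code):
    """Prepend a no-op setInterval definition to a JavaScript snippet."""
    return TIMER_STUB + js_code
```

In oh.py this would mean calling `eval(with_timer_stub(code))` instead of `eval(code)`. If the script also relies on other timer functions, similar stubs could be added the same way.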