Open wkingnet opened 2 years ago
If a URL contains non-ASCII (Unicode) characters, Python reports an encoding error.
debug info:
INFO:root:Crawling #1: https://gvo.wiki/html/NPC掉落書籍.html
DEBUG:root:https://gvo.wiki/html/NPC掉落書籍.html ==> 'ascii' codec can't encode characters in position 13-16: ordinal not in range(128)
Solution:
Edit crawler.py and add the following imports at the top:
import string
from urllib.parse import quote
Then search for this line:
current_url = self.urls_to_crawl.pop()
Add a line below it, so the code reads:
current_url = self.urls_to_crawl.pop()
current_url = quote(current_url, safe=string.printable)
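To see what the added line does outside of crawler.py, here is a minimal standalone sketch. The helper name `to_ascii_url` is hypothetical and only for illustration; the URL is the one from the debug output above.

```python
import string
from urllib.parse import quote, unquote


def to_ascii_url(url: str) -> str:
    """Percent-encode only the non-ASCII characters of a URL.

    Hypothetical helper, not part of crawler.py. Passing string.printable
    as `safe` leaves every printable ASCII character untouched (including
    ':', '/', '?', '&' and '%'), so existing percent-escapes are not
    double-encoded. Non-ASCII characters are encoded as UTF-8 and
    percent-escaped.
    """
    return quote(url, safe=string.printable)


original = "https://gvo.wiki/html/NPC掉落書籍.html"
encoded = to_ascii_url(original)

# The encoded URL can now be converted to ASCII without raising
# UnicodeEncodeError.
encoded.encode("ascii")

# Decoding the percent-escapes gives back the original URL.
print(encoded)
print(unquote(encoded) == original)
```

Note that string.printable also contains whitespace, so a literal space in a URL is left as-is by this approach; it only addresses the non-ASCII case reported here.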