Closed idMysteries closed 1 year ago
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://dragontea.ink/?s=the&post_type=wp-manga
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://www.lightnovelpub.com//search?title=the
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 451 Client Error: Unavailable For Legal Reasons for url: https://light-novel.online/search.ajax?query=the
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 451 Client Error: Unavailable For Legal Reasons for url: http://ww38.lightnovel.tv/?s=the&post_type=wp-manga&author=&artist=&release=
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://novelfullplus.com/ajax/search?q=the
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://www.novelpub.com//search?title=the
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://readnovelfull.com/search?keyword=th
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://wuxiaworld.live/search.ajax?type=&query=the
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://es.mtlnovel.com//wp-admin/admin-ajax.php?action=autosuggest&q=the
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://www.mywuxiaworld.com/search/result.html?searchkey=the
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://docln.net//tim-kiem-nang-cao?title=the
2022-09-23 10:51:36,508 [DEBUG] (lncrawl.core.crawler)
HTTPSConnectionPool(host='truyentr.pro', port=443): Max retries exceeded with url: /?s=the (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x0000000016CFFEB0>: Failed to establish a new connection: [Errno 11004] getaddrinfo failed')) | Retrying...
2022-09-23 10:51:36,509 [DEBUG] (lncrawl.core.crawler)
[GET] https://truyentr.info/?s=the
truyentr.info redirects to truyentr.pro, which fails to resolve -> error
HTTPSConnectionPool(host='asadatranslations.com', port=443): Max retries exceeded with url: /?s=the&post_type=wp-manga&author=&artist=&release= (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x0000000008F5E8B0>: Failed to establish a new connection: [Errno 11004] getaddrinfo failed')) | Retrying...
These sources use Cloudflare; perhaps that is the issue?
https://888novel.com/ -> Cloudflare issue
https://kissmanga.in/
https://clicknovel.net/
https://mtlreader.com/
https://wuxiaworld.io/
https://light-novel.online/
https://readnovelfull.com/
https://wuxiaworld.live/
https://es.mtlnovel.com/
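One rough way to check the Cloudflare theory is to look at the response headers of a failing request. Cloudflare-proxied responses normally carry `Server: cloudflare` and a `CF-RAY` header. The sketch below is only a heuristic under that assumption; the function name is made up for illustration and is not part of lncrawl. A site behind Cloudflare can also return a genuine 403 from its own server, so a positive result is a hint, not proof.

```python
def looks_like_cloudflare_block(status_code, headers):
    """Heuristic: True if a 403/503 response appears to be served by Cloudflare.

    `headers` is any mapping of header name -> value (for example a copy of
    response.headers). This helper is hypothetical, not an lncrawl API.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    served_by_cloudflare = (
        lowered.get("server", "").lower() == "cloudflare" or "cf-ray" in lowered
    )
    # 403 and 503 are the status codes usually seen on challenge/block pages.
    return status_code in (403, 503) and served_by_cloudflare
```

With `requests`, this could be called as `looks_like_cloudflare_block(resp.status_code, dict(resp.headers))` on the response attached to the `HTTPError`.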
Some also seem to fail because of the double slash (//) in the URL:
https://docln.net//tim-kiem-nang-cao?title=the -> error
https://docln.net/tim-kiem-nang-cao?title=the -> works
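The double slash most likely comes from joining a base URL that ends in `/` with a path that starts with `/`. A minimal sketch of a join that avoids it is below; `join_url` is a hypothetical helper for illustration, not the function lncrawl actually uses to build search URLs.

```python
def join_url(home_url, path):
    """Join a base URL and a path with exactly one slash between them."""
    # Strip the trailing slash from the base and the leading slash from the
    # path, then put a single slash back in between.
    return home_url.rstrip("/") + "/" + path.lstrip("/")


print(join_url("https://docln.net/", "/tim-kiem-nang-cao?title=the"))
# prints https://docln.net/tim-kiem-nang-cao?title=the
```

The same helper gives the same result whether or not either side carries the slash, so it is safe to apply to every source.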
I do not know how many of these issues are fixed. It is hard to track them when they are all posted under one issue. I am closing this for now. Please report source issues separately for each site.
I looked at the log file and found a lot of broken sources.
Connection to wuxiaworld.co timed out, but m.wuxiaworld.co works! Should home_url change from wuxiaworld.co to m.wuxiaworld.co?
Connection to novelcrush.com timed out. The crawler needs to be moved to _down.
Connection to readnovelz.net timed out. The site does not redirect; someone bought this domain. It should also be moved to _down.
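Since the failures fall into a few recurring kinds (403, 404, 451, DNS failure, timeout), a small script can sort log lines into those buckets so that each source can be reported separately. This is only a sketch: it assumes log lines shaped like the ones quoted in this thread, and the category labels are my own guesses at the likely cause, not confirmed diagnoses.

```python
import re
from collections import defaultdict

# Matches e.g. "requests.exceptions.HTTPError: 403 Client Error: ..."
STATUS_RE = re.compile(r"HTTPError: (\d{3}) ")

# Labels are guesses at the likely cause, not confirmed diagnoses.
STATUS_LABELS = {
    403: "forbidden (possibly anti-bot protection)",
    404: "not found (search URL may have changed)",
    451: "unavailable for legal reasons",
}


def classify(line):
    """Return a rough failure category for one log line, or None."""
    match = STATUS_RE.search(line)
    if match:
        code = int(match.group(1))
        return STATUS_LABELS.get(code, f"http {code}")
    if "getaddrinfo failed" in line:
        return "dns failure (domain may be gone)"
    if "timed out" in line:
        return "timeout"
    return None


def group_by_category(lines):
    """Group log lines by category, skipping lines that match nothing."""
    groups = defaultdict(list)
    for line in lines:
        category = classify(line)
        if category is not None:
            groups[category].append(line)
    return dict(groups)
```

Running `group_by_category(open("lncrawl.log"))` (the file name is an assumption) would give one list of lines per failure kind, which maps directly onto separate issues or onto candidates for _down.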
Tests needed: