bda-research / node-crawler

Web Crawler/Spider for NodeJS + server-side jQuery ;-)
MIT License

Crash ERR_INVALID_URL on Redirect #401

Closed: Ereaey closed this 2 years ago

Ereaey commented 2 years ago

I run several crawlers, and they occasionally crash with the same error.

I think the problem is an encoding issue in a redirect URL. I would like to at least be able to catch this error. Is there a solution? Thanks.

node:url:424
    throw new ERR_INVALID_URL(url);
    ^

TypeError [ERR_INVALID_URL]: Invalid URL
    at new NodeError (node:internal/errors:371:5)
    at Url.parse (node:url:424:15)
    at Object.urlParse [as parse] (node:url:157:13)
    at Redirect.onResponse (/home/pappers/crawler/node_modules/request/lib/redirect.js:108:21)
    at Request.onRequestResponse (/home/pappers/crawler/node_modules/request/request.js:986:22)
    at ClientRequest.emit (node:events:390:28)
    at HTTPParser.parserOnIncomingClient [as onIncoming] (node:_http_client:623:27)
    at HTTPParser.parserOnHeadersComplete (node:_http_common:128:17)
    at HTTPParser.execute ()
    at Socket.socketOnData (node:_http_client:487:22) {
  input: 'https://www.réduire-mes-impôts.com/',
  code: 'ERR_INVALID_URL'
}
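
The trace shows the exception being thrown from inside an HTTP response handler in the `request` dependency (`redirect.js`), so a `try`/`catch` around the crawler call would not see it. As a rough sketch only, a redirect target can be checked up front with Node's WHATWG `URL` class, which throws a catchable `TypeError` on invalid input and converts internationalized hostnames to their ASCII (punycode) form. The helper name `safeRedirectTarget` is hypothetical and is not part of node-crawler or `request`.

```javascript
// Hypothetical helper, not part of node-crawler or request.
// Returns a normalized absolute URL string, or null if the
// redirect target cannot be parsed.
function safeRedirectTarget(location, base) {
  try {
    // The WHATWG URL parser resolves a relative `location` against `base`
    // and converts internationalized hostnames to punycode.
    return new URL(location, base).href;
  } catch (err) {
    // Invalid input throws a TypeError with code 'ERR_INVALID_URL';
    // report it as null instead of letting the process crash.
    return null;
  }
}

// Example: inspect a Location header value before deciding to follow it.
const target = safeRedirectTarget('https://www.réduire-mes-impôts.com/');
console.log(target === null ? 'skip this redirect' : 'follow ' + target);
```

The `request` README documents that `followRedirect` may be a function that receives the response and returns `true` or `false`. Calling a check like this from such a function, and assuming node-crawler forwards that option to `request`, might let a crawler skip redirects it cannot parse. That wiring is an assumption and has not been verified against this crash.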

mike442144 commented 2 years ago

This is not related, so I will close the issue. Please reopen it if there are further updates.