Mercury Parser encountered a problem trying to parse that resource.
Error: ESOCKETTIMEDOUT
    at ClientRequest.<anonymous> (/home/thoraxe/node_modules/postman-request/request.js:1094:19)
    at Object.onceWrapper (events.js:421:28)
    at ClientRequest.emit (events.js:315:20)
    at TLSSocket.emitRequestTimeout (_http_client.js:784:9)
    at Object.onceWrapper (events.js:421:28)
    at TLSSocket.emit (events.js:327:22)
    at TLSSocket.Socket._onTimeout (net.js:483:8)
    at listOnTimeout (internal/timers.js:554:17)
    at processTimers (internal/timers.js:497:7) {
  code: 'ESOCKETTIMEDOUT',
  connect: false
}
Expected Behavior
Mercury Parser should fetch the page and extract the article.
Current Behavior
The request times out with ESOCKETTIMEDOUT (see the stack trace at the top of this report).
Steps to Reproduce
~/node_modules/.bin/mercury-parser https://www.nasdaq.com/articles/voyager-digital-reports-75-revenue-rise-in-q4-cites-increased-crypto-adoption-2021-01-05
Detailed Description
It appears that Nasdaq is somehow identifying the request as coming from a non-browser client and refusing to serve the page. curl gets an "Access denied" response, and Lynx never loads the page. The link definitely works in a regular browser.
Possible Solution
I am not sure how the Nasdaq server identifies that the request does not come from a real browser, but the request is definitely not getting through. I also tried spoofing the user agent:
~/node_modules/.bin/mercury-parser --header.User-Agent="Mozilla/5.0 (iPhone; CPU iPhone OS 10_3_1 like Mac OS X) AppleWebKit/603.1.30 (KHTML, like Gecko) Version/10.0 Mobile/14E304 Safari/602.1" https://www.nasdaq.com/articles/voyager-digital-reports-75-revenue-rise-in-q4-cites-increased-crypto-adoption-2021-01-05
I got the same timeout.
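Since the User-Agent alone made no difference, one more thing that might be worth trying is sending a fuller set of browser-like headers through the library instead of the CLI. The snippet below is only a sketch under assumptions: `buildBrowserHeaders` and `parseWithBrowserHeaders` are hypothetical helper names, the header values are illustrative, and it assumes the package is installed as `@postlight/mercury-parser` and that the `headers` option of `Mercury.parse` is used. If the server is blocking on something other than headers, this will most likely time out as well.

```javascript
// Hypothetical helper: builds a fuller set of browser-like request headers.
// The values are illustrative assumptions, not anything the server documents.
function buildBrowserHeaders(userAgent) {
  return {
    'User-Agent': userAgent,
    Accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
  };
}

// Hypothetical wrapper around Mercury.parse that passes the headers above.
async function parseWithBrowserHeaders(url, userAgent) {
  // Required lazily so the header helper can be used on its own.
  const Mercury = require('@postlight/mercury-parser');
  return Mercury.parse(url, { headers: buildBrowserHeaders(userAgent) });
}

// Example call (not executed here):
// parseWithBrowserHeaders(articleUrl, desktopUserAgent)
//   .then((result) => console.log(result))
//   .catch((err) => console.error(err.code || err.message));
```

If this also ends in ESOCKETTIMEDOUT, that would suggest the block is not based on these headers.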