laurengarcia / url-metadata

NPM module: Request a url and scrape the metadata from its HTML using Node.js or the browser.
https://www.npmjs.com/package/url-metadata
MIT License
166 stars, 43 forks

Some URLs, e.g. BBC website articles, return HTTP 403 status on url-metadata versions after 2.5.0 #87

Closed: openminded-oscar closed this 4 months ago

openminded-oscar commented 4 months ago

URLs like https://www.bbc.com/news/world-us-canada-68987314 worked with url-metadata v2.5.0 and returned a page preview.

It looks like the issue may be related to the replacement of the deprecated npm 'request' HTTP client. Since that change, these URLs return a 403 response status.

What do you think, can we do something about it?

laurengarcia commented 4 months ago

The headers changed, so you can try changing those to match what was going over the wire in 2.5.0. I don't have this issue, but I work from an IP address that does not query those URLs over and over. You can try using a VPN; that might help. There's nothing else I can do here.
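
For anyone trying the header suggestion above, here is a minimal sketch of sending custom request headers and checking the status code. It uses Node's built-in `fetch` (Node 18+) rather than url-metadata itself. The helper names `buildRequestHeaders` and `fetchStatus` are hypothetical, and the header values are illustrative assumptions, not the headers that 2.5.0 actually sent; capture those from a 2.5.0 install if you need an exact match.

```javascript
// Hypothetical helper: builds a header set to send with the request.
// The default values below are assumptions for illustration only;
// they are NOT the headers url-metadata 2.5.0 sent over the wire.
function buildRequestHeaders(overrides = {}) {
  return {
    'User-Agent': 'Mozilla/5.0 (compatible; example-preview-bot/1.0)',
    'Accept': 'text/html,application/xhtml+xml',
    'Accept-Language': 'en-US,en;q=0.9',
    // Caller-supplied headers win over the defaults above.
    ...overrides,
  };
}

// Hypothetical helper: requests the URL with the given headers and
// resolves to the HTTP status code (e.g. 200 or 403).
async function fetchStatus(url, headers = buildRequestHeaders()) {
  const response = await fetch(url, { headers });
  return response.status;
}

// Example usage (performs a real network request, so it is left commented out):
// fetchStatus('https://www.bbc.com/news/world-us-canada-68987314')
//   .then((status) => console.log('status:', status));
```

If a given header set stops producing the 403, the same object can be tried with url-metadata, assuming the installed version exposes an option for request headers (check its README). As noted above, the block may also depend on the requesting IP address, in which case changing headers alone will not help.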