Fix HTTP 406 errors in scraping tests

While running the test suite (npm test) for the html-metadata project, I found that two tests in scraping.js were failing with HTTP 406 errors:

- The "nested Twitter data from www.theguardian.com" test in the parseTwitter function
- The "should return an object or array and get correct data" test for The Guardian URL in the parseJsonLd function

These errors were causing the test suite to fail:

    1) scraping parseTwitter function nested Twitter data from www.theguardian.com:
       HTTPError: 406: http_error

    2) scraping parseJsonLd function https://www.theguardian.com/commentisfree/2024/mar/08/the-guardian-view-on-wikipedias-female-volunteers-a-hive-heroism-that-changes-history should return an object or array and get correct data:
       HTTPError: 406: http_error

Problem: The issue appears to be related to the User-Agent and Accept headers sent with the HTTP requests. HTTP 406 (Not Acceptable) means the server will not produce a response that matches what the request asks for, and some websites, including The Guardian, seem to reject requests that carry the default headers used by the preq library.
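
To illustrate the idea, here is a minimal sketch of building request options with explicit User-Agent and Accept headers. The helper name buildRequestOptions and both header values are hypothetical and not taken from the html-metadata code; the sketch also assumes preq accepts request-style options with uri and headers fields.

```javascript
'use strict';

// Hypothetical helper: the name and the default header values below are
// illustrative assumptions, not the project's actual code.
function buildRequestOptions(url, extraHeaders = {}) {
    const defaultHeaders = {
        // Identify the client explicitly instead of relying on library defaults.
        'User-Agent': 'html-metadata-tests/1.0 (contact: ops@example.org)',
        // Tell the server which content types the client is willing to receive.
        'Accept': 'text/html,application/xhtml+xml;q=0.9,*/*;q=0.8'
    };
    return {
        uri: url,
        // Caller-supplied headers take precedence over the defaults.
        headers: Object.assign({}, defaultHeaders, extraHeaders)
    };
}

const exampleOptions = buildRequestOptions('https://www.theguardian.com/');
// The options could then be handed to the HTTP client, for example:
// preq.get(exampleOptions).then((response) => { /* parse response.body */ });
```

Keeping the headers in one helper means every test sends the same values, and a single test can still override one header if a particular site needs it.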

wikimedia/html-metadata#95