wikimedia / html-metadata

MetaData html scraper and parser for Node.js (supports Promises and callback style)
MIT License
163 stars 44 forks

Fixing HTTP 406 error while running the test suite (npm test) #96

Closed Jacobojijo closed 1 month ago

Jacobojijo commented 1 month ago

This PR fixes issue #95.

Solution

I modified `scraping.js` to send more browser-like `User-Agent` and `Accept` headers with its requests. Here are the key changes:

  1. Added constants for the user agent and accept header:

    const userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36';
    const acceptHeader = 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8';

  2. Created a `getWithHeaders` function to make requests with these headers:

    ```javascript
    // Issue the GET request with the browser-like headers defined above.
    function getWithHeaders(url) {
        return preq.get({
            uri: url,
            headers: {
                'User-Agent': userAgent,
                'Accept': acceptHeader
            }
        });
    }
    ```

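
To illustrate why these headers can matter, here is a minimal, network-free sketch. HTTP 406 (Not Acceptable) is the status a server may return when it cannot produce a response matching the request's `Accept` header. The `buildRequestOptions` and `accepts` helpers below are hypothetical names introduced only for this illustration and are not part of html-metadata or `preq`. `accepts` is a deliberately simplified model of content negotiation that ignores `q` weights, and real servers may also reject requests based on the `User-Agent` alone.

```javascript
// Constants copied from the PR description.
const userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36';
const acceptHeader = 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8';

// Hypothetical helper: builds the same options object that
// getWithHeaders hands to preq.get, without sending a request.
function buildRequestOptions(url) {
    return {
        uri: url,
        headers: {
            'User-Agent': userAgent,
            'Accept': acceptHeader
        }
    };
}

// Hypothetical, simplified model of server-side negotiation:
// returns true if any media range in the Accept value covers mediaType.
// Assumption: q weights and media-type parameters are ignored.
function accepts(acceptValue, mediaType) {
    const type = mediaType.split('/')[0];
    return acceptValue.split(',').some((entry) => {
        const range = entry.split(';')[0].trim();
        return range === mediaType || range === `${type}/*` || range === '*/*';
    });
}

const options = buildRequestOptions('http://example.com');
// The browser-like Accept value covers HTML pages.
console.log(accepts(options.headers.Accept, 'text/html'));
// A narrower Accept value does not, so a strict server could answer 406.
console.log(accepts('application/json', 'text/html'));
```

Under this simplified model the first check passes and the second fails, which is the kind of mismatch a strict server might report as HTTP 406.
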
Jacobojijo commented 1 month ago

Hello @mvolz, I have made the requested changes to the pull request. Please take a look and let me know if there is anything else I should do.

Jacobojijo commented 1 month ago

Hello @mvolz, could you please review the changes made in this PR?

Jacobojijo commented 1 month ago

@mvolz, can you review my PR? I also think a style linter should be added to help with future changes. I'm thinking of introducing one to this project. What do you think?

Jacobojijo commented 1 month ago

@mvolz, I guess this PR is done