ministep / SQL_DataAnalysis

SQL Data Analysis

node.js - How to scrape JSON from puppeteer? - Stack Overflow #95

Open kemistep opened 4 years ago

kemistep commented 4 years ago

I log in to a site, and it sets a browser cookie.

I then go to a URL that returns a JSON response.

How do I scrape the page content after calling await page.goto('blahblahblah.json'); ?

asked Jan 29 '18 at 22:54

[Amy Coin](https://stackoverflow.com/users/9278676/amy-coin)

Another approach, which avoids intermittent issues, is to evaluate the page body once it is available and parse it as JSON, e.g.:

const puppeteer = require('puppeteer');

async function run() {
    // headless: false opens a visible browser window; omit it to run headless.
    const browser = await puppeteer.launch({ headless: false });
    const page = await browser.newPage();

    await page.goto('https://raw.githubusercontent.com/GoogleChrome/puppeteer/master/package.json');

    // Runs inside the page: parse the text of <body> and return the resulting object.
    const json = await page.evaluate(() => {
        return JSON.parse(document.querySelector('body').innerText);
    });

    console.log('json now contains the parsed JSON');
    console.log(json);

    await browser.close();
}

run();
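
The JSON.parse call above throws a SyntaxError when the body is not valid JSON, which can happen if, for example, the session cookie has expired and the site serves an HTML login page instead. A minimal sketch of a more defensive parse follows; `parseJsonBody` is a hypothetical helper name (not part of Puppeteer), and it assumes you return the raw body text from page.evaluate and parse it on the Node side.

```javascript
// Sketch only: `parseJsonBody` is a hypothetical helper, not a Puppeteer API.
// It takes the raw text of the page body and never throws on bad input.
function parseJsonBody(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    // Keep a short preview so it is easy to spot an HTML login page
    // or an error message where JSON was expected.
    return { ok: false, error: err.message, preview: text.slice(0, 80) };
  }
}
```

In the script above, this could be used by returning `document.querySelector('body').innerText` from page.evaluate unchanged and passing the result to `parseJsonBody`, then checking `ok` before using `value`.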

answered Jan 30 '18 at 19:55

[Rippo](https://stackoverflow.com/users/15410/rippo)

You can intercept the network response, like this:

const puppeteer = require('puppeteer');
const fs = require('fs');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // The handler fires for every response the page receives,
  // so skip anything that is not the JSON document we want.
  page.on('response', async response => {
    if (!response.url().endsWith('.json')) return;
    console.log('got response', response.url());
    const data = await response.buffer();
    fs.writeFileSync('/tmp/response.json', data);
  });
  await page.goto('https://raw.githubusercontent.com/GoogleChrome/puppeteer/master/package.json', { waitUntil: 'networkidle0' });
  await browser.close();
})();
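
When the JSON document is the page being navigated to, a listener may not be needed: page.goto resolves to the response for the main resource, and that response object has a json() method. The following is a minimal sketch under that assumption; `fetchJson` is a hypothetical helper name, and the error messages are illustrative only.

```javascript
// Sketch only: `fetchJson` is a hypothetical helper, not part of Puppeteer.
// `page` is expected to be a Puppeteer Page (or any object with a compatible goto()).
async function fetchJson(page, url) {
  // page.goto() resolves to the main resource's response; it can be null,
  // e.g. when navigating to about:blank.
  const response = await page.goto(url);
  if (response === null) {
    throw new Error(`No response received for ${url}`);
  }
  if (!response.ok()) {
    throw new Error(`Request for ${url} failed with status ${response.status()}`);
  }
  // response.json() reads the response body and parses it as JSON.
  return response.json();
}
```

With a real browser this would be called as `const data = await fetchJson(page, url);` after `browser.newPage()`. Taking `page` as a parameter keeps the helper independent of how the browser was launched or logged in.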

answered Jan 30 '18 at 8:21

[Pasi](https://stackoverflow.com/users/504811/pasi)



https://stackoverflow.com/questions/48511357/how-to-scrape-json-from-puppeteer