ryanmichaelhirst / instagram-scraper


Unhandled promise rejection warning: TimeoutError #2

Open Brlaney opened 4 years ago

Brlaney commented 4 years ago

Hello, I'm enjoying messing with this repo and believe I almost have it up and running on my setup. I'm on Windows 10 and running npm from Windows PowerShell (x86).

I'm fairly new to programming, especially JavaScript. Here is my command-line output:

// Begin error-message
(node:14348) UnhandledPromiseRejectionWarning: TimeoutError: waiting for selector "div[role="dialog"] > div:nth-child(2) > ul" failed: timeout 30000ms exceeded
    at new WaitTask (C:\edited_path_for_security\new-instagram-scraper\new-instagram-scraper\node_modules\puppeteer\lib\DOMWorld.js:388:34)
    at DOMWorld._waitForSelectorOrXPath (C:\edited_path_for_security\new-instagram-scraper\new-instagram-scraper\node_modules\puppeteer\lib\DOMWorld.js:313:26)
    at DOMWorld.waitForSelector (C:\edited_path_for_security\new-instagram-scraper\new-instagram-scraper\node_modules\puppeteer\lib\DOMWorld.js:296:21)
    at Frame.waitForSelector (C:\edited_path_for_security\new-instagram-scraper\new-instagram-scraper\node_modules\puppeteer\lib\FrameManager.js:384:51)
    at Frame. (C:\edited_path_for_security\new-instagram-scraper\new-instagram-scraper\node_modules\puppeteer\lib\helper.js:95:27)
    at Page.waitForSelector (C:\edited_path_for_security\new-instagram-scraper\new-instagram-scraper\node_modules\puppeteer\lib\Page.js:778:33)
    at script (C:\edited_path_for_security\new-instagram-scraper\new-instagram-scraper\server\script.js:31:16)
    at async C:\edited_path_for_security\new-instagram-scraper\new-instagram-scraper\server\index.js:18:18
(node:14348) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag --unhandled-rejections=strict (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:14348) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
// End error-message

Any insight into my issue would be greatly appreciated. This is everything I've done so far:

1.) entered into a new empty directory
2.) git clone "your repo address" & edited your login credentials to mine in the script.js file
3.) npm i package.json
4.) npm run build
5.) npm run dev
6.) went to localhost --> everything OK so far
7.) typed a username to obtain followers from
8.) headless browser popped up, logged in just fine, and redirected to the profile of the username I entered just fine
9.) Then the followers container popped up in the headless browser and nothing else happened; this is the point where the above error was output into my terminal

Thanks again. Sincerely, BRL

ryanmichaelhirst commented 4 years ago

Hey Brendan, I'm glad you enjoyed the video!

It looks like Instagram changed the HTML of their page, so the selectors in the code no longer match and the script isn't working properly anymore.

I attached a file below with some updated paths to get the script to run properly, but it's not capturing any of the followers / who the person is following.

To get that to work you're gonna have to open up the console and inspect the webpage to get the correct paths.
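One way to check candidate paths is to count how many elements each one matches. The snippet below is a minimal sketch, not code from the repo: `countMatches` is a hypothetical helper, and the two selectors are the old path from the error message and the path used in the updated script. Pasted into the browser console with the followers dialog open, it prints a small table; a count of 0 means that path no longer exists on the page.

```javascript
// Hypothetical helper (not part of the repo): count how many elements each
// candidate selector matches under `root`, so a dead path shows up as 0.
function countMatches(root, selectors) {
    return selectors.map(selector => ({
        selector,
        count: root.querySelectorAll(selector).length
    }));
}

// In the browser console, `document` is the root. The guard lets the same
// file load under Node, where `document` does not exist.
if (typeof document !== 'undefined') {
    console.table(countMatches(document, [
        'div[role="dialog"] > div:nth-child(2) > ul',       // path from the error message
        'div[role="dialog"] > div > div:nth-child(2) > ul'  // path from the updated script
    ]));
}
```

Once a path reports a non-zero count, it can be copied into the matching `page.$$` or `page.waitForSelector` call in script.js.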

Hope this helps!

On Mon, Jul 27, 2020 at 5:42 PM Brendan_webdev notifications@github.com wrote:


const puppeteer = require('puppeteer');

const script = async (username) => {
const browser = await puppeteer.launch({
    args: [
        '--incognito',
    ],
    headless: false
});
const page = await browser.newPage();
await page.goto('https://www.instagram.com/accounts/login', { waitUntil: "networkidle2" });
await page.type('input[name=username]', 'jessiejames12345678', { delay: 20 });
await page.type('input[name=password]', 'adminpassword1', { delay: 20 });
await page.click('button[type=submit]', { delay: 20 });
await page.waitFor(5000);

const notifyBtns = await page.$x("//button[contains(text(), 'Not Now')]");
if (notifyBtns.length > 0) {
    await notifyBtns[0].click();
} else {
    console.log("No notification buttons to click.");
}
await page.goto(`https://www.instagram.com/${username}`, { waitUntil: "networkidle2" });
// await page.click('a[href="/rmbhh/"]');
await page.waitFor(2000);
const followersBtn = await page.$('div[id=react-root] > section > main > div > header > section > ul > li:nth-child(2) > a');
await followersBtn.evaluate(btn => btn.click());

await page.waitFor(3000);
const followersDialog = 'div[role="dialog"] > div > div:nth-child(2)';
await page.waitForSelector('div[role="dialog"] > div > div:nth-child(2) > ul');
await scrollDown(followersDialog, page);

console.log("getting followers");
const list1 = await page.$$('div[role="dialog"] > div > div:nth-child(2) > ul > div > li > div > div > div:nth-child(2) > div > a');
let avatarPaths = [
    'div[role="dialog"] > div > div:nth-child(2) > ul > div > li > div > div > div > a > img',
    'div[role="dialog"] > div > div:nth-child(2) > ul > div > li > div > div > div > span > img'
];
const pics1 = await avatarPaths.reduce(async (accProm, path) => {
    const acc = await accProm;
    const arr = await page.$$eval(path, res => {
        return res.map(pic => {
            const alt = pic.getAttribute('alt');
            const strings = alt.split(/(['])/g);
            return {
                username: strings[0],
                avatar: pic.getAttribute('src')
            }
        })
    });
    return acc.concat([...arr]);
}, Promise.resolve([]));
const followers = await Promise.all(list1.map(async item => {
    const username = await (await item.getProperty('innerText')).jsonValue();
    const pic = pics1.find(p => p.username === username) || { avatar: "" };
    return {
        avatar: await pic.avatar,
        username
    }
}));

const closeBtn = await page.$('div[role="dialog"] > div > div > div > div:nth-of-type(2) > button');
await closeBtn.evaluate(btn => btn.click());

const followingBtn = await page.$('div[id=react-root] > section > main > div > header > section > ul > li:nth-child(3) > a');
await followingBtn.evaluate(btn => btn.click());

await page.waitFor(3000);
const followingDialog = 'div[role="dialog"] > div > div:nth-child(3)';
await page.waitForSelector('div[role="dialog"] > div > div:nth-child(3) > ul');
await scrollDown(followingDialog, page);

console.log("getting following");
const list2 = await page.$$('div[role="dialog"] > div > div:nth-child(3) > ul > div > li > div > div > div:nth-child(2) > div > a');
await page.waitForSelector('div[role="dialog"] > div > div:nth-child(3) > ul > div > li > div > div > div > div > a > img');
avatarPaths = [
    'div[role="dialog"] > div > div:nth-child(3) > ul > div > li > div > div > div > div > a > img',
    'div[role="dialog"] > div > div:nth-child(3) > ul > div > li > div > div > div > span > img'
]
const pics2 = await avatarPaths.reduce(async (accProm, path) => {
    const acc = await accProm;
    const arr = await page.$$eval(path, res => {
        return res.map(pic => {
            const alt = pic.getAttribute('alt');
            const strings = alt.split(/[']/g);
            return {
                username: strings[0],
                avatar: pic.getAttribute('src')
            }
        })
    });
    return acc.concat([...arr]);
}, Promise.resolve([]));
const following = await Promise.all(list2.map(async item => {
    const username = await (await item.getProperty('innerText')).jsonValue()
    const pic = pics2.find(p => p.username === username) || { avatar: "" };
    return {
        avatar: await pic.avatar,
        username
    };
}));

const followerCnt = followers.length;
const followingCnt = following.length;
console.log(`followers: ${followerCnt}`);
console.log(`following: ${followingCnt}`);

const notFollowingYou = following.filter(item => !followers.find(f => f.username === item.username));
const notFollowingThem = followers.filter(item => !following.find(f => f.username === item.username));
await browser.close();
return { 
    followerCnt, 
    followingCnt, 
    notFollowingYou, 
    notFollowingThem, 
    followers, 
    following
};

};

async function scrollDown(selector, page) {
    await page.evaluate(async selector => {
        const section = document.querySelector(selector);
        await new Promise((resolve, reject) => {
            let totalHeight = 0;
            let distance = 100;
            const timer = setInterval(() => {
                var scrollHeight = section.scrollHeight;
                section.scrollTop = 100000000;
                totalHeight += distance;

                if (totalHeight >= scrollHeight){
                    clearInterval(timer);
                    resolve();
                }
            }, 100);
        });
    }, selector);
}

module.exports = { script };
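
The script awaits `page.waitForSelector` without a catch block, and judging by the warning in the log nothing further up the call chain catches it either, so a stale selector surfaces as an UnhandledPromiseRejectionWarning. Below is a minimal sketch of one way to handle that. It assumes the rejected error carries the name `TimeoutError`, as the log output suggests; `waitForSelectorOrNull` is a hypothetical helper, not something the repo or Puppeteer provides.

```javascript
// Hypothetical helper (not part of the repo or of Puppeteer): wait for a
// selector and return null on timeout instead of leaving the rejection
// unhandled. Assumes timeout errors have name === 'TimeoutError'.
async function waitForSelectorOrNull(page, selector, timeout = 30000) {
    try {
        return await page.waitForSelector(selector, { timeout });
    } catch (err) {
        if (err.name === 'TimeoutError') {
            console.error(`Selector not found within ${timeout}ms: ${selector}`);
            return null;
        }
        // Anything other than a timeout is unexpected, so rethrow it.
        throw err;
    }
}
```

A caller could then check the result, for example `if (!(await waitForSelectorOrNull(page, followersDialog + ' > ul'))) { ... }`, and close the browser or return an empty result rather than crashing mid-run.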