sindresorhus / capture-website

Capture screenshots of websites
MIT License

hideElements and elements options not working #43

Closed (IsaacLaquerre closed this 4 years ago)

IsaacLaquerre commented 4 years ago

I'm creating a web scraper for a Discord bot, and I scrape the webpage that displays players' info on RealmEye (for a game called Realm of the Mad God).

What I'm trying to do, essentially, is get a character's image with a screenshot. (I would just scrape the URL of the image, but there's no href attribute; it's built with classes when the webpage loads.)

I used hideElements: ["#mys-content"] (hides ads) and elements: ".character" (to frame the picture around the character's image). But when I run the code, it doesn't hide the ads and it just takes a pic of the full page (as seen below).

image

Here's my code:

const captureWebsite = require('capture-website');

[...]

captureWebsite.file("https://www.realmeye.com/player/Vyle", "charRaw.png", {
    elements: ".character",
    hideElements: [
        "#mys-content"
    ],
    fullPage: true
}).then(() => {
    // Handling image
});

sindresorhus commented 4 years ago

Can you try the delay option? Set it to 20 seconds or something. The element might not exist yet when the website is loaded. Also see waitForElement, which will make it wait for that element.
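
For reference, here is a minimal sketch of how these options might be combined. It assumes the option names documented in the capture-website README: `delay` (in seconds), `waitForElement`, `hideElements`, and the singular `element` for clipping the screenshot to one DOM element. The selectors are the ones from this thread; `captureCharacter` is a hypothetical helper name, and `fullPage` is left out on the assumption that it is not needed when clipping to a single element.

```javascript
// Options sketch. The selectors come from this thread; the option names
// are assumed to match the capture-website README.
const options = {
    delay: 20, // seconds to wait after the page has loaded
    waitForElement: '.character', // wait until the target element is visible
    element: '.character', // clip the screenshot to this element (singular name)
    hideElements: ['#mys-content'], // selectors to hide before capturing
};

// Hypothetical helper: loads the library lazily, so the options object
// above can be inspected without launching a browser.
async function captureCharacter(url, outputPath) {
    const {default: captureWebsite} = await import('capture-website');
    await captureWebsite.file(url, outputPath, options);
}

// Usage (not run here):
// captureCharacter('https://www.realmeye.com/player/Vyle', 'charRaw.png');
```

The dynamic `import()` is used so the sketch should load both CommonJS and ES module releases of the package; with an older release, a plain `require('capture-website')` would work as well.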

IsaacLaquerre commented 4 years ago

I just tried, doesn't work :(

I tried putting a 20s delay and waited for the ad wrapper to load before hiding it

{
    delay: 20,
    waitForElement: "#mys-content",
    elements: ".character",
    hideElements: [
        "#mys-content"
    ],
    fullPage: true
}

image

IsaacLaquerre commented 4 years ago

Update: Even framing it around a simple <p class="entity-name"> doesn't work. The capture simply stays at position 0, 0 and takes a screenshot.

I changed the width and height options to 50 each (because the image I need is 50px by 50px) and it just takes a picture of the top left corner of the webpage

image

ghost commented 4 years ago

I believe that, instead of iterating over elements that may not exist on the page yet:

if (options.hideElements) {
    await Promise.all(options.hideElements.map(selector => page.$$eval(selector, hideElements)));
}

it could be more beneficial to inject a style element into the page, like this:

if (options.hideElements) await page.addStyleTag({
    content: `${options.hideElements.join(', ')} { display: none; }`
});

This way, the rule also applies to elements that are added after the hiding step runs, so an element that has not loaded yet still ends up hidden.
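
A slightly fuller sketch of the same idea, for illustration only. `buildHideCss` and `hideWithStyleTag` are hypothetical helper names, `page` is assumed to be a Puppeteer `Page`, and the `!important` flag is an addition not present in the snippet above, included so the rule is less likely to be overridden by the site's own styles.

```javascript
// Hypothetical helper: turns a list of selectors into a single CSS rule.
// Returns an empty string when there is nothing to hide.
function buildHideCss(selectors) {
    if (!Array.isArray(selectors) || selectors.length === 0) {
        return '';
    }

    return `${selectors.join(', ')} { display: none !important; }`;
}

// Hypothetical helper: injects the rule into the page.
// `page` is assumed to be a Puppeteer Page object.
async function hideWithStyleTag(page, selectors) {
    const content = buildHideCss(selectors);

    if (content) {
        await page.addStyleTag({content});
    }
}
```

One caveat with this approach: because all selectors share one rule, a single invalid selector in the list makes the browser discard the whole rule. Emitting one rule per selector would avoid that.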

sindresorhus commented 4 years ago

@krnik Good idea! 👌