ruipgil / scraperjs

A complete and versatile web scraper.
MIT License
3.71k stars, 188 forks

Generate scroll-down event to force images to load #8

Open rvernica opened 10 years ago

rvernica commented 10 years ago

Hello,

I am trying to use this to scrape some images from a website. The problem is that the image URLs are only generated once the user scrolls down the page and the images enter the viewport. Is it possible to generate a Page Down event (or Ctrl + End, or simply jump to the end of the page) inside the DynamicScraper?

Thanks!

ruipgil commented 10 years ago

Loading images is not intended, since it slows down scraping. Instead, just get their URLs and then download them.

rvernica commented 10 years ago

I don't want to load them; I just need their URLs. The problem is that the URLs are not generated until the images enter the viewport. When the page initially loads, the images at the bottom of the page have their URL set to "1px.png" or something like that. If I scroll down and the images enter the viewport, their URLs are generated and point to the right images.

So, I need to somehow scroll down the page so the JavaScript code runs and generates the URLs for these images.

Initially, even the top images don't have their URLs generated, but because they are in the viewport, their URLs get generated and I can scrape them correctly.

ruipgil commented 10 years ago

You can always generate DOM events with the DynamicScraper. But try to inspect the code and see where the information about the images is stored, and get that.

rvernica commented 9 years ago

The JavaScript for generating the image URLs is pretty complex and very hard to figure out.

Could you provide a small example on how to generate a DOM event?

Additionally, is it possible to specify a pixel height for the viewport? I assume there is some default width and height at which the page is loaded by the DynamicScraper.

ruipgil commented 9 years ago

There's this thread that might be useful, though not right now, since there's currently no proper way to set the viewport size.
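
As for generating a DOM event, a minimal sketch might look like the following. It assumes the function passed to `DynamicScraper`'s `scrape` is evaluated inside the page, so it can only rely on the page's own `window` and `document`. The function name `scrollAndCollectImageUrls` is made up for illustration. It also assumes the site's lazy loader reacts synchronously to a `scroll` event; if the loader is throttled or asynchronous, the URLs may still be placeholders when the function returns.

```javascript
// Hypothetical helper: runs inside the page, so it must be self-contained
// (no variables from the outer Node.js scope).
function scrollAndCollectImageUrls() {
  // Jump to the bottom of the document.
  var height = Math.max(
    document.body.scrollHeight,
    document.documentElement.scrollHeight
  );
  window.scrollTo(0, height);

  // Fire a synthetic scroll event as well. createEvent/initEvent is used
  // because older engines may not support the Event constructor.
  var evt = document.createEvent('Event');
  evt.initEvent('scroll', true, true);
  window.dispatchEvent(evt);

  // Read back whatever URLs the images carry at this point.
  var imgs = document.getElementsByTagName('img');
  var urls = [];
  for (var i = 0; i < imgs.length; i++) {
    urls.push(imgs[i].src);
  }
  return urls;
}

// Possible wiring, not executed here (needs scraperjs and PhantomJS installed;
// the URL is a placeholder):
//
// var scraperjs = require('scraperjs');
// scraperjs.DynamicScraper.create('http://example.com/gallery')
//   .scrape(scrollAndCollectImageUrls)
//   .then(function (urls) { console.log(urls); });
```

Because the scroll only covers the current viewport position, a page with many lazy images may need several smaller scroll steps rather than one jump to the bottom.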