jasan-s closed 7 years ago
Well, it depends on how you fetch the page. Are you using plain HTTP requests, or are you emulating a browser with something like PhantomJS, Puppeteer, or Electron?
@Yomguithereal I am just getting started with web scraping and this is the first tool I have used. I'm currently using request to make a GET request. I suppose I need to use a tool that emulates a browser. Can you recommend one?
For a long time, the tool to use was PhantomJS. But now that we have headless Chrome, I would recommend trying Puppeteer.
@Yomguithereal I tried Puppeteer and it's awesome. Thanks for the recommendation. Now, how can I use artoo.js with Puppeteer? I tried installing it with npm install artoo-js, but it's not working. I also posted a separate issue, #271.
For some reason, the scraped image URLs change when working with cheerio in Node, i.e. the original image URL is:
However, after scraping, the URL turns into this:
If I scrape it in the Chrome browser console using the artoo.js bookmarklet, the URL stays the same as the original. Why does it change when I use it in Node? Any suggestions?
I also posted on Stack Overflow.
Update: I think I found the issue, but not the solution. It seems the scraper method runs before the correct images have loaded on the page; the changed URL is just the placeholder image. How can I wait until the entire page loads?
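One possible way to wait for the page, sketched here with Puppeteer since it was recommended above: pass `waitUntil: 'networkidle0'` to `page.goto`, then read the image URLs from the live DOM with `page.$$eval`. This is only a minimal sketch, not a confirmed fix for this particular site. The helper name `scrapeImageUrls`, the `img` selector, and the `https://example.com` URL are placeholders I made up for illustration. If the site only swaps in the real images when they scroll into view, waiting for network idle alone may not be enough.

```javascript
// Hypothetical helper: load a page, wait for network activity to settle,
// then collect the src of every <img> from the rendered DOM.
// `browser` is assumed to be a Puppeteer Browser instance.
async function scrapeImageUrls(browser, url) {
  const page = await browser.newPage();
  try {
    // 'networkidle0' waits until there have been no network connections
    // for at least 500 ms, which gives late-loading images a chance to arrive.
    await page.goto(url, { waitUntil: 'networkidle0' });

    // The callback runs inside the page, so it sees the URLs the browser
    // ended up with rather than the placeholders in the initial HTML.
    return await page.$$eval('img', (imgs) => imgs.map((img) => img.src));
  } finally {
    await page.close();
  }
}

// Example entry point (requires `npm install puppeteer`).
async function main() {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    // Placeholder URL: replace with the page being scraped.
    const urls = await scrapeImageUrls(browser, 'https://example.com');
    console.log(urls);
  } finally {
    await browser.close();
  }
}
```

Calling `main()` runs the sketch end to end. If you would rather keep your existing cheerio code, `await page.content()` returns the rendered HTML after the wait, and you can hand that string to `cheerio.load` as before.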