Closed michaelforrest closed 6 years ago
Hello! Thanks for opening this issue.
In open-graph-scraper@3.2.0, I updated the module to send back an array of all of the images on the page if there aren't any Open Graph images.
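For readers who want to picture that fallback, here is a minimal sketch of the idea. It is not the module's actual implementation: `collectImages` is a hypothetical helper, and the regex-based parsing is a deliberate simplification that assumes quoted attribute values.

```javascript
// Hypothetical helper, NOT part of open-graph-scraper.
// Returns og:image values when present; otherwise falls back to every <img src>.
function collectImages(html) {
  const ogImages = [];
  const metaTag = /<meta\b[^>]*>/gi;
  let tag;
  while ((tag = metaTag.exec(html)) !== null) {
    // Simplified attribute parsing: assumes quoted attribute values.
    const property = /\sproperty\s*=\s*["']([^"']*)["']/i.exec(tag[0]);
    const content = /\scontent\s*=\s*["']([^"']*)["']/i.exec(tag[0]);
    if (property && content && property[1].toLowerCase() === "og:image") {
      ogImages.push(content[1]);
    }
  }
  if (ogImages.length > 0) {
    return { source: "og", images: ogImages };
  }

  // No Open Graph images found: collect every image on the page instead.
  const pageImages = [];
  const imgTag = /<img\b[^>]*\ssrc\s*=\s*["']([^"']*)["'][^>]*>/gi;
  let img;
  while ((img = imgTag.exec(html)) !== null) {
    pageImages.push(img[1]);
  }
  return { source: "page", images: pageImages };
}
```

Calling `collectImages(html)` on a page without `og:image` tags returns the `<img>` sources with `source` set to `"page"`, so a caller can tell which path was taken.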
@jshemas thanks, I think that’s definitely a good addition. Have you checked it against my example though? I think something more complicated might be happening here: when I view the source of the page the scraper sees, it doesn’t have the Open Graph tags yet, so I don’t know what the Facebook scraper is doing differently.
Hello, I'm not able to use the Facebook debugger since I don't have a Facebook account. Facebook is probably doing a lot more than just scraping Open Graph info. What other information would you like OGS to pull back?
I can probably do a one-off thing to get Amazon reviews, if that is what you need.
Hi @jshemas, again thanks for responding. It's not the reviews bit per se (that just seems to be something Amazon adds into its thumbnails) it's more just figuring out what's weird about how / when Amazon exposes og tags for its products. Seems weird what comes back from curl even.
It might not even be exposing og tags! But I'm sure I saw them on at least one product page!
No, it does seem that Amazon doesn't expose Open Graph tags after all, so Facebook must be making an exception to get this (maybe hitting an Amazon API). This all seems beyond the scope of open-graph-scraper, so I'm happy to close this issue.
In case anyone else needs this, a solution for Amazon can be found here: https://github.com/jhy/jsoup/issues/976. You just need to set the User-Agent header.
If you're a noob like me, here is the full answer, following @TylerAHolden's guidance:
const ogs = require("open-graph-scraper");

const options = {
  url: "https://amzn.to/2Is8sCR",
  headers: {
    // Identify as a crawler so Amazon serves the page with Open Graph tags
    "user-agent": "Googlebot/2.1 (+http://www.google.com/bot.html)",
  },
};

ogs(options, (error, results, response) => {
  console.log("error:", error); // true if there was an error, false otherwise; the error itself is inside the results object
  console.log("results:", results); // contains all of the Open Graph results
  console.log("response:", response); // contains the HTML of the page
});
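If you'd rather not repeat the options object each time, the same call can be wrapped in a small promise-returning helper. This is only a sketch: `scrapeWithUserAgent` is a hypothetical name, and it assumes the callback signature shown above (`error` is a boolean, with details in `results`). The scraper function is passed in as an argument so the wrapper doesn't depend on any particular version of the module.

```javascript
// Hypothetical wrapper around the callback-style API used in this thread.
// `ogs` is whatever require("open-graph-scraper") returned.
function scrapeWithUserAgent(ogs, url, userAgent) {
  const options = {
    url,
    headers: { "user-agent": userAgent },
  };
  return new Promise((resolve, reject) => {
    ogs(options, (error, results, response) => {
      if (error) {
        // Assumption: on failure, the error details live in `results`.
        reject(results);
      } else {
        resolve({ results, response });
      }
    });
  });
}
```

Usage would look like `scrapeWithUserAgent(ogs, "https://amzn.to/2Is8sCR", "Googlebot/2.1 (+http://www.google.com/bot.html)").then(({ results }) => console.log(results))`, which also makes it easy to compare what different User-Agent strings get back.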
When I scrape this Amazon url: https://amzn.to/2Is8sCR, I expect to see the same results as the Facebook debugger (here) which follows a redirect to here, but for some reason I always get this:
Here's what I see
Here's what I want
Thanks, hope somebody can help!