alexpluto closed this issue 3 years ago
Here is another link that is failing: https://snowflakegelato.co.uk/
This link also has a title, hero image, etc.
That's actually not a metascraper issue.
For example, check the rules under metascraper-description. These rules are applied, in order, against the HTML markup of the target URL until the first rule that returns a valid value is found.
If the markup of the target URL doesn't match any of these rules, then metascraper doesn't find any value to extract.
These target URLs have very poor HTML markup in terms of sharing.
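To illustrate the "first rule with a valid value" behaviour described above, here is a minimal sketch. It is not metascraper's actual implementation: the `metaContent` and `firstValid` helpers and the rule list are hypothetical, and the rules use a regular expression on the raw HTML (double-quoted attributes only) instead of a parsed DOM.

```javascript
// Hypothetical helper: builds a rule that reads the content attribute of a
// <meta> tag identified by attr="name". Handles double-quoted attributes only.
const metaContent = (attr, name) => html => {
  const tag = html.match(new RegExp(`<meta[^>]*${attr}="${name}"[^>]*>`, 'i'))
  if (!tag) return undefined
  const content = tag[0].match(/content="([^"]*)"/i)
  return content ? content[1].trim() : undefined
}

// Hypothetical, simplified rule list for the description field, in priority order.
const descriptionRules = [
  metaContent('property', 'og:description'),
  metaContent('name', 'twitter:description'),
  metaContent('name', 'description')
]

// Walk the rules in order and return the first non-empty value, if any.
const firstValid = (rules, html) => {
  for (const rule of rules) {
    const value = rule(html)
    if (value) return value
  }
  return undefined
}
```

With this sketch, a page that only declares `og:title` yields `undefined` for the description, because none of the rules match its markup.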
@Kikobeats thanks for the quick reply! Really appreciate it.
Strangely, with the All Trails link (https://www.alltrails.com/lists/kate-agnew-ny-trails), I can see og:title and og:image, but og:description is missing.
Shouldn't these fields be returned, even if description fails?
Is the error "Error: HTTPError: Response code 403 (Forbidden)" caused by the missing description?
It's a network issue related to antibot protection that the target URL has.
To be able to fetch the content there, I recommend you set up your own proxy service with Microlink: https://microlink.io/docs/api/parameters/proxy
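A rough sketch of why no field comes back at all in this case: the HTML has to be fetched before any rule can run, so a 403 stops the process before title, image, or description are looked at. Everything here is hypothetical and for illustration only: the `HTTPError` class stands in for the error thrown by the HTTP client, `extract` stands in for the metadata rules, and the two fetchers are stubs returning made-up markup rather than real requests.

```javascript
// Hypothetical error type standing in for the HTTPError thrown by the HTTP client.
class HTTPError extends Error {
  constructor (statusCode) {
    super(`Response code ${statusCode}`)
    this.name = 'HTTPError'
    this.statusCode = statusCode
  }
}

// Hypothetical extractor standing in for the metadata rules.
// It can only run once the HTML is available.
const extract = html => {
  const og = name => {
    const match = html.match(new RegExp(`<meta property="og:${name}" content="([^"]*)"`, 'i'))
    return match ? match[1] : null
  }
  return { title: og('title'), image: og('image'), description: og('description') }
}

// fetchHtml is injected so a direct fetcher can be swapped for a proxied one.
const getMetadata = async (url, fetchHtml) => {
  const html = await fetchHtml(url) // a 403 rejects here, before any rule is evaluated
  return extract(html)
}

// Stub fetchers with made-up behaviour: one blocked by antibot protection,
// one that gets through (as a proxied request might).
const blockedFetch = async () => { throw new HTTPError(403) }
const proxiedFetch = async () => '<meta property="og:title" content="Example title">'
```

Calling `getMetadata(url, blockedFetch)` rejects with the 403 and returns nothing, while `getMetadata(url, proxiedFetch)` resolves with the title filled in and `description` set to `null`. A missing description on its own never produces an error in this sketch.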
@Kikobeats thank you! I'll look into this, we do use a proxy for other requests.
Hey,
I wrote a blogpost explaining what's happening there:
Great post, thank you!
Prerequisites
package.json
Subject of the issue
When passing this link in: https://www.alltrails.com/lists/kate-agnew-ny-trails
An error is returned, but when inspecting the HTML for this page, it has metadata.
Error: HTTPError: Response code 403 (Forbidden)
Steps to reproduce
Get metadata for: https://www.alltrails.com/lists/kate-agnew-ny-trails
Expected behaviour
Metadata is returned
Actual behaviour
Error returned: https://www.alltrails.com/lists/kate-agnew-ny-trails