ageitgey / node-unfluff

Automatically extract body content (and other cool stuff) from an html document
Apache License 2.0
2.15k stars, 221 forks

Parse Page Schema.org Data #103

Open ISNIT0 opened 5 years ago

ISNIT0 commented 5 years ago

Google recommends that pages include schema.org structured data: https://developers.google.com/search/docs/guides/intro-structured-data

Specifically, I'm interested in ClaimReview data (https://schema.org/ClaimReview). More generally, this structured data overlaps significantly with the data Unfluff already extracts, so it could be used to augment, serve as a fallback for, or verify the existing <meta> data.
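To make the request concrete, here is a minimal sketch of what pulling schema.org data out of a page could look like. It assumes the data is embedded as JSON-LD in `<script type="application/ld+json">` tags (Microdata and RDFa are not covered). The function names `extractJsonLd`, `findByType`, and `titleWithFallback` are hypothetical and not part of Unfluff's API, the sample page is made up, and the regex scan is a simplification that a real implementation would probably replace with a proper HTML parser.

```javascript
// Hypothetical helper: collect every JSON-LD node found in an HTML string.
// A regex keeps the sketch dependency-free; it is not a robust HTML parser.
function extractJsonLd(html) {
  const scriptRe =
    /<script\b[^>]*\btype\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const items = [];
  let match;
  while ((match = scriptRe.exec(html)) !== null) {
    let parsed;
    try {
      parsed = JSON.parse(match[1]);
    } catch (err) {
      continue; // skip malformed JSON-LD blocks instead of failing the whole page
    }
    // A block may hold a single node, an array of nodes, or an @graph wrapper.
    let nodes = [parsed];
    if (Array.isArray(parsed)) {
      nodes = parsed;
    } else if (parsed && Array.isArray(parsed['@graph'])) {
      nodes = parsed['@graph'];
    }
    for (const node of nodes) {
      if (node && typeof node === 'object') items.push(node);
    }
  }
  return items;
}

// Hypothetical helper: keep nodes whose @type is (or includes) typeName.
function findByType(items, typeName) {
  return items.filter((item) => {
    const type = item['@type'];
    return Array.isArray(type) ? type.includes(typeName) : type === typeName;
  });
}

// Hypothetical helper: prefer the already-extracted title, else the JSON-LD headline.
function titleWithFallback(existingTitle, items) {
  if (existingTitle) return existingTitle;
  const withHeadline = items.find((item) => typeof item.headline === 'string');
  return withHeadline ? withHeadline.headline : null;
}

// Made-up sample page, for illustration only.
const sampleHtml = `
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "NewsArticle", "headline": "Example headline"}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "ClaimReview",
 "claimReviewed": "The sky is green",
 "reviewRating": {"@type": "Rating", "alternateName": "False"}}
</script>
</head><body></body></html>`;

const items = extractJsonLd(sampleHtml);
const claimReviews = findByType(items, 'ClaimReview');
console.log(claimReviews[0].claimReviewed);
console.log(titleWithFallback('', items));
```

The fallback helper only illustrates the "augment or fall back" idea for one field; which source should win when both are present is an open design question.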

melvincarvalho commented 4 years ago

I'm interested in this too. Is this library actively maintained?