j0k3r / graby

Graby helps you extract article content from web pages
MIT License

Handle multiple ld+json graph #237

Open j0k3r opened 4 years ago

j0k3r commented 4 years ago

For https://www.dissentmagazine.org/article/why-the-left-needs-liberals, the page's ld+json nests several items (a WebSite and a WebPage) inside one "@graph" array:

{
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "WebSite",
            "@id": "https://www.dissentmagazine.org/#website",
            "url": "https://www.dissentmagazine.org/",
            "name": "Dissent Magazine",
            "description": "An independent quarterly magazine, publishing some of America's most exciting long-form political and cultural criticism since 1954.",
            "potentialAction": [
                {
                    "@type": "SearchAction",
                    "target": "https://www.dissentmagazine.org/?s={search_term_string}",
                    "query-input": "required name=search_term_string"
                }
            ],
            "inLanguage": "en-US"
        },
        {
            "@type": "WebPage",
            "@id": "https://www.dissentmagazine.org/article/why-the-left-needs-liberals#webpage",
            "url": "https://www.dissentmagazine.org/article/why-the-left-needs-liberals",
            "name": "Why the Left Needs Liberals | Dissent Magazine",
            "isPartOf": {
                "@id": "https://www.dissentmagazine.org/#website"
            },
            "datePublished": "2019-10-07T16:56:07+00:00",
            "dateModified": "2019-10-07T16:56:07+00:00",
            "inLanguage": "en-US",
            "potentialAction": [
                {
                    "@type": "ReadAction",
                    "target": [
                        "https://www.dissentmagazine.org/article/why-the-left-needs-liberals"
                    ]
                }
            ]
        }
    ]
}
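To make the expected handling concrete, here is a minimal sketch of flattening such a payload so that nodes inside "@graph" are treated the same way as a single top-level object. It is written in Python for illustration only and is not Graby's PHP implementation. The helper names `iter_ld_nodes` and `first_value` are hypothetical, and the sample payload is a shortened copy of the JSON above.

```python
import json
from typing import Any, Iterator, List, Optional


def iter_ld_nodes(payload: Any) -> Iterator[dict]:
    """Yield the top-level typed nodes of a decoded ld+json payload.

    Handles three shapes: a single object, a list of objects, and an
    object whose nodes are wrapped in an "@graph" array. Nested values
    such as "potentialAction" are left inside their parent node.
    """
    if isinstance(payload, list):
        for item in payload:
            yield from iter_ld_nodes(item)
    elif isinstance(payload, dict):
        graph = payload.get("@graph")
        if isinstance(graph, list):
            yield from iter_ld_nodes(graph)
        elif "@type" in payload:
            yield payload


def first_value(nodes: List[dict], key: str) -> Optional[Any]:
    """Return the value of `key` from the first node that defines it."""
    for node in nodes:
        if key in node:
            return node[key]
    return None


# Shortened copy of the payload from this issue.
raw = """
{
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "WebSite",
            "@id": "https://www.dissentmagazine.org/#website",
            "name": "Dissent Magazine"
        },
        {
            "@type": "WebPage",
            "@id": "https://www.dissentmagazine.org/article/why-the-left-needs-liberals#webpage",
            "name": "Why the Left Needs Liberals | Dissent Magazine",
            "datePublished": "2019-10-07T16:56:07+00:00"
        }
    ]
}
"""

nodes = list(iter_ld_nodes(json.loads(raw)))
print([node["@type"] for node in nodes])
print(first_value(nodes, "datePublished"))
```

Taking the first node that defines a key is only one possible policy. Here it would return the site name ("Dissent Magazine") before the page title, so a real fix would probably need to prefer nodes by "@type".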

Related: https://github.com/wallabag/wallabag/issues/4378