bbc / simorgh

The BBC's Open Source Web Application. Contributions welcome! Used on some of our biggest websites, e.g.
https://www.bbc.com/pidgin

Add Schema.org metadata #577

Closed BogdanDogaru closed 5 years ago

BogdanDogaru commented 6 years ago

Is your feature request related to a problem? Please describe.

Our article page is currently missing metadata. This PR will add all the available Schema.org metadata.

NB: the same solution can be applied to AMP, as it allows JSON-LD scripts.

Describe the solution you'd like

Add the following fields, including the logic necessary to render them correctly:

The final structure should look like this (from Jim's comment below):

<script type="application/ld+json">
    {
      "@context": "http://schema.org",
      "@type": "ReportageNewsArticle",
      "url": "https://www.bbc.com/news/articles/[OPTIMO ID]",
      "publisher": {
        "@type": "NewsMediaOrganization",
        "name": "BBC News",
        "publishingPrinciples": "http://www.bbc.com/news/help-41670342",
        "logo": {
          "@type": "ImageObject",
          "width": 1024,
          "height": 576,
          "url": "https://www.bbc.com/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1"
        }
      },
      "datePublished": "[FIRSTPUB e.g. 2018-09-05T14:35:15+01:00]",
      "dateModified": "[LASTPUB e.g. 2018-09-05T14:35:15+01:00]",
      "headline": "[SEO HEADLINE]",
      "image": {
        "@type": "ImageObject",
        "width": 1024,
        "height": 576,
        "url": "https://www.bbc.com/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1"
      },
      "thumbnailUrl": "https://www.bbc.com/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1",
      "author": {
        "@type": "NewsMediaOrganization",
        "name": "BBC News",
        "logo": {
          "@type": "ImageObject",
          "width": 1024,
          "height": 576,
          "url": "https://www.bbc.com/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1"
        },
        "noBylinesPolicy": "http://www.bbc.com/news/help-41670342#authorexpertise"
      }

    }

    </script>
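As a rough illustration of how the block above could be produced from article data, here is a minimal TypeScript sketch. The `ArticleMetadata` shape and the `buildArticleLinkedData` name are assumptions made for this example and are not taken from the Simorgh codebase; the hard-coded values are copied from the sample above.

```typescript
// Hypothetical input shape: these field names are assumptions,
// not Simorgh's real article data model.
interface ArticleMetadata {
  optimoId: string;
  firstPublished: string; // ISO 8601, e.g. 2018-09-05T14:35:15+01:00
  lastPublished: string; // ISO 8601
  seoHeadline: string;
}

// Values hard-coded for now, copied from the sample block above.
const BBC_NEWS_LOGO = {
  "@type": "ImageObject",
  width: 1024,
  height: 576,
  url: "https://www.bbc.com/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1",
};

// Builds the JSON-LD object; image and author are siblings of publisher.
function buildArticleLinkedData(article: ArticleMetadata) {
  return {
    "@context": "http://schema.org",
    "@type": "ReportageNewsArticle",
    url: `https://www.bbc.com/news/articles/${article.optimoId}`,
    publisher: {
      "@type": "NewsMediaOrganization",
      name: "BBC News",
      publishingPrinciples: "http://www.bbc.com/news/help-41670342",
      logo: BBC_NEWS_LOGO,
    },
    datePublished: article.firstPublished,
    dateModified: article.lastPublished,
    headline: article.seoHeadline,
    image: BBC_NEWS_LOGO,
    thumbnailUrl: BBC_NEWS_LOGO.url,
    author: {
      "@type": "NewsMediaOrganization",
      name: "BBC News",
      logo: BBC_NEWS_LOGO,
      noBylinesPolicy: "http://www.bbc.com/news/help-41670342#authorexpertise",
    },
  };
}
```

Passing the result through `JSON.stringify` yields valid JSON with no trailing comma, which is easy to get wrong when the block is written by hand.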

Additional context

Some fields might require further discussion, and we will initially hard-code them.

Testing notes

| Optimo Category | Type |
| --- | --- |
| Analysis | AnalysisNewsArticle |
| Ask The Audience | AskPublicNewsArticle |
| Explainer | BackgroundNewsArticle |
| Opinion | OpinionNewsArticle |
| News or Feature | ReportageNewsArticle |
| Review | ReportageNewsArticle |
| Fact check | ReportageNewsArticle |
| Summary | ReportageNewsArticle |
| Polls and Surveys | ReportageNewsArticle |
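
The category mapping above could be expressed as a simple lookup. This is only a sketch: the `schemaTypeForCategory` name is hypothetical, treating "News or Feature" as two separate category strings is an assumption, and so is falling back to ReportageNewsArticle for unknown categories.

```typescript
// Lookup from Optimo category to Schema.org article type, as listed above.
// Assumption: "News or Feature" means two distinct category values.
const SCHEMA_TYPE_BY_CATEGORY: Record<string, string> = {
  "Analysis": "AnalysisNewsArticle",
  "Ask The Audience": "AskPublicNewsArticle",
  "Explainer": "BackgroundNewsArticle",
  "Opinion": "OpinionNewsArticle",
  "News": "ReportageNewsArticle",
  "Feature": "ReportageNewsArticle",
  "Review": "ReportageNewsArticle",
  "Fact check": "ReportageNewsArticle",
  "Summary": "ReportageNewsArticle",
  "Polls and Surveys": "ReportageNewsArticle",
};

// Assumption: unknown categories fall back to the most common type.
const DEFAULT_SCHEMA_TYPE = "ReportageNewsArticle";

function schemaTypeForCategory(category: string): string {
  // hasOwnProperty guards against inherited keys such as "constructor".
  return Object.prototype.hasOwnProperty.call(SCHEMA_TYPE_BY_CATEGORY, category)
    ? SCHEMA_TYPE_BY_CATEGORY[category]
    : DEFAULT_SCHEMA_TYPE;
}
```
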
BogdanDogaru commented 6 years ago

~Blocked waiting on #573~

sareh commented 6 years ago

For context, this is what currently exists on a live News article (https://www.bbc.co.uk/news/uk-45421445):

  <script type="application/ld+json">
    {
      "@context": "http://schema.org",
      "@type": "ReportageNewsArticle",
      "url": "https://www.bbc.co.uk/news/uk-45421445",
      "publisher": {
        "@type": "NewsMediaOrganization",
        "name": "BBC News",
        "publishingPrinciples": "http://www.bbc.co.uk/news/help-41670342",
        "logo": {
          "@type": "ImageObject",
          "url": "https://www.bbc.co.uk/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1"
        }
      },
      "datePublished": "2018-09-05T14:35:15+01:00",
      "dateModified": "2018-09-05T14:35:15+01:00",
      "headline": "Novichok attack Russian 'agents' named",
      "image": {
        "@type": "ImageObject",
        "width": 720,
        "height": 405,
        "url": "https://ichef.bbci.co.uk/images/ic/720x405/p06kbpnd.jpg"
      },
      "thumbnailUrl": "https://ichef.bbci.co.uk/images/ic/208x117/p06kbpnd.jpg",
      "author": {
        "@type": "NewsMediaOrganization",
        "name": "BBC News",
        "logo": {
          "@type": "ImageObject",
          "url": "https://www.bbc.co.uk/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1"
        },
        "noBylinesPolicy": "http://www.bbc.co.uk/news/help-41670342#authorexpertise"
      },
      "mainEntityOfPage": "https://www.bbc.co.uk/news/uk-45421445",
      "video": {
        "@type": "VideoObject",
        "name": "Police make appeal over Novichok suspects",
        "description": "The public have been asked to contact police if they have seen Alexander Petrov or Ruslan Boshirov, who are wanted over the attempted murder of former Russian spy Sergei Skripal and his daughter Yulia.\nThe suspects are thought to have been using the names as aliases and are about 40.\nMr Skripal, 66, and his daughter Yulia, 33, were poisoned with nerve agent Novichok in March.\nScotland Yard's Assistant Commissioner Neil Basu, the head of UK counter-terrorism policing, described the suspects' movements as he showed journalists CCTV images from the time of the initial poisoning.",
        "duration": "PT4M39S",
        "thumbnailUrl": "https://ichef.bbci.co.uk/images/ic/208x117/p06kbpnd.jpg",
        "uploadDate": "2018-09-05T12:33:37+01:00"
      }
    }

    </script>
katikaa commented 6 years ago

Hello - it was mentioned during refinement that it may be better to follow a different approach for our schema metadata. In the current article page we output JSON-LD, but it may be more beneficial to use RDFa or microdata. Our resident SEO experts are Rob Millard and Tom Chandler. Could whoever picks up this ticket chat to either of them to decide the approach, and then amend the AC accordingly? At the moment the AC are written in JSON-LD. Thanks :)

BogdanDogaru commented 6 years ago

Blocked on the discussion we need to have with Rob or Tom.

jimjohnsonrollings commented 5 years ago

NB. We should add articleBody to this list as we are no longer going to try it on PAL first.

jimjohnsonrollings commented 5 years ago

Also, in the example in the description, image and author should not be children of publisher.

ghost commented 5 years ago

JSON-LD would be the way to go, based on search engine best practice.

Some documentation:
- Google Structured Data guide (with JSON-LD called out as recommended): https://developers.google.com/search/docs/guides/intro-structured-data
- Bing confirmation on JSON-LD: https://searchengineland.com/bing-confirmed-support-json-ld-formatted-schema-org-markup-293508
- Extra, some Google patents reference JSON-LD: http://www.seobythesea.com/2018/06/structured-data-json-ld/

Let me know if I can help

ChrisBAshton commented 5 years ago

To clarify, we must use JSON-LD, not RDFa or microdata. We cannot have a flexible renderer of articles and also tie metadata to the HTML structure.
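
To illustrate why JSON-LD keeps the metadata independent of the markup, here is a minimal sketch that serialises a data object into a standalone script tag. The `renderLinkedDataScript` name is hypothetical, and escaping `<` is a general precaution for inline scripts rather than something agreed in this thread.

```typescript
// Serialises any JSON-LD object into a self-contained script tag.
// Nothing here depends on how the article body itself is rendered.
function renderLinkedDataScript(data: object): string {
  // Escape "<" so a value containing "</script>" cannot close the tag early.
  // "\u003c" is a valid JSON escape, so parsers still read it as "<".
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```

The same string can be emitted on canonical and AMP pages, since neither requires the metadata to be woven into the article's HTML elements.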

jimjohnsonrollings commented 5 years ago

Updated sample JSON-LD block for the v1.0 release:

<script type="application/ld+json">
    {
      "@context": "http://schema.org",
      "@type": "ReportageNewsArticle",
      "url": "https://www.bbc.com/news/articles/[OPTIMO ID]",
      "publisher": {
        "@type": "NewsMediaOrganization",
        "name": "BBC News",
        "publishingPrinciples": "http://www.bbc.com/news/help-41670342",
        "logo": {
          "@type": "ImageObject",
          "width": 1024,
          "height": 576,
          "url": "https://www.bbc.com/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1"
        }
      },
      "datePublished": "[FIRSTPUB e.g. 2018-09-05T14:35:15+01:00]",
      "dateModified": "[LASTPUB e.g. 2018-09-05T14:35:15+01:00]",
      "headline": "[SEO HEADLINE]",
      "image": {
        "@type": "ImageObject",
        "width": 1024,
        "height": 576,
        "url": "https://www.bbc.com/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1"
      },
      "thumbnailUrl": "https://www.bbc.com/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1",
      "author": {
        "@type": "NewsMediaOrganization",
        "name": "BBC News",
        "logo": {
          "@type": "ImageObject",
          "width": 1024,
          "height": 576,
          "url": "https://www.bbc.com/news/special/2015/newsspec_10857/bbc_news_logo.png?cb=1"
        },
        "noBylinesPolicy": "http://www.bbc.com/news/help-41670342#authorexpertise"
      }

    }

    </script>

Notes:

jimjohnsonrollings commented 5 years ago

Google's Structured Data Testing Tool: https://search.google.com/structured-data/testing-tool/u/0/

pjlee11 commented 5 years ago

Could we please ensure the following is provided using the ServiceContext, to future-proof the implementation for services other than BBC News?

pjlee11 commented 5 years ago

For publishingPrinciples and noBylinesPolicy on the current site, we link to an English help page for World Service articles. When doing this work we should decide whether to hardcode these values or duplicate the URL in each service config (or have some nice fallback logic in the service config).

jimjohnsonrollings commented 5 years ago

> For publishingPrinciples and noBylinesPolicy on the current site, we link to an English help page for World Service articles. When doing this work we should decide whether to hardcode these values or duplicate the URL in each service config (or have some nice fallback logic in the service config).

It would be good to have the option of adding translated versions of these in the future. At the moment that isn't necessary, as the position is that we always link to the English-language version.
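
One way the fallback could look, as a sketch only: the `ServicePolicyConfig` shape and the `resolvePolicyUrls` name are assumptions and do not reflect the actual service config. Each service may override either URL, and anything left unset falls back to the English-language page.

```typescript
// Hypothetical per-service config: both fields are optional overrides.
interface ServicePolicyConfig {
  publishingPrinciples?: string;
  noBylinesPolicy?: string;
}

// English-language defaults, taken from the sample JSON-LD in this issue.
const DEFAULT_POLICY_URLS = {
  publishingPrinciples: "http://www.bbc.com/news/help-41670342",
  noBylinesPolicy: "http://www.bbc.com/news/help-41670342#authorexpertise",
};

// Uses the service's own URL when set, otherwise the English default.
function resolvePolicyUrls(config: ServicePolicyConfig) {
  return {
    publishingPrinciples:
      config.publishingPrinciples || DEFAULT_POLICY_URLS.publishingPrinciples,
    noBylinesPolicy:
      config.noBylinesPolicy || DEFAULT_POLICY_URLS.noBylinesPolicy,
  };
}
```

With this shape, translated pages could be added later by setting a field on one service's config, without touching the others.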