Closed lizcameron closed 4 years ago
To achieve this, we would need to make slight modifications to the media player metadata helper.
Basically, where a page has multiple pieces of audio/video content, the helper would need to return a single metadata object for each matching piece of content.
Each returned metadata object also needs the property '@context': 'http://schema.org' so that it is picked up as a valid schema.org schema (see more...), since each will be a single JSON-LD object.
Returning a valid AudioObject/VideoObject means we don't have to set the @listContent property as we currently do, because these get aggregated and distinguished by the validation tool.
In a nutshell, the change to the helper would be like the snippet below:
import { pathOr } from 'ramda';

// getThumbnailUri is assumed to be the existing helper available in this module.
const mediaPlayerMetadata = blocks => {
  const aresMediaBlocks = pathOr(null, ['model', 'blocks'], blocks);

  if (!aresMediaBlocks || aresMediaBlocks.length < 1) {
    return null;
  }

  const aresMetadataBlocks = aresMediaBlocks.filter(
    block => block.type === 'aresMediaMetadata',
  );
  const metadataBlock = aresMetadataBlocks[0];
  const format = pathOr(null, ['model', 'format'], metadataBlock);
  // Anything that is not audio is treated as video.
  const type = format === 'audio' ? 'AudioObject' : 'VideoObject';

  const metadata = {
    '@context': 'http://schema.org',
    '@type': type,
    name: pathOr(null, ['model', 'title'], metadataBlock),
    description: pathOr(null, ['model', 'synopses', 'short'], metadataBlock),
    // Array indices in a path are plain integers (0), not arrays ([0]).
    duration: pathOr(
      null,
      ['model', 'versions', 0, 'duration'],
      metadataBlock,
    ),
    thumbnailUrl: getThumbnailUri(metadataBlock),
    uploadDate: pathOr(
      null,
      ['model', 'versions', 0, 'availableFrom'],
      metadataBlock,
    ),
  };

  return metadata;
};
Making this change and using the tool to validate this page (http://localhost:7080/news/articles/c3wmq4d1y3wo) should give something that looks like this:
We also need to change most of the properties we're setting: as you can see, warnings/errors are shown when viewing the data for a VideoObject.
Unfortunately, the duration and uploadDate values in the aresMediaMetadata data are given as numbers in seconds, whereas the schema needs valid ISO 8601 values. For the duration, an ISO 8601 value is already available at model.versions[0].durationISO8601.
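To illustrate the conversion mentioned above, here is a minimal sketch of turning a duration in seconds into an ISO 8601 duration and a numeric timestamp into an ISO 8601 date. The helper names secondsToIsoDuration and timestampToIsoDate are hypothetical, and treating the timestamp as Unix time in milliseconds is an assumption about the data. Where durationISO8601 is present, reading it directly would be simpler.

```javascript
// Hypothetical helper: whole seconds -> ISO 8601 duration, e.g. 75 -> PT1M15S.
const secondsToIsoDuration = totalSeconds => {
  if (!Number.isFinite(totalSeconds) || totalSeconds < 0) {
    return null;
  }
  const whole = Math.floor(totalSeconds);
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  // Always emit seconds when there are no hours or minutes, so 0 gives PT0S.
  const showSeconds = seconds > 0 || (hours === 0 && minutes === 0);
  return [
    'PT',
    hours ? `${hours}H` : '',
    minutes ? `${minutes}M` : '',
    showSeconds ? `${seconds}S` : '',
  ].join('');
};

// Hypothetical helper: ASSUMES the input is Unix time in milliseconds.
const timestampToIsoDate = timestampMs => {
  const date = new Date(timestampMs);
  return typeof timestampMs === 'number' && !Number.isNaN(date.getTime())
    ? date.toISOString()
    : null;
};

console.log(secondsToIsoDuration(75)); // PT1M15S
console.log(timestampToIsoDate(0)); // 1970-01-01T00:00:00.000Z
```

Both helpers return null for unusable input, which matches the null defaults used with pathOr in the snippet above.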
Additional properties such as contentUrl and embedUrl would also need to be added. For these, we would need to pass the value of the embedSource generated here down to the Metadata component as a prop.
Thanks for the investigation @rhenshaw56. The approach and pseudocode look reasonable to me, though I think we'd want to break the mediaPlayerMetadata function down into smaller units during implementation.
One question - if we output separate AudioObjects/VideoObjects does the validator still understand that these are all part of the same Article? i.e. is a hierarchical relationship between them still preserved? Is this how CNN/others handle it?
@jamesdonoh Yes, a hierarchical relationship between them is still preserved, and the validator understands how to output and distinguish a list of various AudioObjects/VideoObjects.
This is a good investigation, @rhenshaw56.
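Can I suggest we:
- Create a separate function that serves the purpose of decorating the video object with these properties. This would return a valid Schema that can be tested. An example of a similar function can be found here.
- Use Ramda path where the default is to return null/undefined.

@jamesdonoh Is this how CNN/others handle it?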
CNN
<script type="application/ld+json"name="metaScript">{
"@context":"https://schema.org",
"@type":"VideoObject",
"name":"CNN asks Zelensky about investigation claims",
"description":"Ukrainian President Volodymyr Zelensky weighed in on claims that he was <a href="http://www.cnn.com/2019/11/19/politics/volodymyr-zelensky-burisma-probe-intl/index.html" target="_blank">ready to announce an investigation into Burisma Holdings,</a> a Ukrainian energy company linked to the son of former Vice President Joe Biden, following a phone call with President Donald Trump.",
"thumbnailURL":"https://cdn.cnn.com/cnnnext/dam/assets/191001131044-02-zelensky-1001-large-169.jpg",
"image":"https://cdn.cnn.com/cnnnext/dam/assets/191001131044-02-zelensky-1001-large-169.jpg",
"duration":"PT1M15S",
"uploadDate":"2019-11-19T14:16:31Z",
"contentUrl":"https://edition.cnn.com/videos/politics/2019/11/19/volodymyr-zelensky-burisma-investigation-allegation-donald-trump-impeachment-hearings-pleitgen-liveshot-intl-ldn-vpx.cnn",
"url":"https://edition.cnn.com/videos/politics/2019/11/19/volodymyr-zelensky-burisma-investigation-allegation-donald-trump-impeachment-hearings-pleitgen-liveshot-intl-ldn-vpx.cnn",
"embedUrl":"https://fave.api.cnn.io/v1/fav/?video=politics/2019/11/19/volodymyr-zelensky-burisma-investigation-allegation-donald-trump-impeachment-hearings-pleitgen-liveshot-intl-ldn-vpx.cnn&customer=cnn&edition=international&env=prod"
}</script>
Do we need to include the expires property as well?
@HarveyPeachey I think we would need @lizcameron's input on that
Create a separate function that serves the purpose of decorating the video object with these properties. This would return a valid Schema that can be tested. An example of a similar function can be found here.
@simonsinclair sorry, I don't understand what this is doing, could you maybe put it in code in the context of the metadata usage.
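One possible reading of the decorator suggestion, sketched in the context of the metadata helper: a small pure function takes the metadata object and returns a copy with the extra property added, so it can be unit tested on its own. The function name withEmbedUrl, its parameters, and the example URL are hypothetical, and mapping embedSource to embedUrl is an assumption based on the discussion above rather than confirmed behaviour.

```javascript
// Hypothetical decorator: returns a new metadata object with embedUrl added.
// Nothing is added when embedSource is missing, and the input is not mutated.
const withEmbedUrl = (metadata, embedSource) =>
  embedSource ? { ...metadata, embedUrl: embedSource } : { ...metadata };

const base = {
  '@context': 'http://schema.org',
  '@type': 'VideoObject',
  name: 'Hello',
};

// The embed URL below is a placeholder, not a real endpoint.
const decorated = withEmbedUrl(base, 'https://example.com/embed/abc');
console.log(JSON.stringify(decorated, null, 2));
```

Keeping the decoration separate from the block parsing means each step can be tested with plain objects, without rendering the Metadata component.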
Use Ramda path where the default is to return null/undefined
Also regarding this, I believe we use R.pathOr extensively in simorgh, so it's fine as it is. We would prefer to force null values rather than the undefined that R.path returns when the specified property is not on the object (or a mixture of both).
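The difference being discussed can be seen in a small sketch. To keep it self-contained, path and pathOr below are simplified local stand-ins written for this example, not Ramda's implementations; in simorgh they would be imported from ramda.

```javascript
// Simplified stand-ins for R.path and R.pathOr (assumption: illustration only).
const path = (keys, obj) =>
  keys.reduce((acc, key) => (acc == null ? undefined : acc[key]), obj);

const pathOr = (fallback, keys, obj) => {
  const value = path(keys, obj);
  return value == null ? fallback : value;
};

const block = { model: { versions: [{ duration: 75 }] } };

console.log(path(['model', 'versions', 0, 'duration'], block)); // 75
console.log(path(['model', 'title'], block)); // undefined
console.log(pathOr(null, ['model', 'title'], block)); // null

// JSON.stringify keeps null values but drops undefined ones.
console.log(JSON.stringify({ name: null })); // {"name":null}
console.log(JSON.stringify({ name: undefined })); // {}
```

With pathOr and a null default, every missing property serialises the same way, which is the consistency argument made above.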
Thanks for the reviews @simonsinclair @HarveyPeachey - I think we can worry about low-level details like which library functions to use when we do the implementation.
@rhenshaw56 please chase up Harvey's comment about expires and add this and Simon's suggestion about a decorator function as notes on the implementation ticket. Am closing this for now as the investigation is done.
May I suggest we do something like below to avoid having multiple <script type="application/ld+json"> tags on the page? I think this would look a lot tidier, and I don't think it would require too much manipulation of the current metadata helper.
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@graph":
[
{
"@type": "VideoObject",
"description": "foo",
"name": "Hello",
"thumbnailUrl": "http://foo.com/img.png",
"uploadDate": "2019-08-08"
},
{
"@type": "VideoObject",
"description": "foo",
"name": "Hello",
"thumbnailUrl": "http://foo.com/img.png",
"uploadDate": "2019-08-08"
},
{
"@type": "AudioObject",
"description": "foo",
"name": "Hello",
"thumbnailUrl": "http://foo.com/img.png",
"uploadDate": "2019-08-08"
}
]
}
</script>
This is valid markup according to Google's Structured Data Testing Tool
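As a rough sketch of how a list of metadata objects could be combined into one @graph document like the one above: the function name toGraph is hypothetical, and it assumes the metadata objects are available together as an array, which may not match how the Metadata component currently receives them.

```javascript
// Hypothetical helper: wraps several schema.org objects in a single @graph.
// The per-object '@context' is dropped because the wrapper declares it once.
const toGraph = metadataObjects => ({
  '@context': 'http://schema.org',
  '@graph': metadataObjects.map(({ '@context': _context, ...rest }) => rest),
});

const graph = toGraph([
  { '@context': 'http://schema.org', '@type': 'VideoObject', name: 'Hello' },
  { '@context': 'http://schema.org', '@type': 'AudioObject', name: 'Hello' },
]);

// This string would be the body of one <script type="application/ld+json"> tag.
console.log(JSON.stringify(graph, null, 2));
```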
Thanks @12, this looks good, but it assumes that a list of Audio and Video metadata objects would be passed into the Metadata component. That would be ideal when a page has multiple pieces of Audio/Video content, but it is not how things work at the moment: the Metadata used in the media player is rendered once for each piece of Audio/Video content on a page, each with its own aresMediaMetadata object. I haven't actually seen a case where we have a list of media metadata all in one place.
Is your feature request related to a problem? Please describe.
With the introduction of AV to the article page, we want to introduce video and audio-specific schema.org data in an attempt to make the metadata of these pages richer. This involves including VideoObject and AudioObject schema for article pages.
Describe the solution you'd like
An article page with a media block should contain schema.org metadata that specifies the audio and/or video objects present on the page. If possible, we would like to distinguish between audio objects and video objects.
See Jira ticket/ask @lizcameron for more examples of the correct implementation of video object structured data.
Video object detail:
Audio object detail:
Following investigation, create issues to implement schema.org for video and audio player.
We will also want to identify testing requirements and any technical issues that we'll need to overcome to be able to implement AV structured data as described within this issue.
Off the back of this investigation, a developer will sit with business and a tester to discuss an appropriate way forward.