cta-wave / mezzanine

This repo contains scripts that will build annotated test content from specific source content, compatible with the WAVE device playback test suite.
BSD 3-Clause "New" or "Revised" License

do red triangles work for automated detection of the edge of the content? #23

Open jpiesing opened 3 years ago

jpiesing commented 3 years ago

The red triangles were put in the mezzanine content for detecting the edge of the content. In one discussion it was suggested that the red triangles were not suitable for automated detection of the content edge. I don't remember an alternative being proposed, but I was thinking about a pattern with stripes in a series of colours: horizontal stripes for the top and bottom edges and vertical stripes for the left and right edges (see the sketch below). The observation for this is not in the phase 1 contract for the observation framework, but we shouldn't forget it.
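A minimal sketch of what generating such a striped edge pattern could look like, assuming numpy and Pillow; the resolution, stripe thickness, and colour sequence are placeholder choices, not an agreed design:

```python
# Illustrative only: generate a frame with coloured edge stripes.
# Resolution, stripe thickness, and colours are assumed placeholder values.
import numpy as np
from PIL import Image

WIDTH, HEIGHT = 1920, 1080
STRIPE = 4  # stripe thickness in pixels (assumed)
COLOURS = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)]  # assumed sequence

frame = np.zeros((HEIGHT, WIDTH, 3), dtype=np.uint8)

for i, colour in enumerate(COLOURS):
    # Horizontal stripes along the top and bottom edges.
    frame[i * STRIPE:(i + 1) * STRIPE, :] = colour
    frame[HEIGHT - (i + 1) * STRIPE:HEIGHT - i * STRIPE, :] = colour
    # Vertical stripes along the left and right edges
    # (these overwrite the horizontal stripes in the corners).
    frame[:, i * STRIPE:(i + 1) * STRIPE] = colour
    frame[:, WIDTH - (i + 1) * STRIPE:WIDTH - i * STRIPE] = colour

Image.fromarray(frame).save("striped_edge_marker.png")
```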

nicholas-fr commented 3 years ago

I'm currently evaluating whether we could use a 1px border around the edge of the video for automating edge detection, in white to avoid issues with colour subsampling. No conclusion yet. Suggestions are welcome.


nicholas-fr commented 3 years ago

As part of this ongoing investigation, the latest update for automating edge detection includes a 2px border (outer 1px white and inner 1px black) around the mezzanine content.

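To make the detection side concrete, here is a rough sketch of how the bright outer border could be located in a captured frame, assuming OpenCV and that the 1px white border survives scaling and compression well enough to threshold. The threshold value and filename are assumptions, and this is not the actual observation framework code:

```python
# Rough sketch: locate the content edges via the bright outer border.
# Assumes a BGR capture; the threshold of 200 is an assumed cut-off.
import cv2

def find_content_edges(frame_bgr):
    """Return (x, y, w, h) of the largest near-white contour,
    taken to be the outer 1px white border, or None if not found."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    return cv2.boundingRect(largest)

frame = cv2.imread("captured_frame.png")  # hypothetical capture file
if frame is not None:
    print(find_content_edges(frame))
```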

andyburras commented 3 years ago

Thoughts are:

andyburras commented 2 years ago

We've been giving this topic more thought and discussion which has opened something of "a can of worms".

The DPCTF spec has multiple occurrences of:

"Every video frame S[k,s] shall be rendered such that it fills the entire video output window..."

Also:

5.2.2 "With setting up a source buffer for a video media type, either

  • a pre-determined display video output that matches the aspect ratio and size of the content (height and width) is established, or
  • a full-screen display is established"_

6.5.1 "A source buffer is expected to be established for each media type individually by [...] Create a proper output environment for each established source buffer a. For video a pre-determined display window that matches the aspect ratio and either i. default to the size of the content (height and width) of the CMAF Principal Header is established, ii. or as an option a full screen mode may be used as well."

So a device is expected to create a display window that matches the video's aspect ratio and optionally may use a full-screen mode. Test Runner always uses the latter option and sets the display to full screen. Not using the full-screen case would exacerbate the difficulties of getting good captures on small screen devices.

However for testing purposes, considering real-world devices has raised a number of issues:

  • Mobile devices are not necessarily 16:9, e.g. phones exist with 20:9, 19.5:9, 18.5:9, 18:9, etc., hence the display window in full-screen mode will not necessarily map directly to the whole screen.
  • Mobile phones may take device-specific decisions on how best to size videos to their various screen aspect ratios. They may size to fit, e.g. the whole of the vertical, and leave space on the horizontal. Or they may choose by default to "zoom" and crop some of the vertical so that the video fills a greater portion of the display. This zoom mode may be configurable; if so, testers could be instructed to disable it. However, there is the question as to whether it is worth testing a device with a setting that the majority of users will probably never select. N.B. there isn't a specific requirement in the DPCTF specification that the whole of the "display window" is actually viewable on screen... I'm not clear on whether this is the intention or not.
  • For TV screens, ITU-R BT.1848-1 and SMPTE ST 2046-1 both define safe areas for TVs, e.g. ST 2046-1 defines the Safe Action Area as 93% of the width and 93% of the height of the Production Aperture. Assumptions shouldn't be made about what is viewable on the TV display outside these areas. Some regions/standards bodies may define stricter requirements in their TV specifications but these are not necessarily universally adopted.

The upshot is that it is dangerous to make any assumptions about exactly what will be visible of the video at the edges. Also, any testing may have to be aimed at the majority of devices... novel designs may not be amenable to testing.

One idea is to use video along the lines of the following mock-up (N.B. this is just a mock-up; so far we've done no prototyping to test the viability of this approach):

[mock-up image: scaling-detection001]

The observation software will then need to apply techniques to isolate the scale bars and determine how much is viewable of the vertical and horizontal, whether the markers are evenly spaced, etc. Results could then be checked against threshold values. Note that this would not be able to detect whether the video extended to the edges of the display.
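As a sketch of the kind of check the observation software might then run, assuming the marker positions along a scale bar have already been isolated from the capture (that extraction is the hard part and is not shown), and with an assumed tolerance rather than an agreed threshold:

```python
# Sketch of the evenness and visibility checks only; marker extraction
# from the capture is assumed to have happened already.
import numpy as np

def markers_evenly_spaced(positions, rel_tolerance=0.05):
    """True if consecutive gaps deviate from the mean gap by less than
    rel_tolerance (assumed threshold)."""
    gaps = np.diff(np.sort(np.asarray(positions, dtype=float)))
    if gaps.size == 0:
        return False
    mean_gap = gaps.mean()
    return bool(np.all(np.abs(gaps - mean_gap) <= rel_tolerance * mean_gap))

def fraction_visible(visible_count, total_count):
    """Fraction of the scale bar's markers visible in the capture."""
    return visible_count / total_count

# Example: 5 markers detected, roughly evenly spaced; 1 of 10 cropped off.
print(markers_evenly_spaced([12, 108, 204, 301, 396]))  # True
print(fraction_visible(9, 10))                          # 0.9
```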

This warrants further discussion to understand what may and may not be testable.

jpiesing commented 2 years ago

There are a number of assumptions in the comment which I think are mistaken ...

> So a device is expected to create a display window that matches the video's aspect ratio and optionally may use a full-screen mode. Test Runner always uses the latter option and sets the display to full screen. Not using the full-screen case would exacerbate the difficulties of getting good captures on small screen devices.

We have one specific test for where full screen mode is to be selected, 8.11 "Full Screen Playback of Switching Sets". For all other tests, an HTML5 video object needs to be used that is not full screen. If a full screen mode is used all the time then this is a bug in the tests.

The HTML5 video object should not be the size of the video but should be the largest possible size that fits entirely on the device while still having the same aspect ratio as the video.
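For illustration, that sizing rule amounts to the following arithmetic (the function and values here are hypothetical, not taken from the test suite):

```python
# Largest window that fits entirely on the display while preserving
# the video's aspect ratio (hypothetical illustration).
def largest_fit(display_w, display_h, video_w, video_h):
    scale = min(display_w / video_w, display_h / video_h)
    return round(video_w * scale), round(video_h * scale)

# 16:9 video on a 20:9 phone screen: fills the height, leaves side bars.
print(largest_fit(2400, 1080, 1920, 1080))  # (1920, 1080)
# 16:9 video on a 16:10 laptop panel: fills the width, letterboxed.
print(largest_fit(1280, 800, 1920, 1080))   # (1280, 720)
```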

> Mobile devices are not necessarily 16:9, e.g. phones exist with 20:9, 19.5:9, 18.5:9, 18:9, etc., hence the display window in full-screen mode will not necessarily map directly to the whole screen.

TV sets with 21:9 also exist. For all tests other than 8.11 (please see above), the HTML5 video object needs to be as large as will fit entirely on the device while preserving the aspect ratio.

> Mobile phones may take device-specific decisions on how best to size videos to their various screen aspect ratios. They may size to fit, e.g. the whole of the vertical, and leave space on the horizontal. Or they may choose by default to "zoom" and crop some of the vertical so that the video fills a greater portion of the display. This zoom mode may be configurable; if so, testers could be instructed to disable it. However, there is the question as to whether it is worth testing a device with a setting that the majority of users will probably never select. N.B. there isn't a specific requirement in the DPCTF specification that the whole of the "display window" is actually viewable on screen... I'm not clear on whether this is the intention or not.

I would be surprised if this would happen if there is a 16:9 HTML video element.

> For TV screens, ITU-R BT.1848-1 and SMPTE ST 2046-1 both define safe areas for TVs, e.g. ST 2046-1 defines the Safe Action Area as 93% of the width and 93% of the height of the Production Aperture. Assumptions shouldn't be made about what is viewable on the TV display outside these areas. Some regions/standards bodies may define stricter requirements in their TV specifications but these are not necessarily universally adopted.

In other places, there were lengthy discussions that apparently the trend is for TVs to just display the full signal with no overscan at all.
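For reference, the ST 2046-1 figure quoted above works out as follows for a 1920x1080 production aperture (a worked example only, not part of any test definition):

```python
# Worked example of the 93% Safe Action Area for a 1920x1080 aperture.
w, h, safe = 1920, 1080, 0.93
safe_w, safe_h = w * safe, h * safe            # 1785.6 x 1004.4
margin_x, margin_y = (w - safe_w) / 2, (h - safe_h) / 2
print(safe_w, safe_h)      # 1785.6 1004.4
print(margin_x, margin_y)  # 67.2 37.8 pixels outside the safe area per side
```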

> The observation software will then need to apply techniques to isolate the scale bars and determine how much is viewable of the vertical and horizontal, whether the markers are evenly spaced, etc. Results could then be checked against threshold values. Note that this would not be able to detect whether the video extended to the edges of the display.

I wonder if the observation discussed here is too complex for full screen mode and should just be omitted or deferred.

jpiesing commented 2 years ago

See also https://github.com/cta-wave/device-playback-task-force/issues/75 about the definition of full screen.

andyburras commented 2 years ago

> The HTML5 video object should not be the size of the video but should be the largest possible size that fits entirely on the device while still having the same aspect ratio as the video.

That could be my misunderstanding. I had a recollection from a while ago that originally the output was in a "window" but this was altered to full-screen to improve the capture on small screen devices. @louaybassbouss could you confirm the behaviour?

> I would be surprised if this would happen if there is a 16:9 HTML video element.

Agreed. My concerns were with the full-screen mode case.

> I wonder if the observation discussed here is too complex for full screen mode and should just be omitted or deferred.

Yes, that may need to be the case for the 8.11 test; behaviour may be very device-dependent in full-screen mode.

Ignoring 8.11, the aim would be to be able to sufficiently measure the requirements:

  • "Every video frame S[k,s] shall be rendered such that it fills the entire video output window..."
  • "The rendering for each track is scaled to the height and width of the predetermined window."
  • "No visible shifts of objects in the video"
  • "No visible spatial offset of pixels in the video"

gitwjr commented 1 year ago

This was in the RFP but is now deferred for future work.