Test content media generation for DPCTF section 9 testing

andyburras commented 3 years ago

The DPCTF section 8 requirements are testing single-track media playback, so test content can be generated and accessed as a single CMAF/WAVE track referenced by a single MPD.

Section 9 requirements are testing the presentation of audio and video, including AV sync, so CMAF/WAVE tracks need to be combined.

Currently the test vector at http://dash.akamaized.net/WAVE/vectors/ is held as a set of media files described by a single MPD. To test combinations of video and audio, either:

A) a single MPD will need to be generated that references the AV tracks (and later subtitles) for every required combination. (Note: Some codecs may require specific commercial tools to produce the CMAF track and MPD. So for some AV combinations generating a single combined MPD may not be possible with any available tool. It may require a bespoke script that can merge MPDs into one).
B) or each media track will be described by a different MPD. A test database will need to describe the required combinations. The testing solution will need to reference multiple MPDs and combine the required media tracks at run-time.

There are also a number of procedural questions arising from section 9 testing:

* What combinations of video and audio will need to be tested? E.g. in CSTF there are 1 AVC and 6 HEVC variants. Would every audio variant need to be tested in combination with every video variant? Or would testing in combination with just AVC suffice? Or AVC and 1 HEVC variant? Are there combinations of AV that would not make sense to test in the real-world (e.g. would a high-end audio codec variant ever be used in combination with low resolution AVC)?
* In some cases, the media proponents for the audio and the video codecs will be different. The proponents will be providing the individual CMAF/WAVE media tracks. Who is responsible for generating the combinations required by (A) or (B)?
* What will be the procedure when a new codec arrives e.g. AV1?
* What will be the procedure if in the future it is determined that the mezzanine content needs to change? The regeneration of the test media will hopefully be scripted and semi-automated, but how will the validation of the new test content be done?

jpiesing commented 3 years ago

@RodolpheFouquet @louaybassbouss

Here are some thoughts:

There are also a number of procedural questions arising from section 9 testing:

* What combinations of video and audio will need to be tested? E.g. in CSTF there are 1 AVC and 6 HEVC variants. Would every audio variant need to be tested in combination with every video variant? Or would testing in combination with just AVC suffice? Or AVC and 1 HEVC variant? Are there combinations of AV that would not make sense to test in the real-world (e.g. would a high-end audio codec variant ever be used in combination with low resolution AVC)?

I would expect every media profile to be tested with 8.2 to 8.14 and 9.2 to 9.4 all inclusive. The simplest answer would be to test each video media profile with HE-AAC audio and each audio profile with AVC. What is not needed is testing all resolutions and content options. It might even be sufficient to use CMAF Switching Sets with one video track and one audio track. For HE-AAC, E-AC-3 and AC-4, why not just use 1080p25 or 1080p30 AVC?

* In some cases, the media proponents for the audio and the video codecs will be different. The proponents will be providing the individual CMAF/WAVE media tracks. Who is responsible for generating the combinations required by (A) or (B)?

Do the combinations actually need generating at all? @louaybassbouss Will the JavaScript code for 9.2-9.4 use one MPD with both video and audio inside it or one MPD for video and one MPD for audio?

* What will be the procedure when a new codec arrives e.g. AV1?

Use an existing HE-AAC MPD and media segments for 9.2-9.4?

* What will be the procedure if in the future it is determined that the mezzanine content needs to change? The regeneration of the test media will hopefully be scripted and semi-automated, but how will the validation of the new test content be done?

I would hope that the CMAF validation in the DASH-IF validator will be made usable in workflows and not just through a UI.

louaybassbouss commented 3 years ago

Agree with @jpiesing comments

* In some cases, the media proponents for the audio and the video codecs will be different. The proponents will be providing the individual CMAF/WAVE media tracks. Who is responsible for generating the combinations required by (A) or (B)?
Do the combinations actually need generating at all? @louaybassbouss Will the JavaScript code for 9.2-9.4 use one MPD with both video and audio inside it or one MPD for video and one MPD for audio?

@jpiesing you need to pass 2 MPDs, one for video and one for audio (see sample.csv). If a test requires only video, you can leave the audio MPD empty. But if you have audio and video in the same MPD, you can pass the same MPD URL for both in sample.csv and it will work as well: the tests only consider the video AdaptationSets/Representations from the video MPD and ignore any audio, and similarly, for the audio MPD the video part is ignored.
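
To illustrate why passing the same combined MPD twice can work, here is a minimal Python sketch of selecting only the AdaptationSets of one content type. It is not the test runner's code (which is JavaScript); the function name `adaptation_sets` and the tiny inline MPD are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

DASH_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}


def adaptation_sets(mpd_xml: str, content_type: str) -> list[ET.Element]:
    """Return the AdaptationSets whose contentType or mimeType matches content_type."""
    root = ET.fromstring(mpd_xml)
    selected = []
    for aset in root.iterfind(".//mpd:AdaptationSet", DASH_NS):
        declared = aset.get("contentType") or aset.get("mimeType", "")
        if not declared:
            # mimeType may be declared on the Representation instead.
            rep = aset.find("mpd:Representation", DASH_NS)
            declared = rep.get("mimeType", "") if rep is not None else ""
        # "video/mp4" and "video" both reduce to "video".
        if declared.split("/")[0] == content_type:
            selected.append(aset)
    return selected


# Hypothetical combined MPD, passed as both the "video MPD" and the "audio MPD".
COMBINED_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet id="1" contentType="video"/>
    <AdaptationSet id="2" mimeType="audio/mp4"/>
  </Period>
</MPD>"""

video_sets = adaptation_sets(COMBINED_MPD, "video")  # audio part ignored
audio_sets = adaptation_sets(COMBINED_MPD, "audio")  # video part ignored
```

With this kind of filtering, the same document yields only the video AdaptationSet in the video role and only the audio AdaptationSet in the audio role.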

rbouqueau commented 3 years ago

(Note: Some codecs may require specific commercial tools to produce the CMAF track and MPD. So for some AV combinations generating a single combined MPD may not be possible with any available tool. It may require a bespoke script that can merge MPDs into one)

@andyburras Do you have examples?

The testing solution will need to reference multiple MPDs and combine the required media tracks at run-time.

@andyburras That's a good point I thought about. We can either duplicate the content or put the cmaf tracks under BaseURLs. Do you see any limitation with this approach?

louaybassbouss commented 3 years ago

The testing solution will need to reference multiple MPDs and combine the required media tracks at run-time.

@andyburras That's a good point I thought about. We can either duplicate the content or put the cmaf tracks under BaseURLs. Do you see any limitation with this approach?

@rbouqueau the tests accept one video MPD and one audio MPD. If you have multiple video MPDs, then we need to combine them via script in a single MPD. Alternatively you can use a single MPD with multiple periods.
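
As a rough illustration of the multi-period alternative, the following Python sketch appends the Periods of several MPDs to the first one. It is a sketch under stated assumptions, not the project's script: the function name `merge_periods` and the two inline MPDs are hypothetical, and it deliberately ignores Period start/duration attributes, `mediaPresentationDuration` and BaseURL resolution, all of which a real spliced MPD would need to handle.

```python
import xml.etree.ElementTree as ET

NS = "urn:mpeg:dash:schema:mpd:2011"
PERIOD = f"{{{NS}}}Period"
ET.register_namespace("", NS)  # serialise without an "ns0:" prefix


def merge_periods(mpd_documents: list[str]) -> str:
    """Insert the Periods of every later MPD after the last Period of the first MPD."""
    merged = ET.fromstring(mpd_documents[0])
    # Insert right after the last existing Period to keep Periods contiguous.
    insert_at = max(i for i, child in enumerate(merged) if child.tag == PERIOD) + 1
    for doc in mpd_documents[1:]:
        for period in ET.fromstring(doc).iterfind(PERIOD):
            merged.insert(insert_at, period)
            insert_at += 1
    # Give every Period a unique id so the Periods can be told apart.
    for index, period in enumerate(merged.iterfind(PERIOD)):
        period.set("id", f"p{index}")
    return ET.tostring(merged, encoding="unicode")


# Two hypothetical single-period video MPDs.
MPD_A = f'<MPD xmlns="{NS}"><Period><AdaptationSet id="1"/></Period></MPD>'
MPD_B = f'<MPD xmlns="{NS}"><Period><AdaptationSet id="2"/></Period></MPD>'

combined = merge_periods([MPD_A, MPD_B])
```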

rbouqueau commented 3 years ago

This is an important discussion because it touches on the architecture of the test-generation script that I'm working on.

The direction I'm taking right now is to generate one encoding, then several CMAFs per encoding, then several MPDs per CMAF combination. This assumes that each step may re-use the result of the previous step in several ways. Let me know if that doesn't make sense.

It means the overall script shall take as an input:

1. A table linking the mezzanine input with some encoding parameters.
2. Some CMAF packagings based on 1.'s output streams.
3. Some MPD generation based on a set of 2.'s output streams.
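
The three inputs above could be modelled roughly as follows. This is only a sketch of the data shape, not the actual script: the class names, field names and example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encoding:
    """Step 1: one row linking a mezzanine input to encoding parameters."""
    stream_id: str
    mezzanine: str
    params: str


@dataclass(frozen=True)
class CmafPackaging:
    """Step 2: a CMAF packaging built from one or more step-1 streams."""
    name: str
    encoding_ids: tuple[str, ...]


@dataclass(frozen=True)
class MpdSpec:
    """Step 3: an MPD generated from a set of step-2 packagings."""
    name: str
    packaging_names: tuple[str, ...]


def dangling_references(encodings, packagings, mpds) -> list[str]:
    """List every reference to a previous step that does not exist."""
    known_encodings = {e.stream_id for e in encodings}
    known_packagings = {p.name for p in packagings}
    problems = []
    for p in packagings:
        for sid in p.encoding_ids:
            if sid not in known_encodings:
                problems.append(f"packaging {p.name}: unknown encoding {sid}")
    for m in mpds:
        for name in m.packaging_names:
            if name not in known_packagings:
                problems.append(f"MPD {m.name}: unknown packaging {name}")
    return problems


# Hypothetical example: one encoding re-used by one packaging and one MPD.
encodings = [Encoding("1", "mezzanine_a.mp4", "example-params")]
packagings = [CmafPackaging("t1", ("1",))]
mpds = [MpdSpec("t1_mpd", ("t1",))]
problems = dangling_references(encodings, packagings, mpds)
```

Keeping each step as its own table makes the re-use explicit: several packagings can name the same encoding, and several MPDs can name the same packaging.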

Do you think it covers all the use-cases?

jpiesing commented 3 years ago

The only MPD combining I can see is for the tests for splice conditioned content, 8.8, 8.14 and 9.4. As @louaybassbouss explained, there is no need to combine video and audio into the same MPD.

@rbouqueau Can you expand a little more on 'several CMAFs per encoding' and 'several MPDs per CMAF'? Is this just to have a generic solution or did you have specific use-cases in mind? So far (and in my instructions) I was assuming 1 encoding = 1 CMAF = 1 MPD. I may have missed something.

jpiesing commented 3 years ago

(Note: Some codecs may require specific commercial tools to produce the CMAF track and MPD. So for some AV combinations generating a single combined MPD may not be possible with any available tool. It may require a bespoke script that can merge MPDs into one)

@andyburras Do you have examples?

@andyburras given @louaybassbouss 's explanation that the test runner takes 1 video MPD and 1 audio MPD, would merging video and audio into the same MPD ever be needed?

The testing solution will need to reference multiple MPDs and combine the required media tracks at run-time.

@andyburras That's a good point I thought about. We can either duplicate the content or put the cmaf tracks under BaseURLs. Do you see any limitation with this approach?

Sorry but I really don't see the point that's being made here. For example, there's a CMAF AVC video switching set with one track (say 1080p25) and a CMAF audio switching set with one track (say E-AC-3). References to those are put into @louaybassbouss 's .csv file which then generates an HTML page including them. When loaded in a browser, that page will read each MPD, create MSE SourceBuffers (etc) and populate each SourceBuffer with media segments from the corresponding MPD.

andyburras commented 3 years ago

explanation that the test runner takes 1 video MPD and 1 audio MPD,

Yes, I think that means it's effectively doing option (B), which simplifies the test content generation considerably, and should remove the need for combining MPDs for audio and video content. As you point out, that then just requires splicing of content to be handled.

rbouqueau commented 3 years ago

@rbouqueau Can you expand a little more on 'several CMAFs per encoding' and 'several MPDs per CMAF'? Is this just to have a generic solution or did you have specific use-cases in mind? So far (and in my instructions) I was assuming 1 encoding = 1 CMAF = 1 MPD. I may have missed something.

Ah ah if someone missed something it is probably me ;) Please correct me when I'm wrong.

So for each element of the row 2) we have "1 encoding = 1 CMAF = 1 MPD" to generate one CFHD/AVC switching set with a single video track.

In addition we have to generate the special "switching set 1" (previously X or X1), which is an aggregation of row 2 (with stream ids >= 20) (although there is also stream id 1 included but the rationale is that we are trying to minimize the number of streams) + some audio.

Is it correct and complete?

@andyburras given @louaybassbouss 's explanation that the test runner takes 1 video MPD and 1 audio MPD, would merging video and audio into the same MPD ever be needed?

I understand from your last messages that recombining MPDs is not the preferred option. FYI in case it may help in the future the existing script already manipulates the XML/MPD.

louaybassbouss commented 3 years ago

@andyburras given @louaybassbouss 's explanation that the test runner takes 1 video MPD and 1 audio MPD, would merging video and audio into the same MPD ever be needed?

I understand from your last messages that recombining MPDs is not the preferred option. FYI in case it may help in the future the existing script already manipulates the XML/MPD.

@rbouqueau we need one MPD for video and one MPD for audio; there is no need to recombine them. But if you have multiple MPDs for video (e.g. for the tests for splice conditioned content, 8.8, 8.14 and 9.4) then we need to combine them into a single video MPD. In this case, the script that manipulates the XML/MPD can be helpful.

jpiesing commented 3 years ago

So for each element of the row 2) we have "1 encoding = 1 CMAF = 1 MPD" to generate one CFHD/AVC switching set with a single video track.

@rbouqueau Yes for streams <20 but I believe the streams >20 are only used in the switching set test.

In addition we have to generate the special "switching set 1" (previously X or X1), which is an aggregation of row 2 (with stream ids >= 20) (although there is also stream id 1 included but the rationale is that we are trying to minimize the number of streams) + some audio.

Yes.

jpiesing commented 3 years ago

What is left to do for this one? For 9.2, Regular Playback of a CMAF Presentation, we can use

dash.akamaized.net/WAVE/vectors/avc_sets/15_30_60/t1/

The same can be used for 9.3, Random Access of a WAVE Presentation.

For 9.4, Splicing of WAVE Program with Baseline Constraints, we can use the same content as 8.8 assuming the MPD for that includes both video and audio Adaptation Sets.

Is there anything else?

rbouqueau commented 3 years ago

If we consider that this issue covers 15/30/60, then I think it is complete.

jpiesing commented 3 years ago

@andyburras As the original proposer, please can you review this discussion & see if anything is still open. If nothing is open, please close the issue. If anything is still open, please summarise so that we can address it & move forwards.

andyburras commented 3 years ago

I think things are much clearer now. To summarise:

* 9.x Presentation tests may either reference separate MPDs for video and audio, or may reference the same MPD for both.
* Media proponents would be responsible for either scripts or external procedures to generate their codec variants, CMAF packaged with MPDs suitable for the existing 8.2 to 8.14 and 9.2 to 9.4 tests. They would also add these to the "master" .csv file.
* For 9.x tests, each video media profile would be tested with HE-AAC audio, and each audio media profile with AVC.
* Not all resolutions and content options are required to be tested. E.g. for HE-AAC, E-AC-3 and AC-4, testing with 1080p25 or 1080p30 AVC would suffice.
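
As a small illustration of how much this pairing rule reduces the test matrix compared with testing every combination, here is a Python sketch. The function name `av_pairings` is hypothetical, and the profile lists are just labels taken from this discussion, not the full set of WAVE media profiles.

```python
def av_pairings(video_profiles, audio_profiles,
                ref_video="AVC", ref_audio="HE-AAC"):
    """Pair each video profile with the reference audio and each
    audio profile with the reference video, without duplicates."""
    pairs = [(video, ref_audio) for video in video_profiles]
    for audio in audio_profiles:
        pair = (ref_video, audio)
        if pair not in pairs:  # (AVC, HE-AAC) is already covered above
            pairs.append(pair)
    return pairs


# Illustrative labels only.
video_profiles = ["AVC", "HEVC"]
audio_profiles = ["HE-AAC", "E-AC-3", "AC-4"]

pairs = av_pairings(video_profiles, audio_profiles)
full_matrix_size = len(video_profiles) * len(audio_profiles)
```

For these two video and three audio labels the rule gives 4 pairings instead of the 6 in the full matrix, and the gap widens as more variants are added.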

There are a couple of procedural queries going forwards, but maybe these belong better within the CSTF/DPCTF groups rather than the Test Content Generation github?

* If new section 8 and 9 tests are added that have new test content requirements, and this requires amended generation scripts/procedures, how will this be handled?
* What will be the procedure if in the future it is determined that the mezzanine content needs to change?

jpiesing commented 3 years ago

I think things are much clearer now. To summarise:

* 9.x Presentation tests may either reference separate MPDs for video and audio, or may reference the same MPD for both.

Agree.

* Media proponents would be responsible for either scripts or external procedures to generate their codec variants, CMAF packaged with MPDs suitable for the existing 8.2 to 8.14 and 9.2 to 9.4 tests. Plus adding these to the "master" .csv file.

Yes.

* For 9.x tests, each video media profile would be tested with HE-AAC audio, and each audio media profile with AVC.

Yes, although there could be exceptions if an advanced media profile was never used in the real world with HE-AAC or AVC.

* Not all resolutions and content options are required to be tested. E.g. for HE-AAC, E-AC-3 and AC-4 then testing with 1080p25 or 1080p30 AVC would suffice.

Yes.

There are a couple of procedural queries going forwards, but maybe these belong better within the CSTF/DPCTF groups rather than the Test Content Generation github?
* if new section 8 and 9 tests are added that have new test content requirements, and this requires amended generation scripts/procedures, how will this be handled?

Someone makes a pull request against the existing scripts which is then reviewed by other people?

* What will be the procedure if in the future it is determined that the mezzanine content needs to change?

An issue is raised in the mezzanine content project which is resolved by 1) a pull request to change the mezzanine content scripts and then 2) the updated scripts being run & the resulting content being uploaded.

@andyburras Do you agree with my thoughts above?

andyburras commented 3 years ago

Someone makes a pull request against the existing scripts which is then reviewed by other people?
... the updated scripts being run & the resulting content being uploaded.

Is the expectation that media profile proponents will need to track any such changes going forwards, and make the necessary amendments to their scripts/procedures and test content?

jpiesing commented 3 years ago

Someone makes a pull request against the existing scripts which is then reviewed by other people?
... the updated scripts being run & the resulting content being uploaded.
Is the expectation that media profile proponents will need to track any such changes going forwards, and make the necessary amendments to their scripts/procedures and test content?

Ideally the scripts would be contributed to github and use publicly available tools so anyone could track such changes going forwards. If the media profile proponent has to use tools that are only available to them and/or cannot contribute the scripts to github then a discussion would be needed.

andyburras commented 3 years ago

Thanks @jpiesing I think this is a lot clearer now. Closing.

cta-wave / Test-Content-Generation

Test content media generation for DPCTF section 9 testing #25