cta-wave / technical-working-group

Will WAVE address the issues that make it hard for HbbTV/ATSC to incorporate WMAS directly? #16

Closed bobcampbell-resillion closed 3 months ago

bobcampbell-resillion commented 3 years ago

Both HbbTV and ATSC 3.0 (NEXTGEN TV) testing would reference the WMAS tests. But

(a) there's no agreed way in which such regimes could include all the WMAS:YYYY tests, because no device today would pass them all,
(b) there is no "fixed and stable" test set, because they use W3C materials that are not versioned and could be updated/changed at any time,
(c) there isn't much formal review (as I understand it) of the tests themselves, or of their inclusion in the WMAS set. It's just all the tests that passed on the main browser stacks, whether they're good/useful or not.

This makes it hard for those programs to refer to the WAVE tests as is. Device development, QA, release and update cycles aren't as dynamic as those of web apps/browsers can be, so device conformance programs tend to like a "test set X.Y" that applies from date X until the sunset/sunrise of the next release. They like to be able to raise challenges, have buggy tests removed, etc. W3C won't do this. These issues are likely to trip up other programs (I imagine). WAVE could leave it to each organisation referencing this set to work out how they want to deal with these issues. Or WAVE might consider whether it can do anything to help address these issues in general...

Hint: if left to NEXTGEN TV, at least, inclusion of the tests in that program might not happen for some time, as working out how to solve these problems isn't a very high priority right now. Obviously this much impairs and delays the impact of the work WAVE has done.

jpiesing commented 3 years ago

(b) there is no "fixed and stable" test set, because they use W3C materials that are not versioned and could be updated/changed at any time,

WAVE has taken a snapshot of the WPT so this won't happen.

bobcampbell-resillion commented 3 years ago

I thought it was just the runner that had been 'snapshot' and modified, and the 'snapshot' was the APIs applicable in the spec, and hence which tests would be included, not the test implementations themselves?

The reason for thinking this is in the instructions page (https://github.com/cta-wave/WMAS)

Download test files according to WMAS2018 specification, call from WPT root directory:

$ ./wmats2018-subset.sh

https://github.com/cta-wave/WMAS/blob/master/wmats2018-subset.sh seems to get the latest test implementations from GitHub, which may be my misunderstanding of how that works?

Update: on reading the code more closely, and looking at which git commands are used, I see it is probably checking out a particular branch or commit ID - so would guess that does mean it is only getting them at some point in time.

I think all that could be better documented because it's not at all clear; I still don't think it means that there's a way HbbTV or NEXTGEN TV can incorporate WMAS:20XX in a formal certification regime "as is", so my other points are still unresolved...

JohnRiv commented 3 years ago

Regarding (b):

would guess that does mean it is only getting them at some point in time

Yes, that is how the WMATS is created

I think all that could be better documented because it's not at all clear;

Agreed. We could specifically call out the WPT git commit we fork from, along with the date of that commit.
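
As an illustration of what "calling out the commit and its date" could look like, here is a minimal Python sketch that reads the hash and committer date of the checked-out WPT snapshot so they can be pasted into the documentation. The function names and the dictionary keys are hypothetical; only the `git log -1 --format=%H%n%cI` invocation is standard git, and nothing here is taken from the actual WMAS scripts.

```python
import subprocess


def parse_snapshot(git_output):
    """Split the two-line output of `git log -1 --format=%H%n%cI`."""
    commit, commit_date = git_output.strip().splitlines()
    # Key names are illustrative, not an existing WMAS format.
    return {"wpt_commit": commit, "commit_date": commit_date}


def read_snapshot(repo_dir):
    """Return hash and committer date of HEAD in the given checkout."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "-1", "--format=%H%n%cI"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_snapshot(result.stdout)
```

Running `read_snapshot(".")` from the WPT root directory would return the values to record next to the WMATS release notes.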

jpiesing commented 3 years ago

Here's a version of what I said in the May 26 meeting.

  1. Start from the tests that pass on all desktop browsers
  2. Eliminate all manual tests by default
  3. Eliminate by default all tests changed in the last 12-18 months (tbc)
  4. Review which APIs would not have any tests included after the above & include a small number after careful consideration.
  5. Run what's left on a number of media consumption devices and select the intersection.
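
The five steps above can be sketched as a filter pipeline. This is only an illustration under stated assumptions: the `TestCase` record, the browser names, the 365-day default and the `reinstated` parameter (standing in for the human review in step 4) are all hypothetical and do not describe the real WMAS tooling.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed names for the desktop browsers; the thread does not list them.
DESKTOP_BROWSERS = frozenset({"chrome", "edge", "firefox", "safari"})


@dataclass(frozen=True)
class TestCase:
    path: str
    api: str
    manual: bool
    last_changed: date
    passing_browsers: frozenset
    passing_devices: frozenset


def select_subset(tests, devices, today, min_age_days=365, reinstated=frozenset()):
    # Step 1: keep tests that pass on all desktop browsers.
    pool = [t for t in tests if DESKTOP_BROWSERS <= t.passing_browsers]
    # Step 2: drop manual tests.
    pool = [t for t in pool if not t.manual]
    # Step 3: drop tests changed more recently than the cutoff.
    cutoff = today - timedelta(days=min_age_days)
    stable = [t for t in pool if t.last_changed <= cutoff]
    # Step 4: report APIs left without tests, then add back reviewed tests.
    uncovered = {t.api for t in tests} - {t.api for t in stable}
    stable += [t for t in tests if t.path in reinstated and t not in stable]
    # Step 5: keep the intersection that passes on every device.
    selected = [t for t in stable if devices <= t.passing_devices]
    return selected, uncovered
```

The `uncovered` set is returned alongside the selection so that the review in step 4 has a concrete list of APIs to look at.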

poolec commented 3 years ago

That all sounds OK up until step 5. If I understand correctly, you're suggesting that the subset generated from steps 1-4 would be run on some media consumption devices and only those that pass on all of them be taken forward?

That seems of very limited value to the aim of improving interoperability since it would tend to exclude any test that actually finds a real defect in media consumption devices, including only tests that are of very little value because they pass on everything anyway.

Granted, it would at least set a bar such that future devices should not be worse than the ones that were used in the step 5. But that's a very unambitious aim, given WAVE's objective to "drive interoperability in streaming media services between content and media playback devices".

It would certainly be informative to run a proposed subset of tests on a range of media consumption devices and take a look at the reasons for failures. I can imagine some of them might fail because the test is not appropriate to those devices but others will fail because of defects in those devices that should be rectified. If someone wants to exclude tests that get through steps 1-4, they should be required to justify why those tests should be excluded, not just have them excluded because they just happen to fail on one of their existing products.

bobcampbell-resillion commented 2 years ago

I think this is mainly about (a) and (c) now, and see also https://github.com/cta-wave/WMAS/issues/70

JohnRiv commented 2 years ago

We can discuss this at an upcoming call with all interested members

cta-source commented 2 years ago

Referring back to something @bobcampbell-eurofins wrote on 4/30 (top of this issue):

Hint: if left to NEXTGEN TV, at least, inclusion of the tests in that program might not happen for some time, as working out how to solve these problems isn't a very high priority right now. Obviously this much impairs and delays the impact of the work WAVE has done.

I want to make sure the context is clear, for NEXTGEN. As far as I can tell (Bob, check me?), the ATSC 3.0 Broadcaster App has to do "web app stuff" like render itself, watch for user input, etc. It also uses a JSON API unique to ATSC 3.0 and specified in their A/344 standard.

The focus in NEXTGEN, as far as I can tell, is ENTIRELY about checking the JSON API conformance on the device. A Broadcaster App must be able to exercise each part of this API on a NEXTGEN-compliant device, so they test that. They are not big on testing for broad UA compatibility like we have in WMAS.

WAVE compatibility isn't a thing for ATSC, but they are interested in what we can bring to the table. Some of WMAS, like Service Worker API support, isn't needed for Broadcaster Apps. ATSC is willing to look at a list of what we think should be required on the UA compatibility side, but only for Broadcaster Apps.

This situation is very different from HbbTV.

cta-source commented 2 years ago

As far as APIs required by ATSC 3.0 Broadcaster Apps (BAs, the subject of NEXTGEN smart TV conformance testing):

In general, BAs include HTML5, JS, CSS, XML, image and A/V files, delivered to the receiver one by one or in packages.

BAs are also required to support MSE & EME. Broad W3C compatibility (equivalent to a downloadable-browser implementation) isn't explicitly specified.

bobcampbell-resillion commented 2 years ago

The focus of the NEXTGEN TV logo/certification is certainly not on broader W3C compatibility at the moment. But I disagree with:

WAVE compatibility isn't a thing for ATSC,

That isn't aligned with the ATSC Spec, as A/344:2021 contains the following

5.2 User Agent Definition

Receivers shall implement an HTML5 User Agent that complies with all normative requirements specified in the CTA Web Media API Snapshot (CTA-5000-B) [7].

(my emphasis) and normative reference [7] is

[7] CTA: "CTA Specification: Web Application Video Ecosystem – Web Media API Snapshot 2019", Doc. CTA-5000-B, Consumer Technology Association, December 2019.

That doesn't leave much room for interpretation; on the face of it, Broadcaster Applications shall therefore be able to rely on all of WMAS being supported in the user agents they launch within. In turn that implies all WMAS:2019 tests shall be passed by any ATSC 3.0 A/344 compliant receiver. Of course, applications should adapt to the capabilities of the device they find themselves on, and of course the corresponding CTA Recommended Practice CEB-32.8 2020 makes this all "should":

ATSC 3.0 Television Sets that support the optional Application Runtime Environment should support all User Agent features defined in A/344 [3], section 5.2, which references CTA-5000-A [5].

Nevertheless it seems pretty clear application developers writing Broadcaster Applications for an ATSC 3.0 compliant receiver might well expect WMAS testing to form part of NEXTGEN TV certification, even if the focus for that program, right now, is elsewhere. If these extracts don't reflect the expectation of receiver manufacturers or application developers then that sounds like a matter for ATSC's group drafting A/344 and CTA's NEXTGEN TV logo scoping team...

As far as WAVE is concerned, for applications to be able to rely on the presence of a common set of capabilities in their target User Agent, then even if the requirement was softened somewhat, the question remains how could a device conformance regime use WMAS:2019?

It is my understanding:

This seems to me a problem any referencing logo conformance platform will have in including WMAS, and is probably the very reason "WMAS compatibility isn't a thing for ATSC". It's not actually practical to reach any other conclusion at this point in time, and ATSC/NEXTGEN TV are focussed elsewhere, so they don't have the bandwidth to address this. So it seems to me WAVE ought to take a more active role in resolving these questions, or else divergence is the likely consequence if every (2?) referencing platform comes up with its own "subset".

cta-source commented 2 years ago

First, my phrasing was a bit unclear. I should have written, "WAVE compatibility isn't a priority for ATSC members at this time." This is a key point: the ATSC specifications do require compatibility, but I don't believe there is much effort going into making it happen, and there appears to be a certain amount of hesitation in trying.

It's hard to blame them: architecturally, Broadcaster Apps (BAs) don't run in a general-purpose W3C context. They don't need full WMAS compatibility.

few, if any, devices in the market today could be expected to pass all the tests "that pass on all 4 main browser stacks"

I'm sure you know this, but for others: Please note that failure of NEXTGEN certification means not being able to advertise as a NEXTGEN compatible TV. TV makers aren't going to risk that for WAVE compatibility.

Our options are,

The first is a non-starter--no one on the ATSC side seems to be too worried about full WMAS compatibility for a constrained architecture that doesn't actually need full compatibility. That is, as pointed out above, their architecture doesn't necessarily require everything in WMAS and there doesn't seem to be a reason to demand it. For example, service workers may be an easy item to give up for their purposes.

it's not even realistic to expect 100% pass rate "in the near future", and if not, what is an objective pass criteria for this testing?

"Objective pass criteria": This is why ATSC, via CTA's ATSC representative Paul Thomsen, has repeatedly requested a list of APIs that WAVE believes are necessary for the functionality needed in Broadcaster Apps. This is obviously different from full A/344 specification compliance.

are the manual tests to be included or not? do they work on a TV interface?
are all the tests appropriate for a user agent running on a device with constrained resources, e.g. no tabs?

These are good questions that WAVE should discuss with ATSC in the context of a constrained list for BAs (let's say, "ATSC BA Profile", maybe?)

it's not at all clear what a tester, landing upon a WMAS launch page, should select or expect from a certification test run using WAVE's test materials...

A good topic once we get to an ATSC BA profile. There are more good questions in your comment, but we need to decide if we're going to go ahead with ATSC's proposal and work out a profile of WMAS for ATSC 3.0. Then we can consider the process points.

So I think we should proceed with an ad-hoc of WAVE and ATSC folks working out what in WMAS is architecturally required for BAs and what is not (the profile).

bobcampbell-resillion commented 2 years ago

[ATSC Broadcaster Apps] don't need full WMAS compatibility. This is why ATSC, via CTA's ATSC representative Paul Thomsen, has repeatedly requested a list of APIs that WAVE believes are necessary for the functionality needed in Broadcaster Apps.

I think this is where the most interesting discussion should happen in WAVE. Forgive me for restating what I hope is commonly understood: the WAVE profile specification is a subset of the W3C APIs intended for media-centric applications running on devices. It's based on the set of tests that pass on the 4 main browser stacks. I'm paraphrasing the boilerplate introduction to the WMAS spec.

Therefore WMAS is a target, not necessarily an objective minimum and I think this is where the difficulty lies...

I don't know how WAVE can answer what part of that WMAS subset is needed for ATSC Broadcaster Apps, other than to say "The WMAS spec is WAVE's recommended profile of the W3C APIs that a media-centric application running on a device like a TV would need/should expect".

That is, the list Paul is asking WAVE for is the WMAS spec, as far as WAVE is concerned...

If ATSC 3.0 Broadcaster Applications need fewer APIs than that, ATSC probably has to define what it thinks it needs, based on the Broadcaster Applications it's expecting to see? I don't know how WAVE can anticipate the scope of ATSC BAs.

I think coming up with a test scope that is somehow more practical to address the <100% compliance in the real world is subtly different to an API subset, but yes I agree we ought to start with agreement on whether the WMAS specification is really correctly referenced by A/344. A/344 requires full compatibility, so that can't be true at the same time as the statement that they don't need full compatibility ;)

If both referencing specifications (HbbTV 2.0.3 and ATSC A/344:2021) don't need full WAVE WMAS Spec conformance, then the next question is whether they need the same, or different subsets, and how to define those. My starting point with this thread was that I didn't think the industry as a whole is well served by further divergent profiles unless specific (and probably legacy) API differences were inherent in the two environments.

I agree and I hope the forthcoming joint call will help shed some light on these perspectives, both what WAVE's profile is intended to achieve, and the constraints in both HbbTV and ATSC that mean the tools and tests provided by WAVE can't be used "as is" today...

cta-source commented 2 years ago

Only addressing the WAVE-ATSC question:

Regarding whether a list can be made by WAVE alone, or ATSC alone, I agree, that's not realistic. However, some work together would be productive. I've reviewed the API list with an eye towards what I think ATSC would need, and it's "most of it" but not "all of it".

Why "most of it"? Because BAs are pretty capable, with audio/video/captions expected.

Possibly the most important point is that ATSC doesn't know what they don't need. This gap introduces a FUD factor that keeps ATSC folks (in my experience) from thinking about WAVE as something useful. For example: Does WAVE require smart TVs include mouse support? Of course, no. So I've explained that mouse support is conditional on choosing to implement a mouse in the first place--WMAS does not require a mouse, only that if you implement a mouse, you need to be compliant with the specs (a.k.a. "conditional mandatory").

ATSC will have some process considerations if they decide to use WMATS, like what to do about accepting and reviewing waiver requests, what "conformant" means in this context, etc. These considerations aren't all that important at this stage since it doesn't seem we have a lot of ATSC-side folks really considering the details in WMAS in the first place. (If I'm wrong, please advise, but I believe "make ATSC a list" is the current status.)

Bottom line: I agree with many of the points in the comment above, but I believe we need to try to produce this list to reduce the uncertainty the ATSC folks have in even attempting to work with WAVE.

JohnRiv commented 11 months ago

Given we have closed https://github.com/cta-wave/WMAS/issues/62 & https://github.com/cta-wave/WMAS/issues/63, I believe this issue can now be closed as well

JohnRiv commented 9 months ago

@bobcampbell-resillion any further actions you see needed here or can we close this?

bobcampbell-resillion commented 9 months ago

The problem of launching the tool via a stream was indeed fixed a while ago, but that didn't address all my concerns...

We've been trying to integrate the tool into HbbTV and ATSC testing and the remaining stumbling block is https://github.com/cta-wave/WMAS/issues/78 - the ability to define and execute a subset that those organisations will use in certification depends on being able to exclude large numbers of tests from the WMAS set.
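
To make the idea of "a subset defined by exclusion" concrete, here is a minimal Python sketch that splits a list of test paths into kept and excluded tests using glob-style patterns. The function name, the pattern syntax and the example paths are assumptions for illustration only; they are not the mechanism requested in issue 78 nor an existing WMAS feature.

```python
from fnmatch import fnmatchcase


def apply_exclusions(test_paths, exclude_patterns):
    """Split test paths into (kept, excluded) using glob-style patterns."""
    kept, excluded = [], []
    for path in test_paths:
        # fnmatchcase is case-sensitive on every platform, and its "*"
        # also matches "/" so one pattern can cover a whole directory.
        if any(fnmatchcase(path, pattern) for pattern in exclude_patterns):
            excluded.append(path)
        else:
            kept.append(path)
    return kept, excluded
```

Returning the excluded list as well as the kept list makes it possible to publish, for each certification release, exactly which tests were left out.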

So I think this broader issue remains open until a HbbTV logo regime or ATSC NEXTGEN TV can include a WMAS subset in certification...

Is there now a tagged and approved version of each year of the WMAS tools that would be appropriate to reference in a certification regime, as referenced by their respective Specs? And what is the update/release cycle?

WMAS 2018 is the version referenced by HbbTV Spec v2.0.3, which is 'current' for devices entering the market as of Nov 2023. The last tagged 2018 release was https://github.com/cta-wave/WMAS/releases/tag/wmas2018-v1.0.2 in 2021, and that indicates 7 commits since... at least one looks like it should be relevant to HbbTV users. So what should people in certification regimes be using? If it is the 'main' branch, what guarantee do we have that it won't change in future?

Test tools for certification regimes need to be stable versioned items just like the test materials.

The third part, review, will happen implicitly once HbbTV and NEXTGEN certification applies the tests; that's when eyes will be on the tests people are expected to pass to get a logo.

P.S. all this coming to DPCTF test suite soon, too. see #15 :)

JohnRiv commented 9 months ago

Thanks for the feedback, @bobcampbell-resillion! Regarding your 2 questions there, I would say the update/release cycle is "as needed" so we can release bugfixes & feature enhancements as quickly as they are ready, but if a more predictable/formal release cycle is required, let us know.

Regarding a tagged and approved version of each year of the WMAS tools that would be appropriate to reference in a certification regime, I think it would be useful to link to each available release from either the README of the https://github.com/cta-wave/WMAS repo's main branch or the README of the individual versions (e.g. https://github.com/cta-wave/WMAS/blob/wmas2018/README.md) along with a changelog of the changes between releases of the same WMAS year. I assume the main repo is easiest/best but would defer to @FritzHeiden & @louaybassbouss on that.

The current full list of official releases in the Github Repo is

and looking at all the branches, it appears we likely should cut new releases for:

If we go ahead with creating those releases and the "as needed" release cycle is sufficient, is there anything else ATSC or HbbTV needs?

yanj-github commented 8 months ago

I would suggest that an update/release cycle is required for deploy as well: https://github.com/cta-wave/WMAS-deploy. If the latest changes to WMAS-deploy do not work with a specific release of WMAS, it would be difficult to manage.

louaybassbouss commented 7 months ago

I would suggest that an update/release cycle is required for deploy as well: https://github.com/cta-wave/WMAS-deploy. If the latest changes to WMAS-deploy do not work with a specific release of WMAS, it would be difficult to manage.

@yanj-github for deploy we already have a branch for each WMAS version

JohnRiv commented 7 months ago

@yanj-github is the branch for each version sufficient? Or do you need something additional around actual releases of the WMAS-deploy repo?

yanj-github commented 7 months ago

@yanj-github is the branch for each version sufficient? Or do you need something additional around actual releases of the WMAS-deploy repo?

I think it would be better to have it versioned as well. It is important for users to know a stable version for the server, including the deploy code as well as the WMAS source. Deploy code might not impact the test result directly, but it might change the way the server can be connected / set up.

bobcampbell-resillion commented 7 months ago

Agreed that the cycle WAVE has for its releases doesn't have to be synced with any downstream 3rd party, just worth knowing that both ATSC NEXTGEN and HbbTV work on 3 major test suite updates per year (Q1, Q2, Q3) and would at that point normally update their references to the "latest WAVE tools and tests" release.

Also worth noting that an update to the "head" of the test suites doesn't mean the older versions get sunset immediately; indeed in HbbTV many logo regimes which reference the HbbTV suite can be several versions behind the "head". For NEXTGEN it's similar, although only 1 logo is in effect at the moment.

It is important to version the entire "test environment" being referenced because if a change in result occurs between two releases, manufacturers will want to know what in the tests or environment has changed, to eliminate that before looking at their implementation....
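
One way to picture a versioned "test environment" is a small record stored with every result set, so that two runs can be compared environment-first. The sketch below is hypothetical: the `TestEnvironment` fields are assumptions, and the tag strings used in the usage note simply follow the naming that appears in this thread.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TestEnvironment:
    wmas_year: str
    source_tag: str
    deploy_tag: str


def environment_diff(before, after):
    """Return {field: (old, new)} for every field that differs."""
    return {
        f.name: (getattr(before, f.name), getattr(after, f.name))
        for f in fields(TestEnvironment)
        if getattr(before, f.name) != getattr(after, f.name)
    }
```

For example, comparing a run on `wmas2018-v1.0.2` with one on `wmas2018-v1.1.0` (same deploy tag) reports only `source_tag`, which tells a manufacturer to look at the source changes before suspecting their own implementation.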

JohnRiv commented 6 months ago

Thanks. To help make it clear which releases of both WMAS and WMAS-deploy exist and should be used together, I'd like to propose that we update the table at https://github.com/cta-wave/WMAS/blob/main/README.md to the following:

| version | spec | source branch | docker deploy | tests branch | docs |
| --- | --- | --- | --- | --- | --- |
| WMAS 2021 | WMAS2021 | latest 2021 source | latest 2021 deploy | 2021 tests | 2021 docs |
| Source v1.0.0 - Deploy v1.0.0 | | source v1.0.0 | deploy v1.0.0 | | |
| WMAS 2020 | WMAS2020 | latest 2020 source | latest 2020 deploy | 2020 tests | 2020 docs |
| Source v1.1.0 - Deploy v1.0.0 | | wmas2020-v1.1.0 | wmas2020-deploy-v1.0.0 | | |
| Source v1.0.0 - Deploy v1.0.0 | | wmas2020-v1.0.0 | wmas2020-deploy-v1.0.0 | | |
| WMAS 2019 | WMAS2019 | latest 2019 source | latest 2019 deploy | 2019 tests | 2019 docs |
| Source v1.1.0 - Deploy v1.0.0 | | wmas2019-v1.1.0 | wmas2019-deploy-v1.0.0 | | |
| Source v1.0.0 - Deploy v1.0.0 | | wmas2019-v1.0.0 | wmas2019-deploy-v1.0.0 | | |
| WMAS 2018 | WMAS2018 | latest 2018 source | latest 2018 deploy | 2018 tests | 2018 docs |
| Source v1.1.0 - Deploy v1.0.0 | | wmas2018-v1.1.0 | wmas2018-deploy-v1.0.0 | | |
| Source v1.0.2 - Deploy v1.0.0 | | wmas2018-v1.0.2 | wmas2018-deploy-v1.0.0 | | |
| Source v1.0.1 - Deploy v1.0.0 | | wmas2018-v1.0.1 | wmas2018-deploy-v1.0.0 | | |
| Source v1.0.0 - Deploy v1.0.0 | | wmas2018-v1.0.0 | wmas2018-deploy-v1.0.0 | | |
| WMAS 2017 | WMAS2017 | latest 2017 source | n/a | 2017 tests | 2017 docs |

Note we still need to cut the releases I mentioned above (hence the lack of links for those)

My hope is that table would allow ATSC and HbbTV to be able to reference a specific Source & Deploy.

It seems that we don't need a strict schedule to issue new releases when commits are made, but I would like to establish a recommended schedule to consider making official releases, and releases could be made at other times as appropriate. I'll suggest the 7th day of each quarter (Jan 7th, April 7th, July 7th, October 7th), and welcome other ideas.
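
The suggested schedule (the 7th day of each quarter) is simple enough to compute. As a small Python sketch, with a hypothetical function name, the next recommended release date on or after a given day could be found like this:

```python
from datetime import date

RELEASE_MONTHS = (1, 4, 7, 10)  # first month of each quarter
RELEASE_DAY = 7


def next_release_date(today):
    """Return the first recommended release date on or after `today`."""
    for year in (today.year, today.year + 1):
        for month in RELEASE_MONTHS:
            candidate = date(year, month, RELEASE_DAY)
            if candidate >= today:
                return candidate
```

For instance, `next_release_date(date(2024, 2, 21))` gives `date(2024, 4, 7)`. Releases at other times remain possible; this only marks the recommended checkpoints.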

bobcampbell-resillion commented 6 months ago

@yanj-github FYI see table above, we need to make sure we keep HbbTV test tools in sync with this.

louaybassbouss commented 6 months ago

@JohnRiv FYI for WMAS 2017 we don't have a Docker deployment since it was using the old technology stack (Node.js, ...). I agree with you regarding the schedule for new releases. We will tag the WMAS versions (src, deploy) which are not tagged yet. Question: is it required to keep the tag version of deploy in sync with the tag version of source?

bobcampbell-resillion commented 6 months ago

FWIW I don't think anyone in HbbTV or ATSC needs to use 2017.

I think ideally there would be a clear baseline version of the tools and tests - i.e. a consistent environment - that one can use to ensure every device completing conformance for a logo scheme has the same expected results. If that involves them setting up from the "deploy" version then it should be tagged. Otherwise you are asking users to build from source.

louaybassbouss commented 6 months ago

@bobcampbell-resillion agreed regarding 2017. Regarding deployment, deploy also builds from source, but it is much easier since it uses Docker. What I am thinking is: if we have a new tag in WMAS src, we can update deploy to point to the tagged source and tag deploy with the same version as well. This makes the mapping easier to remember.

JohnRiv commented 6 months ago

Also agree no need to update 2017 to have a Docker Deploy, that's why I marked it as n/a.

On the prior HATF call, we discussed keeping the WMAS and the Docker Deploy version numbers in sync, but it was mentioned that it would result in version bumps for the other platform that would have no changes, which is why I put together the table. However, if we feel having them in sync is useful, numbers are cheap 😉 , so I think it is worth considering as well. The table at least gives us a good sense of the current state.

JohnRiv commented 6 months ago

On the HATF call on Feb 21, 2024, it was mentioned that updating the source could result in a README change for the deploy to reference the updated source, which is a good case for keeping the source and deploy tag versions in sync.

If anyone objects to keeping the source and deploy tag versions in sync, please comment here.
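
If the tags are kept in sync, a small check can confirm it. The Python sketch below assumes the tag naming seen earlier in this thread (`wmas2018-v1.1.0` for source, `wmas2018-deploy-v1.1.0` for deploy); the function names are hypothetical and the tag lists would have to be gathered separately, for example from `git tag` in each repository.

```python
import re

SOURCE_TAG = re.compile(r"^wmas(\d{4})-v(\d+\.\d+\.\d+)$")
DEPLOY_TAG = re.compile(r"^wmas(\d{4})-deploy-v(\d+\.\d+\.\d+)$")


def parse_tags(tags, pattern):
    """Return the set of (year, version) pairs for tags matching pattern."""
    return {m.groups() for tag in tags if (m := pattern.match(tag))}


def unsynced_tags(source_tags, deploy_tags):
    """Return (missing_in_deploy, missing_in_source) as sorted lists."""
    source = parse_tags(source_tags, SOURCE_TAG)
    deploy = parse_tags(deploy_tags, DEPLOY_TAG)
    return sorted(source - deploy), sorted(deploy - source)
```

Two empty lists mean every source release has a deploy release with the same year and version, and vice versa.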

FritzHeiden commented 5 months ago

Tags for source and deploy of WMAS2018 - 2021 are now in sync

FritzHeiden commented 5 months ago

@JohnRiv I added tags for older versions

JohnRiv commented 4 months ago

@bobcampbell-resillion we've made the updates to the tags and to the README at https://github.com/cta-wave/WMAS

Anything further you are looking for or can we close this issue?

bobcampbell-resillion commented 3 months ago

For reference, we've incorporated WMAS (*) into both our commercial HbbTV harness, "Ligada" and ATSC 3.0 harness "Arreios".

(* A "subset" of the full set is still to be approved formally by either organisation for use in formal logo cert programs)