data-* or data-aui-* - Githubissues

The document "Comparison of ways to use vocabulary in content" makes it obvious that the selection of prefix "data-" was done after a serious consideration of the options. Thanks!

However, when I presented the Personalization documents to a group of a11y people in Sweden, several of them objected to the use of "data-" as a prefix for standardized semantics. Their main argument was that there is a significant risk that a "data-*" attribute (e.g. "data-action") is already used with an entirely different meaning, causing confusion.

In the "Comparison" document, at https://github.com/w3c/personalization-semantics/wiki/Comparison-of-ways-to-use-vocabulary-in-content#-data-set-data--or-data-aui--attributes the pros and cons of "Data set" attributes are listed under "data- or data-aui-", but in the draft recommendation only "data-" is being used. It appears to me that - although longer - "data-aui-" would be considerably less likely to be already used for some other purpose, and therefore a better compromise.
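The collision risk described above can be made concrete with a small sketch. It scans a page for data-* attributes and reports which ones coincide with the proposed vocabulary under a given prefix. The attribute names are only those mentioned in this thread (the draft's full list may differ), and `legacy_page` is a made-up example, not real site markup.

```python
from html.parser import HTMLParser

# Attribute names taken from this thread; the draft's full list may differ.
PROPOSED = {"action", "destination", "purpose", "simplification", "symbol", "distraction"}


class DataAttrCollector(HTMLParser):
    """Collects every data-* attribute name seen in a document."""

    def __init__(self):
        super().__init__()
        self.names = set()

    def handle_starttag(self, tag, attrs):
        for name, _value in attrs:
            if name.startswith("data-"):
                self.names.add(name)


def collisions(html, prefix):
    """Return the proposed attribute names (under `prefix`) that the page already uses."""
    collector = DataAttrCollector()
    collector.feed(html)
    reserved = {prefix + term for term in PROPOSED}
    return sorted(name for name in collector.names if name in reserved)


# Hypothetical legacy page: data-action here drives the site's own script.
legacy_page = '<button data-action="toggle-menu" data-id="7">Menu</button>'

print(collisions(legacy_page, "data-"))      # ['data-action']
print(collisions(legacy_page, "data-aui-"))  # []
```

With the bare "data-" prefix the legacy attribute is indistinguishable from the standardized one; with "data-aui-" the same page reports no overlap.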

Unfortunately, all web-wide use of "data-" seems to be in conflict with the HTML 5.2 specification: "For generic extensions that are to be used by multiple independent tools, either this specification should be extended to provide the feature explicitly, or a technology like microdata should be used (with a standardized vocabulary).", and a bit further down (in Example 21): "these attributes are intended for use by the site’s own scripts, and are not a generic extension mechanism for publicly-usable metadata." https://www.w3.org/TR/html52/dom.html#embedding-custom-non-visible-data-with-the-data-attributes

Perhaps you already discussed these factors? If you didn't, and if it is not too late, I would suggest at least to change from data- to data-aui- in order to minimize conflicting semantics. Or even go for the microformat/itemprop construction if the W3C wants to use the momentum from ongoing accessibility regulation to further such generic frameworks for a more semantic web. Perhaps a question for the TAG?

Hello Pär,

[Speaking as an individual member of the Task Force:] As one of the folks actively involved in the discussions, I can confirm we are aware of the concerns you are raising, but are also following guidance provided to us from the Technical Architecture Group (TAG) at the W3C. We are actually using data-* in the fashion precisely envisioned when it was added to HTML 5.

One of our overarching goals is to have non-prefixed attributes as much as possible (so, for example, we ultimately want @purpose="" as opposed to @data-purpose="" or even @tbdprefix-purpose=""), but to achieve that we first have to deliver some proofs of concept and determine which, if any, of our proposed attributes might advance prefix-free. We believe we've got a great proposed solution here, but we have no evidence or proof that it's workable or scalable today - and that's a problem - and it's a chicken-or-egg problem as well (no tools because there is no solution, no solution because there are no tools: data-* helps break that vicious circle).

So, data-* is not intended as a "final" name, but rather as an experimental name for now, while we gather the necessary implementation experience and user-feedback data as part of the maturing process. In many ways, it is similar to browser-prefixed CSS properties before they become "fully baked" and prefix-free. (BTW, we also looked at using MicroData, but decided at this time that it was overly complex to author at the element level.)

We also have precedent for this approach with ARIA, where a critical attribute - @role - advanced prefix free into HTML5, but other ARIA attributes remain prefixed with aria-. It is my personal hope that we'll be even more successful this time around, and that we'll get multiple attributes "folded in" directly into HTML 5! But to achieve that we have to proceed with that goal in mind.

One concern I (at least) have early on is to avoid 'ghettoizing' these attributes by using a "special" prefix (read: "for disabled people only"), as it has been my experience over the past years that an approach like that is a barrier to broader adoption. I think we all understand that not all of our proposed attributes will advance prefix-free, but rather than pre-judging that, we'd rather (at least I'd rather) presume all will go through prefix-free, and then later be told "...We can only really justify 60% of your attributes advancing prefix-free...", after which we can have the discussion with the browser vendors and other W3C stakeholders to choose a better prefix (which MAY be one of AUI-, ARIA-, or TBC-). The reality is that we need other stakeholders to join us, including the browser vendors and other tool vendors, and they aren't very inclined to take "dictated" solutions "...just because", so we're working within some constraints already.

Finally, it is also worth remembering that the TAG understood these goals, and thus suggested we start with @data-* 2 years ago at our annual meetings (TPAC 2018).

Note that this is not a 100% consensus position at this time - we are actually still discussing this within the Task Force. But we've gone around the block on this topic more than once, and I suspect that at this time, we want to move forward with what we have and start soliciting wider review and feedback, so your observations and comments are both welcome and will be taken into account going forward. (So THANK YOU!) Hopefully this helps you better understand "the plan" (at least as I understand it) regarding the naming pattern both short-term and longer-term.

In reply to Pär Lannerö's comment of Thu, Feb 20, 2020, 3:31 PM.

-- John Foliot | Principal Accessibility Strategist | W3C AC Representative Deque Systems - Accessibility for Good deque.com

Thanks John for an excellent clarification!

Fyi: I received the comments regarding the prefix when sharing a blog post about the Personalization documents that I wrote a few days ago. The blog post contains some more comments on the specifications, if you’re interested: https://www.metamatrix.se/aktuellt/invisible-web-design-colors/

Looking forward to seeing your work being implemented!

Hi Pär,

That's a great write-up - thanks for helping spread the word!

It's interesting that you singled out data-purpose, data-destination, data-action and data-simplification, as those 4 attributes in particular are ones I personally can truly envision becoming prefix-free down the road, as they benefit so many more users than just users with disabilities thanks to the - as you noted - rich semantics via metadata. (Interesting tidbit: a few of us were very actively involved with WCAG 2.1 SC 1.3.5 Purpose of Input - and the work and discoveries on that SC were in part the driver for this new Task Force and Personalization activity.)

Your understanding of data-destination is pretty much accurate, but it is also seeking to use standardized terms for when designers want to get "creative" (e.g. the "I'm feeling Lucky" button at Google would have a data-destination="submit" value, so that even though the label/accessible name of "I'm feeling Lucky" is exposed in the DOM, we're now also able to unambiguously tag that button with its real purpose - which is just a filtered submit function). And to answer your question: at this time I believe the intent is to keep the registered taxonomy terms inside of the W3C: normative inside of the specification, but perhaps also hosted elsewhere by the W3C - there has been some tentative and early discussion on how to achieve that inside of the W3C.
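To illustrate the "I'm feeling Lucky" case, here is a minimal sketch of how a tool might read the standardized term independently of the creative label. It assumes the control is marked up as a `<button>` with data-destination="submit", as described above; the class and function names are hypothetical and not part of any specification.

```python
from html.parser import HTMLParser


class DestinationFinder(HTMLParser):
    """Records (visible label, data-destination value) for each tagged button."""

    def __init__(self):
        super().__init__()
        self.found = []
        self._current = None  # data-destination of the button being read, if any
        self._label = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._current = dict(attrs).get("data-destination")
            self._label = []

    def handle_data(self, data):
        # Only collect text while inside a tagged button.
        if self._current is not None:
            self._label.append(data)

    def handle_endtag(self, tag):
        if tag == "button" and self._current is not None:
            self.found.append(("".join(self._label).strip(), self._current))
            self._current = None


def tagged_buttons(html):
    """Return a list of (label, destination) pairs for buttons carrying the attribute."""
    finder = DestinationFinder()
    finder.feed(html)
    finder.close()
    return finder.found


# Assumed markup for the example discussed above.
page = '<button data-destination="submit">I\'m feeling Lucky</button><button>Plain</button>'
print(tagged_buttons(page))  # [("I'm feeling Lucky", 'submit')]
```

The label stays whatever the designer chose; a personalization tool would key off the second element of each pair.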

A few other specific details: for data-symbol (and thanks for pointing out that there are different symbol sets out there) - what we've arrived at is to use the 'open-software' Bliss Symbol set as a baseline, but the value for data-symbol will use the unique identifying number associated to each actual Bliss symbol - so that other symbol sets can map to the 'term' but use a common numeric identifier in the authoring of data-symbol.
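A rough sketch of the mapping idea: the author writes one numeric identifier in data-symbol, and the user's tool resolves it against whichever symbol set the user prefers, falling back to the baseline set. Every identifier, set name and file name below is invented for illustration; the real Bliss numbering is not reproduced here.

```python
# Hypothetical data: the numeric identifiers, set names and file names are made up.
SYMBOL_SETS = {
    "bliss": {1001: "bliss/cup.svg", 1002: "bliss/home.svg"},
    "other-set": {1001: "other/cup.png"},
}


def resolve_symbol(symbol_id, preferred, fallback="bliss"):
    """Look up a symbol image in the preferred set, then in the fallback set."""
    for set_name in (preferred, fallback):
        image = SYMBOL_SETS.get(set_name, {}).get(symbol_id)
        if image is not None:
            return image
    return None


# The attribute value arrives as text, e.g. data-symbol="1001".
print(resolve_symbol(int("1001"), "other-set"))  # other/cup.png
print(resolve_symbol(int("1002"), "other-set"))  # bliss/home.svg (falls back)
print(resolve_symbol(int("9999"), "other-set"))  # None
```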

Regarding data-distraction: when we first started to look at that, I too had many of the same concerns as you. But then others got me thinking more deeply about how this would be used in the real world. A number of potential use-cases surfaced, including IoT and user interfaces: as we know, most IoT interfaces today are also based on HTML, whether in the browser, as part of an app, or increasingly onboard actual devices. I could envision UI simplification/removal of distractions there as being author-supplied and hugely beneficial.

Another use-case would potentially involve a proxy server in the middle, where users with known disability requirements could have a registered account, and content could then be modified (DOM injection, etc.) at the proxy-server level before being delivered to the registered user. (I could envision something similar for data-symbol, which would then extend far beyond just iconography and actually be used for larger tracts of text - we've already seen a proof of concept from a Mozilla developer for that.)
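As a sketch of what the proxy-level modification might look like, the following drops every element carrying a data-distraction attribute, together with its subtree, before the page is passed on. It assumes well-formed markup where every non-void element has an explicit end tag, and it silently drops comments and doctypes; the attribute value "offer" is only a placeholder, not a term from the specification.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they never open a subtree.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class DistractionStripper(HTMLParser):
    """Re-emits a document, omitting elements marked with data-distraction."""

    def __init__(self):
        # Keep character references untouched so text is re-emitted as written.
        super().__init__(convert_charrefs=False)
        self.out = []
        self._skip_depth = 0  # >0 while inside a removed subtree

    def handle_starttag(self, tag, attrs):
        marked = any(name == "data-distraction" for name, _ in attrs)
        if self._skip_depth or marked:
            if tag not in VOID:
                self._skip_depth += 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax never opens a subtree, so depth is unchanged.
        marked = any(name == "data-distraction" for name, _ in attrs)
        if not (self._skip_depth or marked):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self._skip_depth:
            if tag not in VOID:
                self._skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self._skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self._skip_depth:
            self.out.append(f"&#{name};")


def strip_distractions(html):
    """Return `html` with all data-distraction elements removed."""
    stripper = DistractionStripper()
    stripper.feed(html)
    stripper.close()
    return "".join(stripper.out)


page = ('<main><p>Article text</p>'
        '<aside data-distraction="offer"><p>Buy now!</p></aside>'
        '<p>More text</p></main>')
print(strip_distractions(page))
# <main><p>Article text</p><p>More text</p></main>
```

A real proxy would apply this per user, based on the preferences stored with the registered account, rather than unconditionally.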

In other words, there are many 'edge-cases' out there that none-the-less impact thousands if not millions of users (even if not "mainstream") and so planning for both the most mainstream but also the less traveled paths makes sense, and was part of our goal.

I'll conclude by noting that we are a small group working on this, and while we've come a long way, there is still much to be done. With your experience and background in metadata I'm sure we'd welcome you to become more actively involved (if you are interested). In particular, we are fast approaching the need for PoC tooling, be they in-browser extensions, stand-alone apps or similar, and if you're interested in any of that please come join us :-) (Seriously, if you'd like to get involved, reach out. You can contact me at my work email of john dot foliot@deque dot com and I can help you get active in our work).

Thanks again!

In reply to Pär Lannerö's comment of Thu, Feb 20, 2020, 5:45 PM.

Thanks again for taking the time to respond in depth! I did not consider IoT, but of course, that's a relevant aspect. Neither did I consider attribute injection by proxy servers. Lots of things to explore. Exciting area indeed! I'd be very happy to be more involved, but in the near future I can only make occasional contributions, due to other assignments.

Reading through all of this: as a web author, I would prefer that data-* be left alone and that something different be proposed. We've already been given data-* for author use, and I don't think it's backwards-compatible to start using it in a spec.

Since naming things is hard, I'd like to propose an alternative for consideration. What about purpose-* ?

@MelSumner Thank you for the feedback.

Our current plan of using data- is based on the recommendation from the W3C's Technical Architecture Group (TAG) as a first step. The data- prefixing convention is used for experimental implementations per HTML 5, which is ongoing now.

Stepping back, it is our hope that we ultimately end up with attributes that have NO prefix (i.e. @purpose, or @destination) - that our attributes become "mainstream". We've seen this happen before with the ARIA attribute @role (which does NOT have the aria- prefix: i.e. we don't today write aria-role="", although once upon a time...)

As experimentation continues, we may find that some of our proposed attributes will indeed require prefixing, and we've discussed that possibility internally (aui- is a strong candidate). However, our preference at this time is to presume non-prefixed attributes down the road, and if/when we have to settle on a permanent prefix we'll have a separate discussion at that time.
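For tool authors, one way to stay robust while the naming is still open is to read the same term under several candidate prefixes. This is only a sketch: the lookup order (no prefix, then aui-, then data-) is an assumption reflecting the preferences stated above, and the values "email" and "tel" are illustrative.

```python
def read_semantic(attrs, name, prefixes=("", "aui-", "data-")):
    """Return the first value found for `name` under the candidate prefixes, else None."""
    for prefix in prefixes:
        value = attrs.get(prefix + name)
        if value is not None:
            return value
    return None


# Today's experimental markup uses the data- prefix.
print(read_semantic({"data-purpose": "email"}, "purpose"))  # email

# If an attribute later advances prefix-free, that form takes precedence.
print(read_semantic({"purpose": "tel", "data-purpose": "email"}, "purpose"))  # tel
```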

Closing as no new comments have been received in 3 weeks.

w3c / adapt

data-* or data-aui-* #137