whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/

Consider aligning WHATWG main element definition with W3C definition #100

Closed stevefaulkner closed 6 years ago

stevefaulkner commented 9 years ago

The main element was designed and implemented based on the concept of there being a single instance within a document; the markup pattern was based on data of id value usage in the wild. The WHATWG definition differs markedly from the original definition. This leads to confusion for developers. The W3C Nu HTML Checker, which is used by many, throws an error when main is used as per WHATWG. Data derived from webdevdata.org shows that >97% of usage of the <main> element is as per the W3C definition, and anecdata from users that consume the semantics suggests that one main element per page is the expected and most useful pattern. In general, consumers of landmark semantics report that the utility of landmarks is reduced as the number of instances of landmark elements in a document increases. The alignment would involve changes to the main and body element definitions. current W3C definitions:

sinabahram commented 9 years ago

What form would such new information take?

It is unclear to most of us what the necessary and sufficient criteria are for changing this decision.

Hixie commented 9 years ago

It's unclear to me why you would want to change the decision. What's wrong with what the spec says?

Heydon commented 9 years ago

@Hixie Here's a question: what would be so bad about changing the spec to say that only one <main> per page is advised? We've heard from numerous people on this thread who benefit from it being one <main> only and from nobody who'd prefer it as multiple <main>s. Let's just change the advice and save ourselves a lifetime of aimless quibbling.

domenic commented 9 years ago

> from nobody who'd prefer it as multiple <main>s.

This is false.

Hixie commented 9 years ago

I've explained several times on this thread the advantages of the current design. Losing those advantages is what would be "so bad". But what would be "so bad" about keeping the current spec? As far as I can tell, it supports everything that the one-main-only model supports, including in particular the most important feature, namely, jumping to the first piece of main content in an AT environment.

marcysutton commented 9 years ago

Parts of this conversation sound a lot more like they are about the opinions of spec writers than those of actual users (namely AT users, who stand to actually benefit from this element). People are telling you directly what they want and expect. Those messages are worth a lot more than you're giving them credit for.

karlgroves commented 9 years ago

@Hixie says: "including in particular the most important feature, namely, jumping to the first piece of main content in an AT environment."

From this, it appears that you acknowledge the actual use case for this element, yet you dig your heels in when the actual users that this benefits tell you that > 1 main element is a bad idea.

In fact, there's really only one user population that benefits from this element and that's users of AT. What more data do you need than the voices of these users?

The arrogance displayed in acting as though you know more about what they need than they do is actually comical.

Hixie commented 9 years ago

It's not the only population that benefits from the element. Authors also benefit from it, for styling purposes.

But that's beside the point. People can make claims all they like; what matters is whether those claims are backed up. So far nobody has actually explained a problem with allowing authors to have multiple parts of the page marked with <main>. Indeed, it seems based on the descriptions of AT behaviour that ATs handle that fine. (Even if they just ignored all but the first, it's hard to see where the harm would be -- it would be just like allowing one, and then saying all the others are nothing but styling hooks.)

Let me ask a different question which maybe would help. Suppose we required that there only be one <main> per page. What do you think ATs should do with the second <main> in a page? What should ATs do with the <main>s found in seamless iframes?

karlgroves commented 9 years ago

> Authors also benefit from it, for styling purposes.

That's a useless red herring. God forbid they'd have to learn WTH they're doing and choose another element. There's no styling benefit from <main> over any other element that has a flow content model like, I dunno, <div>.

> So far nobody has actually explained a problem with allowing authors to have multiple parts of the page marked with <main>.

I'm pretty sure that's not true, but I'll entertain you anyway. <main>, as a sectioning element, is mapped in accessibility APIs to "grouping" roles - for instance, 'ROLE_SYSTEM_GROUPING' in MSAA + UIAExpress (See http://rawgit.com/w3c/html-api-map/master/index.html#el-1)

This is why assistive technologies treat <main>, <nav>, etc. as navigable landmarks. It bears mentioning that all accessibility APIs use similar mappings for <main>. In other words there is (ostensibly), consistency across platforms.

What's wrong with > 1 main? Predictability. With each additional <main> the usefulness of the element decreases. More mains === more problems. We saw this with the <section> element, in fact. When HTML5 started gaining popularity, web developers thought "Oh, I'll just replace my <div>s with <sections>!!! I HAZ HTMLFIVED MY WORKZ!!!!11" and where we used to see pages composed of <div>s nested 27 deep, they became <sections> nested 27 deep.

Ultimately, AT vendors did what AT vendors frequently have to do: they compensated for developer idiocy by no longer announcing the <sections>.

Back to the case of <main>, whether you want to admit it or not, the meaning of the word "main" is clear. While I feel comfortable calling you arrogant, I'll never call you stupid. You know what "main" means and I know you know what it means. There should be one. When there is > 1 its utility is diminished each time a new <main> is introduced. Have too many and it goes from being useless to being a confusing mess, just like <section>.

Hixie commented 9 years ago

I don't understand how the utility of <main> is diminished if there's more than one. The utility is "jump to the main content". So long as ATs jump to the first main, then the author can put in a million of them and it won't make any difference.

The advantage comes from the fact that if their page mixes "main" content and "boring" content, then they can mark up all the disjoint parts that are "main" so the user can easily jump between them. If ATs start ignoring the other mains and only allow jumping to the first one, then... nothing is lost if you only ever wanted to honour one anyway. I really don't see the problem.

BTW, relying on authors to get any of this right is a lost cause, whatever we say in the conformance rules. Conformance rules are only relevant to the few authors who care about conformance. Implementations, including ATs, have to deal with the real world regardless. Case in point, <section> was never allowed to be used in place of <div>, yet q.v. what you described.

karlgroves commented 9 years ago

> I really don't see the problem.

Of course not, which is why you've consistently been on the wrong side of accessibility since the very beginning.

OK, so you have 1 main: 1 main is easy. It should be the single area where the main content is.

What about 2? Well, other than the fact that > 1 main means neither one is actually "main", now the user doesn't know which is which. But let's give the benefit of the doubt and say there could be an extreme edge case for two main areas: Which is which? Which main represents what stuff? Well, we could use aria-label or aria-labelledby to label each one. Differentiating between two is pretty low cognitive load.

3? Well, three is just stupid because now things are getting out of hand. No rational person would think there could be 3 "main" parts of the page. But let's keep going with the labeling. No matter what you say, from 3 and up the "Principle of Least Astonishment" is irreparably broken in this interface. Now the multiple mains are working against the user. Because a screen reader user has no visually-oriented cognitive model of the page, they're faced with guessing which is actually the "main" of all the "mains".

> So long as ATs jump to the first main, then the author can put in a million of them and it won't make any difference.

Come on. You're smarter than that. Any time an AT has to guess what's going on, you're introducing a lack of reliability. For years AT vendors have had to resort to this sort of thing and it rarely works out as planned. Since we're tossing out red herrings: What if I replaced all my <div> elements with <main> elements?

> BTW, relying on authors to get any of this right is a lost cause,

Now that is hilarious. See, I remember having this exact argument with you back in 2006, when (at the time) you wanted to strip out the @scope attribute from TH, among other things (like making @alt optional).

You can't have it both ways. You can't say we must get rid of feature x because authors get it wrong, but keep feature y the way it is because they get it wrong. What is obvious in this case is that saying "multiple mains are a bad idea but you can still do it" is far less clear than "main: use one and only one, always"

Again, you have actual constituents for this feature telling you that > 1 main is bad and you refuse to admit you're wrong.

domenic commented 9 years ago

Karl, please do not resort to ad hominem attacks on this issue tracker.

Can you explain in more detail, with examples from real ATs, the harm caused by more than one main? You say it rarely works out as planned, in which case such an example should be easy to produce.

karlgroves commented 9 years ago

> Can you explain in more detail, with examples from real ATs,...

Oh, this isn't my first dance. I've played that game before. I stopped playing it years ago, back when WHATWG and W3C tried playing nice with each other.

The cycle usually goes like this:

"We want to do this thing"
"You shouldn't do that thing"
"We're gonna do that thing unless you give us some data to prove we shouldn't do the thing"
"Here's the data. See, it shows you shouldn't do the thing"
"We're doing the thing anyway because of [fallacious logic]"
"But we gave you data."
"We're doing the thing. We found data that shows we're actually right"
"Where's your data?"
"This is proprietary data, we can't/ won't/ don't need to show you."
"But we gave you open data, you can verify for yourself that our data is legit."
"We don't like your data. We're doing the thing anyway. Here, smell this red herring. But hey, if you have more data, please share it."

Here's the only data you need:

  1. The word "main" is semantically clear. I know it, you know it, and everyone else knows it.
  2. The "main" element's primary constituents are AT users.
  3. A half-dozen of those constituents have said in no-uncertain terms that they want one-and-only-one "main" element.
  4. Another half-dozen accessibility consultants - people who deal with AT users on a daily basis - say that there should be one-and-only-one "main" element.

Please, by all means, continue to argue that you know better despite your clear lack of perspective, knowledge, or experience in dealing with the primary constituents for this feature.

bkardell commented 9 years ago

Let me reiterate that I am taking no position here, but as a somewhat neutral observer I think it is worth mentioning that @Hixie asked what seems, to me anyway, as a pretty rational question:

> Let me ask a different question which maybe would help. Suppose we required that there only be one <main> per page. What do you think ATs should do with the second <main> in a page? What should ATs do with the <main>s found in seamless iframes?

and I don't see anyone offering an answer to that. We know for a fact this will happen; it happens with everything, even <html> and <head> etc. And it's not just authors: CMSes and mashups especially frequently help create some pretty horrific things. I'm asking because (I could be wrong about this) it seems to me to be at the heart of @Hixie's objection. Everything we have like this is set up for multiples: id doesn't identify uniqueness in practice (you get the first one in the document if you look it up with getElementById or querySelector), and for/describedby/etc. all have to take these same things into account. I think what Hixie is saying is that AT could do the same with <main> now if the a11y community thought that was the right thing to do, but apparently they don't. This too, though, seems a bit double-edged: if the a11y community convinced ATs that they should treat the first <main> specially, would this sway you at all, @Hixie, or would that really accomplish nothing because the AT is dealing with it, as you said? That said, ID is defined as a "unique identifier" in the WHATWG standard, even though it doesn't actually work that way for the reasons above, and I think all the a11y people here are saying is that they want the same kind of advice/description consideration here.
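
[Editor's illustration] The first-match behaviour described for duplicate ids can be sketched as follows. This is a minimal sketch, not browser code: Python's standard html.parser stands in for a DOM, and `FirstIdFinder` and `first_with_id` are hypothetical helpers invented for this example.

```python
from html.parser import HTMLParser


class FirstIdFinder(HTMLParser):
    """Remembers the first start tag carrying the wanted id and counts later duplicates."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.match = None      # (tag, attributes) of the first element found
        self.duplicates = 0    # how many later elements reuse the same id

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if attr_map.get("id") != self.wanted_id:
            return
        if self.match is None:
            self.match = (tag, attr_map)
        else:
            self.duplicates += 1


def first_with_id(markup, wanted_id):
    """Hypothetical helper: return the first element with this id, plus the duplicate count."""
    finder = FirstIdFinder(wanted_id)
    finder.feed(markup)
    finder.close()
    return finder.match, finder.duplicates


# Two elements share one id; a lookup that takes the first in document order still succeeds.
page = '<div id="content" class="a">one</div><p id="content" class="b">two</p>'
match, duplicates = first_with_id(page, "content")
```

The lookup never fails on the duplicate; it simply ignores it, which is the "set up for multiples" point made above.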

karlgroves commented 9 years ago

I don't think anyone's advocating for breaking the Web. One of the best features of the Web is that user agents are particularly adept at dealing with horrible markup.

The request is, quite simply, to become aligned with the W3C specification of <main> http://www.w3.org/TR/html51/semantics.html#the-main-element

jameswillweb commented 9 years ago

It seems to me that it would be pretty simple to write error-handling for those edge cases without breaking anything. In the case of nested <main> elements you could give primacy to the one closest to the root; in the case of sequential ones you could recognize the first element as the "main" element and treat all other instances as generic content groupings. Karl is correct: it's an issue of semantics, not of whether authors may implement it incorrectly. "Main" really has only one possible semantic meaning; if we ignore that because of the possibility that multiples could be used in error, then what's the point of adding semantics to specifications at all?
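
[Editor's illustration] One possible reading of this error-handling proposal is "the first <main> in document order wins". That reading is an assumption: it covers both cases named above (an outer <main> always opens before the one nested inside it, and the first of several siblings comes first), but the comment does not spell out mixed cases. `MainCollector` and `classify_mains` are hypothetical names, and Python's html.parser is only a stand-in for a real tree builder.

```python
from html.parser import HTMLParser


class MainCollector(HTMLParser):
    """Collects the id of every <main> start tag in document order."""

    def __init__(self):
        super().__init__()
        self.main_ids = []

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.main_ids.append(dict(attrs).get("id"))


def classify_mains(markup):
    """Treat the first <main> as the main landmark and every later one as a generic grouping."""
    collector = MainCollector()
    collector.feed(markup)
    collector.close()
    return [
        (main_id, "main" if index == 0 else "generic")
        for index, main_id in enumerate(collector.main_ids)
    ]


nested = classify_mains('<main id="outer"><main id="inner"></main></main>')
sequential = classify_mains('<main id="a"></main><div></div><main id="b"></main>')
```

Under this sketch no markup is rejected; extra <main> elements are only demoted.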

Hixie commented 9 years ago

Consider a page like the following:

[boilerplate text]
main content
[advertisement]
main content
[sidebar]
main content
[footer]

What would you wrap in the <main> element?

For an example of such a page, just go to cnn.com, and follow the link to the main story. It's a blob of content, the main article, with inline sidebars all the way along.

> The word "main" is semantically clear. I know it, you know it, and everyone else knows it.

Right. It means the main content. But the main content can be dotted around the page. There's no one place on the page that has the main content.

> The "main" element's primary constituents are AT users.

This just isn't true. The <main> element's definition doesn't have anything to do with ATs or AT users. It's like saying that <aside>'s main constituents are AT users. I mean, sure, it's a semantic element and semantic elements are eminently useful for being interpreted by non-mainstream renderers in a useful way, but that's the whole point of having semantic markup in general. It doesn't mean that HTML's main constituents are AT users.

> A half-dozen of those constituents have said in no-uncertain terms that they want one-and-only-one "main" element. Another half-dozen accessibility consultants - people who deal with AT users on a daily basis - say that there should be one-and-only-one "main" element.

The WHATWG does not base decisions on volume of agreement, nor does it base decisions on what any particular group of people think. Decisions are based on rational arguments, of which there have been remarkably few in this thread.

> Well other than the fact that > 1 main means neither one is actually "main" now the user doesn't know which is which.

The user knows exactly which is which. The first one is the first one and the second is the second. I really don't understand what you're trying to say here.

> Any time an AT has to guess what's going on, you're introducing a lack of reliability.

There's no guesswork here. There's just a list of places in the document that the author has marked as being important. This is exactly how ATs work today. There's no magic, no guesswork, no unreliability involved.

> What if I replaced all my <div> elements with <main> elements?

Then navigating by <main> landmark would be no more useful than just reading the entire page. It's trivial for authors to make UA features useless. For example, what if the author sets the text colour to the same as the background colour? I really don't see what relevance this has to the argument.

Conformance rules, which as far as I understand is all we're talking about here, only matter to authors who are already a priori trying to do the right thing. So authors who would do such inane things aren't relevant to the discussion.

> Again, you have actual constituents for this feature telling you that > 1 main is bad and you refuse to admit you're wrong.

Again, making assertions with no supporting evidence or reasoning is of no consequence here. If I were to say "all Web pages must have exactly three <main> elements", without explaining why, then my argument would be equally useless and thus we wouldn't update the spec.

The arguments that led to the spec being the way it is are:

The arguments against, as far as I can tell, are:

Are some of the arguments I've presented wrong? Did I miss some arguments against? Please only add comments to this thread if either one of the arguments above is wrong, or if I missed some argument. Please don't post just to repeat one of the arguments (for or against) that I've already presented in this comment.

LJWatson commented 9 years ago

@Domenic

"Can you explain in more detail, with examples from real ATs, the harm caused by more than one main?"

AT handling isn't the important thing. It's the usability that's important.

ATs can handle multiple <main> elements. They can handle single <main> elements just as easily. If no additional <main> elements are present, the screen reader will announce something like "No further main regions".

The usability of multiple <main> elements is a different matter. It's what I tried to explain before… the concept of "main" is a singular thing to most people. Encountering multiple main regions on a page breaks this expectation, but where a sighted person may be able to differentiate them visually, there are no such cues available to screen reader users.

LJWatson commented 9 years ago

@Hixie

"But that's beside the point. People can make claims all they like; what matters is whether those claims are backed up. So far nobody has actually explained a problem with allowing authors to have multiple parts of the page marked with <main>. Indeed, it seems based on the descriptions of AT behaviour that ATs handle that fine. (Even if they just ignored all but the first, it's hard to see where the harm would be -- it would be just like allowing one, and then saying all the others are nothing but styling hooks.)"

The harm is that content consumers for whom semantics are important expect "main" to be a singular thing. ATs can handle multiple <main> elements, but they can't differentiate *the* main region from all the other main regions on the page. So a screen reader user has no mechanism to make the differentiation either.

"Let me ask a different question which maybe would help. Suppose we required that there only be one <main> per page. What do you think ATs should do with the second <main> in a page?"

They will do what they do when they encounter a single instance of any other accessibility supported element: they will indicate the presence of the first instance, and if the page is queried again they will inform the user that no further instances are available.

"What should ATs do with the <main>s found in seamless iframes?"

The same thing they would do if they encounter multiple <main> elements in the parent document. Again, this is not about the AT's capacity to handle multiple main regions, but the semantic usability of doing so.
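
[Editor's illustration] The navigation behaviour described in the two comments above can be sketched as a small model. This is not how any particular screen reader is implemented: `LandmarkNavigator`, its method name, and the exact announcement strings are assumptions made for illustration, loosely following the "No further main regions" wording quoted above.

```python
class LandmarkNavigator:
    """Steps through landmarks of one role in document order, like a 'next region' command."""

    def __init__(self, landmarks):
        # Each landmark is a (role, accessible_name) pair; accessible_name may be None.
        self.landmarks = list(landmarks)
        self.position = -1  # index of the landmark the user last moved to

    def next_of_role(self, role):
        """Move to the next landmark with this role and return what would be announced."""
        for index in range(self.position + 1, len(self.landmarks)):
            found_role, name = self.landmarks[index]
            if found_role == role:
                self.position = index
                return f"{name} {role} region" if name else f"{role} region"
        return f"No further {role} regions"


# One <main>: the second query reports that there is nothing more to find.
single = LandmarkNavigator([("banner", None), ("main", None), ("contentinfo", None)])
single_announcements = [single.next_of_role("main") for _ in range(2)]

# Two unlabelled <main> elements: both are reachable, but they are announced identically.
multiple = LandmarkNavigator([("main", None), ("complementary", None), ("main", None)])
multiple_announcements = [multiple.next_of_role("main") for _ in range(3)]
```

In the second case the model handles both regions without error, yet the two announcements are the same string, which mirrors the distinction drawn above between AT handling and usability.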

Heydon commented 9 years ago

@Hixie @domenic

> I mean, sure, it's a semantic element and semantic elements are eminently useful for being interpreted by non-mainstream renderers in a useful way, but that's the whole point of having semantic markup in general. It doesn't mean that HTML's main constituents are AT users.

> It's not the only population that benefits from the element. Authors also benefit from it, for styling purposes.

The benefit of <main>, and the purpose of any semantic HTML, is that it can be interpreted interoperably. I think we agree on that. Since differentiating elements merely for styling using, for instance, the class attribute is utterly trivial, can we not also agree that the importance of the existence of <main> as a styling hook is rather missing the point?

The element was conceived to fill a semantic hole, in order to assist a class of users who depend on interoperability to achieve actual tasks. Anything else is a side benefit or irrelevant frippery.

stevefaulkner commented 9 years ago

Preface: I am no longer posting to try to convince the WHATWG editors to change their definition of main. The fact that we have two disparate definitions is annoying, but not overly problematic in practice, as the WHATWG definition is largely ignored by authors and therefore does not negatively affect users; neither is it implemented in the authoritative HTML conformance checking tool. The following is merely an attempt to provide information that may be helpful in understanding why main was defined as it is.

It has never been claimed by anyone anywhere that multiple mains break or cannot be handled by AT, so I am glad we have cleared that up. The definition and restrictions on usage of main have always been about encouraging authors to use main as it was intended to be used in its original and current definition, the aim being to maximise the benefits for users (actual users, not authors). However 'sketchy', the ONLY publicly available data indicates that the W3C definition of main fits with authors' understanding as well. There has been no evidence or opinion offered to indicate that multiple mains provide a more usable or understandable UI; all opinions offered here from users indicate the opposite.

Landmarks were designed to provide users with a method to quickly navigate macro regions of a document.

AT implementations generally provide access to landmarks via a single keystroke, for example R in JAWS. By pressing the R key, users move from one landmark to the next, and the landmark name is announced (along with any accessible name the element has). When users land on the content area they are interested in, they can then drill down into that content via headings, links, form controls or the host of other micro navigation methods available to them. In most AT, landmarks are also presented via a dialog:

[Image: example of a JAWS landmark dialog displaying banner, main, complementary and content info]

Due to the common implementation of landmarks (a single keystroke for all landmark types), it is not a big conceptual leap to understand that the greater the number of landmarks, the lower their usability for navigating the page, as a page cannot be navigated from top to bottom in a few keystrokes when there are many landmarks. This is reflected in anecdotal reports by users. It has also informed the implementation of landmark semantics when mapped to HTML elements. This is why header/footer only map to landmarks when scoped to the body element (sketchy data indicates that in the majority of documents a single header/footer is used, but some documents contain large numbers of these elements). It is why section, which was originally mapped to role=region, is only announced by AT that support it when it has an accessible name (as a result of users complaining about its overuse and the consequent semantic noise it created). And it is the reason why main was defined the way it is, in an attempt to maximise utility to users and minimise author use that undermines utility. Further changes to the mappings may well occur in light of the design of landmarks. For example, <aside> currently maps to complementary; the usage patterns of the element indicate that some muting of the semantics may be necessary to improve usability.
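
[Editor's illustration] The mapping rules summarised above can be written down as a small function. This is a simplified sketch, not the normative accessibility API mapping: `landmark_role` is a hypothetical helper, and the set of ancestors that stop a header/footer from being "scoped to the body" is a simplifying assumption.

```python
# Assumption: a header/footer inside any of these is no longer scoped to the body.
SCOPING_ANCESTORS = {"article", "aside", "main", "nav", "section"}

# Elements that map to a landmark role regardless of context in this sketch.
DIRECT_ROLES = {"main": "main", "nav": "navigation", "aside": "complementary"}


def landmark_role(tag, ancestors=(), accessible_name=None):
    """Return the landmark role exposed for an element, or None if it is not a landmark."""
    if tag in DIRECT_ROLES:
        return DIRECT_ROLES[tag]
    if tag in ("header", "footer"):
        # Only the page-level header/footer becomes a landmark.
        if any(ancestor in SCOPING_ANCESTORS for ancestor in ancestors):
            return None
        return "banner" if tag == "header" else "contentinfo"
    if tag == "section":
        # A section is only exposed as a region when it has an accessible name.
        return "region" if accessible_name else None
    return None


page_header = landmark_role("header")
article_header = landmark_role("header", ancestors=["article"])
named_section = landmark_role("section", accessible_name="Latest news")
unnamed_section = landmark_role("section")
```

Each rule reduces the number of landmarks a user has to step through, which is the design pressure the comment describes.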

joe-watkins commented 9 years ago

@Hixie If you have an interface that calls for multiple <main> tags, you own a content problem. This is an automatic red flag that further distillation of content priority should occur, much as it would be if your document contained 76 <h1> tags.

Thankfully AT and browsers swallow the poor UX semi-gracefully.

@Hixie When you ask us to visualize a site that needs multiple <main> tags, all I can picture is non-performant, ad-riddled websites that are fast becoming victims of content blocking and being punished by search engines for poor practice. Do you maintain one of these?

Inconsistent documentation as a result of Rebel Alliance vs. Galactic Empire drama is harmful for developers who lean on these docs for how to write great code, and it exacerbates our already inaccessible web.

As a developer, I’ve always considered WHATWG’s documentation a bit out of touch and really just a step above w3schools. Should we trust a source who’s using deprecated tags themselves? http://www.screencast.com/t/Z1L8LZArH | https://validator.w3.org/nu/?doc=https%3A%2F%2Fwhatwg.org%2F

I thought the <hgroup> tag was deprecated? https://html.spec.whatwg.org/multipage/semantics.html#the-hgroup-element

It’s time you get your docs in order and while you are at it I’m +1 for aligning with the W3C on the <main> tag.

jpv66 commented 9 years ago

I have worked in accessibility for twenty years; I work with dozens of users and teach thousands of developers. I totally agree with the views expressed by Steve, Leonie and all those who campaigned for a unique <main>. I would add just one thing. "One more thing": beyond these incessant debates and the decisions you make, whatever they are, there are people like me whose role is to turn specifications into content that is truly accessible for users, and my duty is to provide them the best. More than your bickering over a formal or theoretical definition, and philosophical debates over semantics, interesting as they are, we need a unique <main>, simply because it's better for users. A simple feature, natural, clear and easy to use: wow, what a dream! I don't even need to analyze data, I don't need to know how developers perceive or use this element, and I especially don't need to know whether this represents some sort of theoretical vision. A unique <main> was proposed to solve an important usability problem. Problem solved. By choosing an intransigent position based on theory, you are fighting the wrong fight. The only thing you achieve is to make our task even more complicated, and, beyond us, harder for the users for whom we try to make things a little easier. Yes, this is not a theoretical, scientific or statistical argument :).

Basically it's not so serious: main will be unique because users need it.

aardrian commented 9 years ago

Assuming you don't want to read my point-by-point response, you can just skip to the end of my comment for a question about the next step.

Now that you've laid out arguments why the spec is what it is, it's easier to address directly:

  • UAs already support multiple <main> elements in a usable and useful fashion, same as <div>.

You are making an assertion about "usable" and "useful" with no evidence to back it up.

UAs historically support lots of things in a usable and useful fashion -- for developers. Which is why we still find <font> and <center> in the wild. UAs graciously accepting bad code is not the same as implementing a spec properly else we'd be back to XHTML2. It is not for you to assert whether it is usable for a user.

  • ATs already support multiple <main> elements in a usable and useful fashion, according to AT users in this very thread who described the actual behaviour: it is treated as a normal landmark role, in document order. This is a mechanism with which AT users are presumably comfortable since it is used for the other landmark roles and general document navigation.

Two comments above dispute your assertion, suggesting only knowledge of the code on the page itself can mitigate confusion:

According to @bramd:

"The blog archive example that has been named earlier in this thread would actually cause me to think I landed on a single article page after jumping to the first main element and landing at the start of a blog post."

And from @BimEgan:

"So multiple main elements could mean that screen reader users can't be really sure where they are on the page and may miss the most recent article (at the beginning of main content) without knowing that they've done so."

  • There are many scenarios (e.g. an article with inline sidebars and advertising, a blog post with multiple articles, etc) where there are multiple disjoint parts of the page that consist of its primary content.

Except elements like <aside>, <section> and <article> already do that in a semantically better and more usable way.

In the world of site building, when the stakeholder says the most important thing should be front-and-center on the page, and then tells us that 10 items are equally the most important, we get carousels. Let's not allow ourselves to codify its analogue.

  • Being able to style the primary content of a page or <article> is (mildly) useful. Since it can be disjoint, this requires multiple <main> elements. (This use case isn't enough to justify introducing an element, but since it's already implemented it's enough to justify specifying it as conforming.)

Your example also does not require multiple <main> elements for styling. You are just trying to re-state your third point; it is not a distinct argument.

  • Being able to jump to primary content on a page or <article> is useful. This can be achieved in a variety of ways, e.g. by skipping headers and other content marked as non-primary using the semantic elements in HTML, but it can also be achieved by having an element (such as <main>, which happens to already be implemented for this purpose in ATs) that highlights the primary content. Since the primary content can be disjoint, this latter approach requires being able to use multiple <main> elements.

This is a restatement of your points #2 and #3, which means it should be dismissed out of hand. Except here you are trotting out the Scooby-Doo algorithm, which has already been dismissed (hence <main>).

  • According to the ARIA spec and RFC2119, there may exist valid reasons in particular circumstances when it is acceptable or even useful for ARIA's "main" landmark role, on which <main> relies for its AT mapping, to be included in a document multiple times.

For those reading at home, RFC 2119 (https://www.ietf.org/rfc/rfc2119.txt) defines the keywords MUST, REQUIRED, SHOULD, etc. ARIA 1.0 and 1.1 say there SHOULD be no more than one main landmark role per page (http://www.w3.org/TR/wai-aria/roles#main, http://www.w3.org/TR/wai-aria-1.1/#main). Except we're talking about an element that manifests that role. As such, its implementation can go beyond what ARIA defines, enabling authors to create a better experience by following just the HTML spec without needing to call the ARIA landmark roles.

  • Since documents can be nested, ATs and UAs are in any case required to support the situation where multiple <main> elements can be nested.

And yet, those would each apply to one document, making it even more important that <main> is restricted to only once per document, especially in consideration of my response to point #2 above.

Your arguments against:

  • It's confusing to AT users. No particular backing for this claim has been presented, however. It's just been stated as fact. This seems to be contradicted by the descriptions of actual AT behaviour.

See the two examples I cited above. Actual AT behavior does not contradict those statements at all, if anything it shows a deficiency in how AT handles multiples.

  • It's contrary to the meaning of "main". This is irrelevant, we have plenty of elements whose semantics bear little relationship to their name (e.g. <address>, <cite>, <b>). It's also debatable, as described above -- primary, or "main", content on a page can be disjoint.

I think you need to consider that content authors have a meaning of the word in their head that doesn't match yours, and since this is a new element we have the opportunity to not lie about its name for once.

  • Certain classes of people don't want this behaviour. This argument is an appeal to consensus, which is contrary to the documented WHATWG values, and therefore irrelevant.

The opposite of that is appeal to authority, which is where I think nearly everyone feels we stand now based on the arguments. I would hope that isn't in the documented WHATWG values (for which I cannot find a link).

  • Not many pages use multiple <main> elements. This is based on very sketchy data (less than 300 pages that use the element at all) and generally speaking isn't a reason to disallow a usage.

Since you have already stated that you will discount any data that is not "in the billions," and since you know a data-set of this size is beyond the means of anybody other than someone on the scale of Google to gather, you are essentially tying anyone's hands on bringing data to bear.

  • "What if people misuse <main> in a harmful fashion?" This applies to the element regardless of the conformance rules and therefore isn't a relevant argument.

I tend to agree, but being more explicit in its documentation can help mitigate this.

  • "You (me, @Hixie) are personally against helping people with special accessibility needs." This is a patently absurd ad hominem contradicted by years of work, and, more to the point, is not an argument for or against anything to do with the spec.

I agree it's not relevant to the discussion (nor do I believe you are personally against helping people with special accessibility needs).

  • "There should be only one element". This is an argument by assertion and therefore doesn't actually help make a decision. It might be true or it might be false, there's no way, from just that argument, to know.

The assertion is supported by the original intent, feedback from professionals, current use, validation rules, the W3C specification, educational materials, and anecdotal feedback. If there was no assertion, there'd be nothing to test.

Put it all together, and I believe this is where we stand:

Given that, assuming you disagree with everything I've argued above, have we reached an impasse or is there some other scenario that you would find sufficiently compelling to re-examine the element?

foolip commented 9 years ago

There is one thing that I would like clarified. In those ATs that have a shortcut for jumping to <main> (or the landmark it maps to), can a user tell after using the shortcut whether there are more main sections? It seems to me that if there's almost always a single main section, then that would be an incentive for ATs to not bother announcing "no more main sections" or similar, and users would also not have much reason to try the shortcut again just to make sure, if there very seldom is another section.

aardrian commented 9 years ago

@foolip, no. ATs supporting region navigation do not communicate to users whether there is more than one of a particular region. As such, there is no incentive for users to repeatedly activate the landmark navigation shortcut.

Hixie commented 9 years ago

@LjWatson @jpv66 Your most recent comment does not include any information that was not already stated on this bug. If you wish to contribute to WHATWG discussions, please be considerate of our culture, which specifically discourages repetition of already-stated points and statements of opinion without solid arguments or data.

@Heydon Styling is a big part of how people create documents on the Web. It's not beside the point, it's one of the main points. If it wasn't for styling, I think we would see even less usage of <main> than we do now. In general, the way to get accessible documents is to design features where using the feature for its non-accessible purpose in the most obvious way is coincidentally also the way to use it to improve accessibility. Features that are exclusively for accessibility tend to fail to achieve their purpose at scale because, unfortunately, most authors never test for accessibility and untested code is almost always wrong. This is why, for instance, longdesc="" and summary="" failed on the Web; it's also why alt="" has had a poor showing but a slightly better one (it has a secondary purpose for people who have images disabled -- I wouldn't be surprised if this got worse over time since images being disabled is no longer a thing, especially on mobile).

The element was added to the spec because it was implemented. It wasn't added to serve any specific purpose, the purpose was retrofitted on the implementations. (There was discussion of purpose prior to the implementations, but, at least at the WHATWG, those discussions concluded with the decision that the arguments were not sufficiently convincing to be worth adding the element, so they are not relevant.)

@stevefaulkner If you're not posting to achieve a change to the spec, please do not post at all. The only point of these bug trackers is to discuss changing the spec.

@joe-watkins We have to design for the Web we have, not the Web we wish we had. The reality of the Web is that pages have inline advertisements, asides, and so forth. Even the spec itself (which has far more than 76 <h1>s) has asides scattered all along it. Now I wouldn't think that <main> would make any sense for the spec, but if one were to use it, I don't see how one could use only one.

@aardrian That non-AT UAs support <main> usably is a trivial statement; they support it the same way as any element, namely, it's styleable. If that's not usable or useful then we have much bigger problems. By usable here I mean for authors. Users of non-AT UAs can't tell what element was used.

For AT UAs:

"The blog archive example that has been named earlier in this thread would actually cause me to think I landed on a single article page after jumping to the first main element and landing at the start of a blog post."

I've always felt that's a risk (even without <main>). How would you avoid it with a single <main>?

"So multiple main elements could mean that screen reader users can't be really sure where they are on the page and may miss the most recent article (at the beginning of main content) without knowing that they've done so."

I don't really see how a single <main> helps with this either. It just means there's a single spot on the page they can jump to (which they could do by jumping to the top of the page then jumping to the first <main>, in the multi-main case). How do you accidentally jump past the first one? Does the same problem exist with <h1>? Should we limit pages to only one heading for the same reason?

You say that "elements like <aside>, <section> and <article> already do that in a semantically better and more usable way", which is what I've been saying for a long time about <main> in general. This is an argument for removing <main> entirely, not for limiting it to only one instance per page.

You say we shouldn't codify analogues to what authors are doing (carousels, etc), but why not? That's exactly what we should be codifying! If we don't codify what people are doing, then we'll get a language that isn't relevant for the real world.

You are just trying to re-state your third point; it is not a distinct argument.

Sorry if it wasn't clear. My points were not peers, they were building a case. For example, the fourth bullet relies on the third bullet; without the third bullet, the fourth bullet would be without basis. The third bullet isn't an argument for multiple-<main>, it's just presenting a basis for further arguments.

And yet, those would each apply to one document, making it even more important that <main> is restricted to only once per, especially in consideration of my response to point # 2 above.

I don't understand what you're trying to say here. With seamless documents, there's not supposed to be any AT-exposed seam between nested documents.

I think you need to consider that content authors have a meaning of the word in their head that doesn't match yours, and since this is a new element we have the opportunity to not lie about its name for once.

There are many authors. They don't all agree on this point (for example, I'm an author, and I don't agree).

The opposite of [appeal to consensus] is appeal to authority

No, that's a false dilemma. The WHATWG process is documented on our FAQ, I urge you to read it.

Since you have already stated that you will discount any data that is not "in the billions," and since you know a data-set of this size is beyond the means of anybody other than someone on the scale of Google to gather, you are essentially tying anyone's hands on bringing data to bear.

One has to be realistic about what a useful data set is. That getting a useful data set is something that requires more than just a laptop and a few hours of Web crawling is unfortunate, but it's the truth.

FWIW, I've done research on other people's behalf before. If you present me with a data mining exercise that I believe has the possibility to bring new and actionable data to the table, then I would be happy to spend Google's resources obtaining that data. I'm sure people at Microsoft, Amazon, Yahoo, and other companies that have similar resources would be willing to help as well. The problem is that this requires describing a rigorous and well-defined study, including how to interpret the results. It's not at all clear to me what data one would collect here that would reasonably be used as evidence one way or the other.

The assertion is supported by the original intent

The original decision, at the WHATWG, was that the element was not needed; once it was implemented, the decision was just to retrofit the most useful purpose on the element.

feedback from professionals

On its own, this is an appeal to authority.

current use

It's not clear that this is actually backing the one-main argument, as discussed earlier.

validation rules

Not sure how this is an argument either way.

the W3C specification

I think you'll find that appeal to the W3C is not an effective tool here. Given how often the W3C has lied to us, betrayed our agreements, violated the spirit of cooperation, copied our work against our will, made changes to the spec that are laughably incompetently executed, and generally acted as a highly offensive and incompetent entity, I, at least, have to actively work to avoid having a bias that comes from an emotional reaction to the W3C.

educational materials

Not sure what you mean here. Can you elaborate? Would writing tutorials that encourage multiple-main be an argument in favour of multiple-main?

and anecdotal feedback.

I believe all the anecdotal feedback has been discussed above. Is there more?

You find consensus from the original spec author, users, and professionals on this bug report insufficient.

Consensus in general is not taken into account in decisions that affect the spec, right.

You will not find any data-set demonstrating single instances of <main> to be sufficiently massive.

On the contrary, I think it's quite possible for us to have sufficiently large datasets. I admit though that I can't really think of a study that would help determine the answer here. (In either direction. Suppose we found that all pages on the Web had two <main> elements. Would that tell us that we should make <main> allowed multiple times? Or would that tell us that things are really dire and we really need a conformance rule against it?)

You do not accept anecdotal negative experiences from real users (even as a possible trend that can be averted).

I haven't been convinced by what I've heard so far (and certainly I don't think there's a trend -- the feedback has been more mixed than you imply). That does not mean that I couldn't be convinced otherwise. In the past we've had people demonstrate this kind of thing with videos showing their experience browsing the Web that have been quite helpful in coming to a decision, for example. I could definitely imagine someone showing their experience on a common (non-artificial) Web page, such as a page on CNN, modified several times to show a variety of ways that <main> could be used, and demonstrating why one or another model is obviously more usable than another. (I should hasten to add that in the past, such videos have sometimes convinced me of things that are quite different than what the person showing the video thought they would convince me of.)

You believe that the primary content of a page is not singular or can span multiple regions.

This isn't a belief taken on faith. It's demonstrably true. Many news providers, for example, intermix "related content" with their articles. Blogs have multiple blog posts mixed with advertising. Even the HTML spec has "in flow" content mixed with "aside" content like examples and notes.

You believe the ARIA role implicit in the element overrides the element's intent.

I think the ARIA role's definition is an interesting data-point. I don't think it's compelling enough one way or the other.

From an accessibility perspective (only), <main> and <div role=main> seem to be equivalent. As such, since we allow multiple <div role=main>s, why not allow multiple <main>s? I don't see a good answer to that, which is why this, to me, is a (weak) argument in favour of multiple-<main>. But this is mostly an appeal to authority, I'll grant you.
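For those reading at home, the equivalence described above can be illustrated with a minimal sketch that counts both spellings of the landmark in a piece of markup. It uses only the Python standard library. The class and function names are hypothetical, and the role handling is an assumption made for brevity: it treats the role attribute as a single token and does not model how any AT or accessibility API actually computes roles.

```python
from html.parser import HTMLParser


class MainLandmarkCounter(HTMLParser):
    """Counts markup that would be expected to map to the "main" landmark."""

    def __init__(self):
        super().__init__()
        self.main_elements = 0  # <main> start tags
        self.role_main = 0      # other elements carrying role="main"

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; a valueless
        # attribute has the value None, hence the "or ''" fallback.
        role = (dict(attrs).get("role") or "").strip().lower()
        if tag == "main":
            self.main_elements += 1
        elif role == "main":
            self.role_main += 1


def count_main_landmarks(markup):
    """Return (number of <main> elements, number of role="main" elements)."""
    counter = MainLandmarkCounter()
    counter.feed(markup)
    counter.close()
    return counter.main_elements, counter.role_main


if __name__ == "__main__":
    sample = "<main>a</main><div role=main>b</div><div role=main>c</div>"
    print(count_main_landmarks(sample))
```

Under this simplification, a page with one <main> and two <div role=main> elements exposes three candidate "main" landmarks, whichever spelling the author used.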

Given that, assuming you disagree with everything I've argued above, have we reached an impasse or is there some other scenario that you would find sufficiently compelling to re-examine the element?

I hope I've described some above. I could always be convinced of something. Even if we changed the spec to only allow one <main>, I could always be convinced to go back to multiple-main one day. As more data is added to the discussion, the weight of evidence one would need to change the decision grows, since you have to outweigh all the arguments in the other direction first. Currently, I see pretty much no reason to disallow multiple-main, and lots of weak reasons to allow it. Some strong reasons to disallow it would be convincing. A large number of weak reasons would be convincing.

@jpv66 Assuming you really mean that you've been working in accessibility for 80 years, I would like to welcome you to our community. I'm sure that makes you the oldest and most experienced contributor we have.

metzessible commented 9 years ago

I usually try to avoid getting involved in arguments between WHATWG and W3C supporters, because there's typically too much baggage to sort through before getting to the meat of the disagreement. Indeed, even in an argument about whether or not there is value in supporting multiple <main> elements in a page, there seem to be existing issues dredged up by those in support of, and those in opposition to, the present decision.

Therefore, I'm not going to speak about:

Instead, I'm going to suggest that, the way the specification is written, the second note for the main element actually contradicts the normative statements previously mentioned. By comparison, the W3C documentation does have a more detailed explanation of the element itself, which I do believe supports the argument a bit more concretely that there should only be one instance of this element within the markup. The lack of explanation is what's causing some confusion surrounding how to handle multiple <main> elements within WHATWG's documentation.

Despite this lack of detail, the <main> element is described as "a container for the dominant contents of another element." Explicitly stated, it is an element that, once chosen to be implemented, provides intrinsic meaning to its children. However, this element is distinct from sectioning content in that it provides no contribution to an outline. This is important because the <section> element is explicitly appropriate when generating an outline for "thematic grouping of content." Web authors wishing to style content are encouraged to use a <div> instead.

Therefore, in considering the usage of multiple <main>s on a page, one would need to assume the purpose of the element is either to style (incorrect usage), or to provide instances for other sections (incorrect markup). The specification written for <main> is not intended to provide styling or for sectioning, yet it could be construed as such given the second note's lack of clarification. Instead, <main> has the semantic purpose of providing the relevant content on a page. For purposes of single-page websites, templates, et al, there are already available elements to provide context for those separate sections where their purpose would be semantically appropriate because they provide an outline for which they are best served.

To parse @Hixie's visual representation of streets with my explanation:

Group of houses labeled with text describing their respective street as "W3C Street <main> Entrance"

This is semantically appropriate because it details the location of grouping elements (houses) that exist within the page. In a real-world example, the <main>-street element is the location of where the mail gets sent to, but it requires a semantically defined outline of which group actually receives mail. In this case, the following markup for (my new web-based Utopia) Standardistan would look like the following:

<main>
  <article>
    <house>
      <section>
        <h1> House Number 1 </h1>
        <aside>
          <div> window 1 </div>
          <div> window 2 </div>
        </aside>
      </section>
    </house>
  </article>
  <article>
    <house>
      <section>
        <h1> House Number 2 </h1>
        <aside>
          <div> window 1 </div>
          <div> window 2 </div>
        </aside>
      </section>
    </house>
  </article>
  <article>
    <house>
      <section>
        <h1> House Number 3 </h1>
        <aside>
          <div> window 1 </div>
          <div> window 2 </div>
        </aside>
      </section>
    </house>
  </article>
</main>

In this regard, your second image is incorrect, because the purpose of the document is not to differentiate between the separate dominant content existing on <main> street, but rather to explain where exactly mail gets delivered. In other words, the image could be described as, "This is the dominant location of Standardistan. Within its borders there exists a group of 3 houses. Each house has 2 windows and a door, of which there is an entrance. The entrance to the content of each house is represented by a Heading, easily conceptualized based on its ranking within the dominant location." The doors might be considered to be main entrances (as opposed to their back patios perhaps), but accessing these doors is based upon how they have registered in the DOM (e.g., the map to our town) according to the outline that has been presented.

My explanation for why "a page with multiple article elements might need to indicate the dominant contents of each such element" (as written in the second note) is semantically inappropriate is that any element that needs to indicate another purpose should be able to be programmatically determined. So far, outlines seem to be the best way to achieve intrinsic meaning within HTML5, so providing context to those elements should trump the various arguments for using a different way to observe dominant content.

If I were to say, "Look at the second house," the method of performing this action wouldn't be to find the main entrance of a given home, but rather to observe where it resides in the overall scheme of a specific location and define it by its logical association with other elements within that location. Given this example, the concept of outlines is how to distinguish the target home, while the concept of <main> provides where to begin that observation in the first place.

I'm not arguing that you should use only one <main> on a page because it's better for the end user, or that there is sufficient data to back up how one perceives relevant information on a page; but rather because the supporting specifications from both WHATWG and W3C explain that there is already a preferred methodology for presenting information and relationships within properly semantic markup. The <main> element is used to describe the overall dominant information, while programmatically determining similarly relevant, yet separate, information is accomplished through a separate preferred technique consisting of sectioning and headings. Using <main> in such a manner is incorrect according to the supporting normative specifications of both WHATWG and W3C.

mpnkhan commented 9 years ago

As developers, we were using <div id="main"> as a container wrapping all the elements of the page. After <main> was introduced, we conveniently replaced this main div with <main>. We never had the need to include multiple <div id="main"> elements. When I came to know that it additionally helps screen reader users, the choice was simple: just go for a single <main>. Single <main> = less confusion.

foolip commented 9 years ago

@aardrian, is that true even of landmarks where there's usually more than one, like headings or menu items?

aardrian commented 9 years ago

@foolip, yes.

@Hixie, this is my best effort to answer your direct questions and address some of your points. Apologies for the length but there is a lot to unpack.

I've always felt that's a risk (even without <main>). How would you avoid it with a single <main>?

Sadly, you cannot. Just as you cannot prevent authors from using multiple IDs of the same value, etc. Though validation errors can at least raise the issue.

But if you agree that's a risk, then let's clean up the language to reduce that risk.
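To make the comparison with duplicate IDs concrete, here is a minimal sketch of a lint pass that reports both conditions. It is a toy written for this thread, not the Nu HTML Checker or any real validator; the names lint and _Collector and the wording of the warnings are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class _Collector(HTMLParser):
    """Gathers id values and counts <main> start tags."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()
        self.main_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.main_count += 1
        for name, value in attrs:
            # Skip empty or valueless id attributes.
            if name == "id" and value:
                self.ids[value] += 1


def lint(markup):
    """Return a list of human-readable warnings for the given markup."""
    collector = _Collector()
    collector.feed(markup)
    collector.close()
    warnings = [
        f'duplicate id "{value}" ({count} uses)'
        for value, count in sorted(collector.ids.items())
        if count > 1
    ]
    if collector.main_count > 1:
        warnings.append(f"{collector.main_count} <main> elements found")
    return warnings


if __name__ == "__main__":
    for warning in lint('<main id="a"></main><main id="a"></main>'):
        print(warning)
```

Neither check stops an author from publishing the page; as with real validation errors, all it can do is raise the issue.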

I don't really see how a single <main> helps with this either. [...] How do you accidentally jump past the first one? Does the same problem exist with <h1>? Should we limit pages to only one heading for the same reason?

To answer each of the three questions: The issue isn't accidentally jumping past the first one, but missing subsequent <main>s because a user may jump to another landmark. The same problem does not exist with <h1> because it is not a landmark. There is no same reason, so the question doesn't even apply.

You say that "elements like <aside>, <section> and <article> already do that in a semantically better and more usable way", which is what I've been saying for a long time about <main> in general. This is an argument for removing <main> entirely, not for limiting it to only one instance per page.

Except those elements do not provide the same landmark navigation, its primary use case for inclusion and what makes it stand apart from the others.

You say we shouldn't codify analogues to what authors are doing (carousels, etc), but why not?

Because we have analysis that tells us that pattern is counter-intuitive and lowers engagement, confuses users, and generally doesn't work well.

There are many authors. They don't all agree on this point (for example, I'm an author, and I don't agree).

My point is that you have to understand that the very word is understood differently by people who are not you.

No, that's a false dilemma. The WHATWG process is documented on our FAQ, I urge you to read it.

I did read it. I saw no "values." I also disagree with your logical conclusion.

If you present me with a data mining exercise that I believe has the possibility to bring new and actionable data to the table, then I would be happy to spend Google's resources obtaining that data.

I would love for Google to provide a method to allow us to search its entire index for HTML patterns. While that's outside of the scope of this issue report, I think that would be a great feature.

The original decision, at the WHATWG, was that the element was not needed; once it was implemented, the decision was just to retrofit the most useful purpose on the element.

And that's what this issue is trying to adjust. Hence, the assertion supported by the original intent. The more practical decision would have been to just align with the spec from W3C as the element's purpose had already been determined.

You've dismissed the expertise and recommendations of those on the ground as an appeal to authority. This is problematic because they are in the best position to understand its impact, particularly after recommending it be added.

I, at least, have to actively work to avoid having a bias that comes from an emotional reaction to the W3C.

The paragraph I trimmed suggests that's not the case. If it were, you'd acknowledge that it is a standards body and that developers do listen to it. As such, this spec promotes confusion.

Can you elaborate? Would writing tutorials that encourage multiple-main be an argument in favour of multiple-main?

Low-hanging fruit (because I have no books, online courses, etc. handy to check): https://www.google.com/?gws_rd=ssl#q=html5+main+element Tutorials and explainers say to use no more than one. For your second question, sure, it could be a part of an argument, though it would be contending with existing material specifying the contrary.

Suppose we found that all pages on the Web had two <main> elements. Would that tell us that we should make <main> allowed multiple times?

It would tell us that we need to revisit the W3C spec to be more explicit, that we need better training materials, and also to work with ATs to better surface that information.

Or would that tell us that things are really dire and we really need a conformance rule against it?

Nope.

In the past we've had people demonstrate this kind of thing with videos showing their experience browsing the Web that have been quite helpful in coming to a decision, for example.

Would you be willing to bring Google's resources to bear on making a video testing lab available (for this and other questions)?

Even the HTML spec has "in flow" content mixed with "aside" content like examples and notes.

Which would live within a <main>. None of your examples warrant multiple <main>s. Doing so is an author preference (that I would discourage), not one guided by some clear rule.

As such, since we allow multiple <div role=main>s, why not allow multiple <main>s? I don't see a good answer to that, which is why this, to me, is a (weak) argument in favour of multiple-<main>.

I stated it already: "As such, its implementation can go beyond what ARIA defines, enabling authors to create a better experience by following just the HTML spec without needing to call the ARIA landmark roles."

karlgroves commented 9 years ago

@Hixie says

If you present me with a data mining exercise that I believe has the possibility to bring new and actionable data to the table,

Actually @stevefaulkner already did this, very early in this conversation. He did so using data publicly available from WebDevData using a statistically significant data set with a very low margin of error. You can, if you so choose to, replicate his investigation using that same open data to verify or dispute his results.

Unless there's any dispute as to the accuracy or completeness of his claim, it would appear that he has provided you with "new and actionable data". That data says, pretty clearly, that authors understand that the intent for the <main> element is that there should be one and only one <main>.

In short, you have 22/24 participants in this conversation in agreement that there be one and only one <main>, and you have data demonstrating that "... >97% of usage of the <main> element is as per the W3C definition..."

Given the above, when can we expect the WHATWG spec to be brought in line with W3C spec WRT the <main> and <body> elements as suggested by Steve Faulkner?

foolip commented 9 years ago

The httparchive data set is pretty big if anyone wants to do research. I looked in the 20150101 data for case-insensitive matches for <main and then counted the number of <main> elements using this script:

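#!/usr/bin/env python3
"""Print the number of <main> elements in the HTML file given as argv[1]."""

import sys

import html5lib

# html5lib's default etree builder puts HTML elements in the XHTML namespace.
MAIN_TAG = '{http://www.w3.org/1999/xhtml}main'

# Open in binary mode so html5lib can sniff the character encoding itself.
with open(sys.argv[1], 'rb') as f:
    tree = html5lib.parse(f)

mains = [e for e in tree.iter() if e.tag == MAIN_TAG]
print(len(mains))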

Here are the ones with more than one <main>, if anyone wants to analyze why this might happen:

10 http://www.litmusbranding.com/
9 http://www.51offer.com/
8 http://www.tenditrendy.com/
8 http://www.homedit.com/
8 http://www.fingerprintexpert.in/
7 http://www.kgbr.co.kr/flash/menuList.php
5 http://www.watch-next.com/
5 http://www.studiocalico.com/
4 http://www.visitnewportbeach.com/
4 http://www.suckhoenhi.vn/
4 http://www.longtailpro.com/
4 http://www.hlf.org.uk/
4 http://hsmeducacaoexecutiva.com.br/
4 http://family.disney.com/
4 http://beta.captora.com/views/scoreCard/scoreCard.html
3 http://www.tubepornparty.com/
3 http://www.tryxxxtube.com/
3 http://www.tryporntube.com/
3 http://www.topporntubes.com/
3 http://www.thenewslens.com/
3 http://www.themosis.com/
3 http://www.maturevideos.xxx/
3 http://www.linkfinance.fr/
3 http://www.kgbr.co.kr/flash/quick_menu.php
3 http://www.gpwiki.org/
3 http://www.aish.com/
3 http://www.african-porno.com/
2 http://www.ultimate-bravery.com/
2 http://www.tribuna.ru/
2 http://www.timocom.de/
2 http://www.timocom.com/
2 http://www.themosis.com/en/
2 http://www.thejoysofboys.com/
2 http://www.rhein-neckar-loewen.de/
2 http://www.pnxsoft.com/
2 http://www.plumperporn.xxx/
2 http://www.pinoytravelblog.com/
2 http://www.payproglobal.com/
2 http://www.nou-pou.gr/
2 http://www.mtk.ru/
2 http://www.moneyguru.com.br/
2 http://www.matito.ru/
2 http://www.lifespa.com/
2 http://www.krisaquino.net/
2 http://www.isavea2z.com/
2 http://www.ioucentral.com/
2 http://www.india-topsites.com/
2 http://www.hmsa.com/
2 http://www.frugalfanatic.com/
2 http://www.french101.me/
2 http://www.freeanal.xxx/
2 http://www.f-e.tw/
2 http://www.dailysquib.co.uk/
2 http://www.couponsaregreat.net/
2 http://www.cosmopolitan.de/
2 http://www.christiankonline.com/
2 http://www.campusinsiders.com/
2 http://www.byui.edu/
2 http://www.blockstream.com/
2 http://www.bergbahn-kitzbuehel.at/
2 http://www.bbwfuck.xxx/
2 http://www.bankbii.co/whitecard/
2 http://www.affenblog.de/
2 https://www.sensus-capital.com/en/

Warning: Lots of porn. Also, some of these sites have changed since they were crawled.

I haven't gone through all of these sites, but many seem to be using <main> elements in crazy ways. However, http://tribuna.ru and http://www.thenewslens.com/ actually look quite sane, with what seems to be the main content marked up as such.

It would be quite the undertaking to analyze a random selection of <main>-using sites to figure out what conclusions to draw.

stevefaulkner commented 9 years ago

@foolip What was the total number of pages with main?

foolip commented 9 years ago

There were 9019 resources (from 8770 pages) that I passed through the Python script, based on an initial grep and removing things that didn't have Content-Type text/html. There are 10139623 resources (don't know how many pages, but far fewer) in the 20150101 httparchive data I have.

Suffice to say, <main> itself is uncommon and multiple <main> elements much less common still.
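For a sense of scale, the two resource counts reported above can be turned into a share. This is only arithmetic on the figures as stated in the comment; it says nothing about pages (the page total was not given), and the numerator is the set of resources that passed the initial grep and Content-Type filter.

```python
# Counts as reported above for the 20150101 httparchive data.
resources_matching_main = 9019
total_resources = 10139623

share = resources_matching_main / total_resources
print(f"{share:.3%} of resources matched")  # prints "0.089% of resources matched"
```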

stevefaulkner commented 9 years ago

Suffice to say, <main> itself is uncommon and multiple <main> elements much less common still.

@foolip, thanks. I did a full grep of the latest http://webdevdata.org data and found 1732 pages with <main>; of those, 1720 have a single <main> (98.7%), and from what you have reported, the results for the httparchive data are very similar.

I don't think it should surprise anybody that usage of <main> is low, as it has only been around for a few years.

foolip commented 9 years ago

@stevefaulkner, I don't think this number can really inform the decision, other than to tell us that almost nobody will be affected by a change. A more worthwhile exercise, IMHO, would be to analyze the tiny subset where people are using multiple <main> elements, to see if there are reasonable cases among them.

stevefaulkner commented 9 years ago

@foolip I would think that how developers use a feature reveals a cowpath. There must be some reason why 99% of developers who use it use 1 main per page, and that would be an important consideration in defining the feature, but I am obviously missing some understanding of feature design.

metzessible commented 9 years ago

I don't know if it's important, but it might be worth noting that [at least for] the adult websites that are on the list from the httparchive data set are actually mirrors of the same site. So for much of the data that's listed, they're going to be relatively low quality in terms of a source. Not sure if that's helpful to this conversation.

foolip commented 9 years ago

@stevefaulkner, maybe a single <main> is enough in 99% of cases but multiple <main> elements is a great idea in the remaining 1%. Or maybe the remaining 1% is all incorrect and confused. Maybe the same kinds of mistakes are made regardless of the number of <main> elements in use. All of these possibilities are consistent with the data, so I'm not sure what to make of it.

A full analysis of the URLs listed would be very tedious, but I would be interested to know if e.g. http://tribuna.ru and http://www.thenewslens.com/ look reasonable.

stevefaulkner commented 9 years ago

@foolip I had a look at the majority of the pages in the list you provided. It appears that usage falls into a few buckets.

  1. nested main without content between: this appears to serve no purpose, e.g. http://www.thejoysofboys.com/ and http://www.isavea2z.com/
  2. multiple mains acting as macro containers for content, or, as on http://www.themosis.com/, used like article or section
  3. examples such as http://www.litmusbranding.com, where multiple <main>s are used wrongly under anybody's definition (although it would lead to a conformance checker flagging an error only under the w3c definition)

The main sticking point in this discussion is that there are fundamental differences in the semantics of main between whatwg (can be used to mark up both macro and micro content pieces) and w3c (can be used to mark up a macro content piece), and from the comments above (the dominant message being that main serves no useful purpose except as a styling hook and is only in the whatwg spec because it has been implemented) there appears to be no way to reconcile this.

whatwg

The main element can be used as a container for the dominant contents of another element. It represents its children.

w3c

The main element represents the main content of the body of a document or application.

foolip commented 9 years ago

Thanks @stevefaulkner, that's very helpful.

The usage on http://www.litmusbranding.com/ is bad indeed: it's apparently only used for styling, and doesn't correspond to the "dominant content" even locally. It would be bad even with just a single <main> somewhere in that footer.

It seems to me that "nested main without content between" (example?) would indeed be better off merged, that "multiple mains acting as macro containers for content" is reasonable if the main content really is discontiguous, and that cases like litmusbranding.com would be better off with no <main> element at all, or one <main> element elsewhere in the page.

Do you have any hunch about how much content falls into each bucket? If you made lists of URLs that would be nice, but I know it's a lot to ask.
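As a possible starting point for sorting pages into buckets, here is a minimal sketch that separates "nested" from "sibling" <main> usage. It is an illustration only: it uses the standard-library `html.parser` rather than the html5lib used in the script quoted in this thread, so it looks at raw start/end tags and does not apply the HTML parsing algorithm's error recovery (unclosed <main> tags would be reported as nested). The names `MainCounter` and `classify` are made up for this example.

```python
from html.parser import HTMLParser


class MainCounter(HTMLParser):
    """Count <main> start tags and note whether any opens inside another."""

    def __init__(self):
        super().__init__()
        self.total = 0   # number of <main> start tags seen
        self.depth = 0   # how many <main> elements are currently open
        self.nested = 0  # <main> tags opened while another was still open

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            if self.depth > 0:
                self.nested += 1
            self.total += 1
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "main" and self.depth > 0:
            self.depth -= 1


def classify(markup):
    """Return a coarse bucket name for one page's markup."""
    parser = MainCounter()
    parser.feed(markup)
    parser.close()
    if parser.total <= 1:
        return "single-or-none"
    return "nested" if parser.nested else "siblings"


print(classify("<main><div><main>x</main></div></main>"))
print(classify("<main>a</main><footer><main>b</main></footer>"))
```

Whether sibling <main> elements hold genuinely "main" content would still need a human look; the sketch only narrows down which pages fall into the nested bucket.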

domenic commented 9 years ago

I think "just as a styling hook" is not good, and maybe the current definition skews too much that way (although I think the W3C version doesn't understand how "represents" is supposed to work). But I still see no argument for restricting to one per page.

Also, to address something from earlier: I think cowpaths are a useful guide when designing new features. They are not useful, however, in designing restrictions. Saying "most people do it this way, so we should disallow anyone from using the feature in a different way" is not a good way of designing standards. Cowpath arguments should instead be "most people are doing this, so we should design a feature to support them." It's OK if the feature ends up being more general and supporting more use cases at the conclusion of the resulting design process.

Instead, restrictions should be imposed based on arguments demonstrating concrete harm, which still haven't been presented.

stevefaulkner commented 9 years ago

@domenic

although I think the W3C version doesn't understand how "represents" is supposed to work

can you explain?

stevefaulkner commented 9 years ago

Instead, restrictions should be imposed based on arguments demonstrating concrete harm, which still haven't been presented.

It is difficult (which is why I am no longer arguing for change in this forum) when users (actual users who get this stuff as UI) provide their input explaining why main, as currently used, is a) useful to them and b) fits their mental model of what the semantics of main means (i.e. macro vs micro) as part of the landmark structure, but their input is dismissed.

stevefaulkner commented 9 years ago

@foolip, here are my notes on the majority of the URLs you provided:

On 24 September 2015 at 13:48, Philip Jägenstedt notifications@github.com wrote:

The httparchive data set is pretty big if anyone wants to do research. I looked in the 20150101 data for case-insensitive matches for <main and then counted the number of <main> elements using this script:

    #!/usr/bin/env python3

    import html5lib
    import sys

    f = open(sys.argv[1])
    tree = html5lib.parse(f)
    mains = [e for e in tree.iter() if e.tag == '{http://www.w3.org/1999/xhtml}main']
    print(len(mains))
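If the per-page counts from a script like this were collected into a mapping, a list in "count URL" form could be produced roughly as follows. This is only a sketch: `format_report` and the sample data are hypothetical, and the sort order (highest count first) is a guess at a convenient one, not necessarily what was used for the list in this thread.

```python
def format_report(counts):
    """Return 'count url' lines for pages with more than one <main>."""
    multiple = [(n, url) for url, n in counts.items() if n > 1]
    # Highest count first; ties are broken by URL in reverse order.
    multiple.sort(reverse=True)
    return [f"{n} {url}" for n, url in multiple]


# Made-up sample data, not taken from the httparchive results.
sample = {
    "http://example.com/a": 1,
    "http://example.com/b": 3,
    "http://example.com/c": 2,
}
print("\n".join(format_report(sample)))
```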

Here are the ones with more than one <main>, if anyone wants to analyze why this might happen:

10 http://www.litmusbranding.com/
9 http://www.51offer.com/

both appear to be misuse

8 http://www.tenditrendy.com/

no <main>

8 http://www.homedit.com/

uses as both macro and micro container

8 http://www.fingerprintexpert.in/

no <main>

7 http://www.kgbr.co.kr/flash/menuList.php

no visible page content, looking at code uses this:

<main_menu>
<main target="_self" link="./company.php" label="한국복음서원">
<main target="_self" link="./ebook.php" label="E-book 카페">
<main target="_self" link="./truth100.php" label="진리와 생명">
<main target="_self" link="./manna.php" label="만나">
<main target="_self" link="./mediazone.php" label="미디어존">
<main target="_self" link="./shareroom_my_one_message.php" label="내가누린한문장">
<main target="_blank" link="http://mall.kgbr.co.kr/" label="복음서원MALL"> </
main>
</main_menu>

5 http://www.watch-next.com/

uses one <main> only, as a container for the main content of the document.

5 http://www.studiocalico.com/

uses one <main> only, as a container for the main content of the document.

4 http://www.visitnewportbeach.com/

uses it to mark up part of content in a section

4 http://www.suckhoenhi.vn/

uses one <main> only, as a container for the main content of the document.

4 http://www.longtailpro.com/

no <main>

4 http://www.hlf.org.uk/

uses it as child of <li>'s as container for a <figure> element

4 http://hsmeducacaoexecutiva.com.br/

no <main>

4 http://family.disney.com/

with parent section , <main> used as a container for a child section element

4 http://beta.captora.com/views/scoreCard/scoreCard.html

used as header/main/footer

3 http://www.tubepornparty.com/
3 http://www.tryxxxtube.com/
3 http://www.tryporntube.com/
3 http://www.topporntubes.com/
3 http://www.maturevideos.xxx/
3 http://www.african-porno.com/
2 http://www.plumperporn.xxx/
2 http://www.freeanal.xxx/
2 http://www.bbwfuck.xxx/

used to mark up 2 discontinuous areas of main content and footer/nav content (all use same template)

3 http://www.thenewslens.com/
3 http://www.themosis.com/

already mentioned

3 http://www.linkfinance.fr/

used to mark up 3 discontinuous areas of main content

3 http://www.kgbr.co.kr/flash/quick_menu.php

duplicate URL

3 http://www.gpwiki.org/

suggest gratuitous use:

<main role="main" class="row gutters">
        <article class="col span_12">
            <p>Welcome to the Game Programming Wiki - A community driven
resource for everything related to game programming.</p>
        </article>
    </main>

3 http://www.aish.com/

used to mark up 3 visibly continuous areas of main content

2 http://www.ultimate-bravery.com/

used to mark up 2 discontinuous areas of main content

2 http://www.tribuna.ru/

used to mark up 2 visibly continuous areas of main content

2 http://www.timocom.de/
2 http://www.timocom.com/

marks up 1 main content area and 1 link list inside main content area

<main>
                    <ul class="clearfix">

                            <li>
                                <span class="nowrap"><i
class="icon-chevron-right highlight"></i>&nbsp;<a title="Transport Europa"
href="/?lexicon=1109070913400622|Transport
Europa|Transportlexikon">Transport Europa</a></span>
                            </li>

                            <li>
                                <span class="nowrap"><i
class="icon-chevron-right highlight"></i>&nbsp;<a title="TC eBid®"
href="/?lexicon=1008171048233546|TC eBid®|Transportlexikon">TC
eBid&reg;</a></span>
                            </li>

                            <li>
                                <span class="nowrap"><i
class="icon-chevron-right highlight"></i>&nbsp;<a title="Hubwagen / Ameise"
href="/?lexicon=1301281146359213|Hubwagen /
Ameise|Transportlexikon">Hubwagen / Ameise</a></span>
                            </li>

                            <li>
                                <span class="nowrap"><i
class="icon-chevron-right highlight"></i>&nbsp;<a title="Autotransport"
href="/?lexicon=1103141353499382|Autotransport|Transportlexikon">Autotransport</a></span>
                            </li>

                            <li>
                                <span class="nowrap"><i
class="icon-chevron-right highlight"></i>&nbsp;<a title="Laderaumbörse"
href="/?lexicon=807020858411925|Laderaumbörse|Transportlexikon">Laderaumbörse</a></span>
                            </li>

                    </ul>
                </main>

2 http://www.themosis.com/en/
2 http://www.thejoysofboys.com/

already mentioned

2 http://www.rhein-neckar-loewen.de/

nested main:

<main id="content" class="main">
<div class="layout-wrapper">
<div class="layout-grid">
<div class="layout-content">
<main id="content" class="main">

2 http://www.pnxsoft.com/

1 main for main content 1 main for footer nav list

2 http://www.pinoytravelblog.com/

used to mark up 2 visibly continuous areas of main content

2 http://www.payproglobal.com/

used to mark up 2 visibly continuous areas of main content

2 http://www.nou-pou.gr/

used to mark up 2 continuous areas of main content

3 http://www.mtk.ru/

used to mark up 3 discontinuous areas of content; difficult to know if it's 'main content' or not, as the site is in Russian.

2 http://www.moneyguru.com.br/

use 2 mains (1 for mobile, 1 for desktop view) mobile hidden in desktop view

2 http://www.matito.ru/

use 1 for main content another for google search form in header area at top of page.

2 http://www.lifespa.com/

uses 1 main only, for main content.

2 http://www.krisaquino.net/

site no longer available

2 http://www.isavea2z.com/

already mentioned

2 http://www.ioucentral.com/

no <main>

2 http://www.india-topsites.com/

2 areas of visibly continuous main content

2 http://www.hmsa.com/

2 areas of visibly discontinuous main content

2 http://www.frugalfanatic.com/

nested main, no content between

2 http://www.french101.me/
2 http://www.f-e.tw/

1 main only (marks up main content)

2 http://www.dailysquib.co.uk/

used to mark up 2 continuous areas of main content

2 http://www.couponsaregreat.net/

nested mains no content between

2 http://www.cosmopolitan.de/

1 for main content 1 for contact info in footer

2 http://www.christiankonline.com/

used to mark up 2 continuous areas of main content

haven't gone through the following:

2 http://www.campusinsiders.com/
2 http://www.byui.edu/
2 http://www.blockstream.com/
2 http://www.bergbahn-kitzbuehel.at/
2 http://www.bankbii.co/whitecard/
2 http://www.affenblog.de/
2 https://www.sensus-capital.com/en/

(Edited by @foolip to fix broken quoting/markup.)

foolip commented 9 years ago

For completeness, here are the few remaining that you didn't look at:

2 http://www.campusinsiders.com/

4 <main> used somewhere in navigation, can't find it visually, but doesn't seem sensible at all.

2 http://www.byui.edu/

One <main> nested inside another, single <main> would make more sense.

2 http://www.blockstream.com/

Two visually discontinuous bits marked up with <main>, looks kind of intentional, but including the part in between them seems like it would be an improvement.

2 http://www.bergbahn-kitzbuehel.at/

One <main> for desktop and one for mobile, with CSS used to hide one of them. Seems fine as long as AT will ignore the bits with display:none, which I think is the case.

2 http://www.bankbii.co/whitecard/

Both the navigation and main content are marked up with <main>, using something else for the navigation seems better.

2 http://www.affenblog.de/

Now just a single <main>, used appropriately.

2 https://www.sensus-capital.com/en/

A single <main>, but excluding what in my eyes is the "most main" content, namely "Sensus Capital is a fully regulated European online-broker ..."

foolip commented 9 years ago

http://www.kgbr.co.kr/flash/menuList.php actually does have 7 <main> elements when checked in devtools, but it doesn't look sensible.

stevefaulkner commented 9 years ago

7 http://www.kgbr.co.kr/flash/menuList.php

no visible page content, looking at code uses this:

foolip commented 9 years ago

Sorry, I thought I saw main_menu and main_target, but it was actually main elements, as you said.