w3c / html

Deliverables of the HTML Working Group until October 2018
https://w3c.github.io/html/

allow div to accept more kinds of content or add a new element #355

Closed Nick-Levinson closed 7 years ago

Nick-Levinson commented 8 years ago

It helps in organizing a page and applying CSS to have an element that can be used for dividing the source code in the page. The issue with HTML 5.1 is that what elements are included within such a division should not matter. Comments can be used, but they do not survive minification (diagnosis often depends on examining the source code received over the Internet, not preproduction files).

I use the div element but was recently reminded by a validator (https://validator.w3.org/nu/) not to use it for nonflow content. That probably also excludes use for nonflow descendants that are not children. The issue was with a div element that was only for a script. That div is within another div, defining a column. The divs have CSS. I'd rather not take the script out of a div if doing so will take it out of the website's look produced by CSS.

In HTML 5.1, either the div element should be redefined to allow use for any kind of content or a new all-content-type element should be defined so as to be consistent with the purpose of organizing source code within the body element. As it is, HTML 5.1 in section 3.2.4 permits other specs to require an element be used elsewhere than allowed by HTML 5.1, which means non-HTML use is not barred but probably depends on browser designers having a reason to recognize it. At the least, if some content types would be a problem for a div, either those types should be explicitly enumerated or more types should be explicitly allowed.

I originally posted essentially this at https://www.w3.org/Bugs/Public/show_bug.cgi?id=29537 where it was set to WONTFIX, but without a reason or question and with an invitation to reopen here.

prlbr commented 8 years ago

I’d be interested to see sample HTML code that illustrates the problem you describe, as I’m not sure I understand it correctly.

(Either <div> – where flow content is expected – or <span> – where phrasing content is expected – has always worked for me.)

Nick-Levinson commented 8 years ago

Sure, but why not one element for all kinds of content? It would be easier to use just one and it would pass validation, which I assume would make it easier for some user agents. My invalid code was essentially (braces here representing angle brackets) {div}{script}. . .{/script}{/div} and, given HTML 5.1, perhaps it has to do with the kind of script. I corrected my coding and I don't think I have the validator's analysis anymore. As you indicated, the span element has a comparable drawback. Besides using comments for this purpose, we could also consider gobs of whitespace, but both of those are subject to minification or page-loading slowness (according to Google, which encourages speed and therefore minification).

This isn't a high priority. Sites won't break without it. But it would help authors. And it doesn't even need a new element, since it appears div can be redefined for this without causing a technical problem.

ZoeBijl commented 8 years ago

I'm struggling to understand the issue. Is it that you want to use div to organise stuff in the head? And if you have so much stuff in there that you need to organise it, I wouldn't worry about speed so much.

prlbr commented 8 years ago

I’ve made a test file that uses <div><script>…</script></div> and the validator does not report any errors or warnings to me.
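A minimal sketch of what such a test page might look like (this is not the actual test file; the `id` values and the script body are hypothetical). It nests a script inside a div inside another div, as described in the original report. Since script counts as flow content, this structure should be acceptable to the validator:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>div containing a script (hypothetical test page)</title>
</head>
<body>
  <!-- Outer div: stands in for the column described in the report -->
  <div id="column">
    <!-- Inner div: exists only to wrap the script -->
    <div id="script-wrapper">
      <script>
        console.log("script inside a div");
      </script>
    </div>
  </div>
</body>
</html>
```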

> And it doesn't even need a new element, since it appears div can be redefined for this without causing a technical problem.

<div> defaults to being displayed as a block element, so it introduces what would be perceived as a line break. This cannot be redefined without altering millions of websites. So people who want to use <div> everywhere, i.e. including inline in a text paragraph, would have to change the element’s display property via CSS, but I don’t see how this would be easier than using <span>.
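A small sketch of the three cases just described, assuming a hypothetical class name `inline-div`. The running text sits in an outer div rather than a p, because a div start tag inside a p implicitly closes the paragraph:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>div versus span inside running text (hypothetical)</title>
  <style>
    /* Hypothetical class: overrides the default block display of div */
    .inline-div { display: inline; }
  </style>
</head>
<body>
  <!-- 1. Default div: rendered as a block, so the text is split across lines -->
  <div>Before <div>nested div</div> after.</div>

  <!-- 2. div forced inline via CSS: no visible break -->
  <div>Before <div class="inline-div">nested div</div> after.</div>

  <!-- 3. span: inline by default, no extra CSS needed -->
  <div>Before <span>nested span</span> after.</div>
</body>
</html>
```

Case 2 needs a style rule on every inline use, which is the extra step that case 3 avoids.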

Changing <span> so that it can contain flow content might cause less trouble, but I’m not sure about that.

Are people confused by when to use <div> and when to use <span> though? I can see that having just one element for everywhere would be easier in some sense because you wouldn’t have to pay attention to the difference between flow and phrasing content. But on the other hand you won’t get around understanding that difference when coding HTML by hand anyway because it is relevant to most elements as it is now.

> Besides using comments for this purpose, we could also consider gobs of whitespace, but both of those are subject to minification or page-loading slowness (according to Google, which encourages speed and therefore minification).

When you are looking for something that does not behave like an element at all (e.g. does not need to take part in the Document Object Model, isn’t used for applying CSS) but is just a tool for an author to structure the source code, then comments and whitespace seem fine. I don’t understand why their removal during minification would be a problem though.

I’m sorry if I have missed your point. If so, then maybe you can explain it again concentrating more on the problem that needs to be solved rather than on a solution. Sample code can help.

stevefaulkner commented 8 years ago

@prlbr wrote:

> I’m sorry if I have missed your point. If so, then maybe you can explain it again concentrating more on the problem that needs to be solved rather than on a solution. Sample code can help.

Sage advice

Nick-Levinson commented 8 years ago

When a page has a lot of code, being able to find things in the source code, not just being able to apply style to large runs of code, is made a lot easier with an element that is easy to find (I use the attributes id and class a lot with div and other elements). For example, Google requires some elements that are not intuitive for page organization, so I enclose them within divs I'll more likely recognize. My layout uses five major areas; the divs remind me which elements are in which major area and which subarea. I don't use divs or spans in the head, only in the body, which I divide into divs (I use spans for styling).

I found a script that fails validation. The validator with text input reported "Element gcse:searchbox-only not allowed as child of element div in this context." and "Content model for element div:"/"Flow content." (On another line, the validator also said, "Element name gcse:searchbox-only cannot be represented as XML 1.0.") That element is supplied by Google, so, if I want a Google Custom Search Engine box on my website, I probably can't change it. (I can't reproduce the script here without also reproducing the entire Apache license and that's lengthy, but the script is on most pages at cold32.com (ignore other Google scripts).)

I ran your test file (by address and by text input) through the validator I used back then (in case you used a different one) and it came up clean, so maybe the issue was the kind of script on my site, and I don't know enough to know about kinds of scripts. But my page had failed for me because of a script being inside a div element, according to the validation report, so I don't understand the discrepancy.

Introducing or changing a line break is not my intention. I don't remember seeing a line break due to a div element on my pages, but maybe that's a function of my layouts. I was reading that the spec allows other specs to define other uses for a div, but the issue you raise means a new element, immune to minification and able to take any content, would help.

The problem with minification is that if a page is not rendering as intended then you'd diagnose it, and for that you'd look at the source code as received at the user agent or as uploaded to the Web host. To diagnose using a preprocessing file means you don't see if pre-upload processing caused a rendering error. Thus, if you apply minification but the page is rendering improperly, you want to check the file as uploaded, thus the minified file, which, if it was organized using comments and whitespace which have been stripped out, would now be harder to read and diagnose. For example, see the source code for Google's Web search page (https://www.google.com/#gws_rd=ssl), even though it's not broken, and even ignoring the scripts themselves; that page appears minified and, if it was, is thus harder to eyeball. Not finding an error is the risk.

prlbr commented 8 years ago

@Nick-Levinson, I think your strategy of using IDs or classes as markers in the source code is a legitimate use case which should always work.
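As a rough illustration of that strategy (all `id` and `class` values below are hypothetical, and real minifier output will vary), compare comment markers with attribute markers. The first fragment relies on comments and indentation, which a typical minifier strips. The second shows what might remain after minification when the same regions are marked with attributes:

```html
<!-- Before minification: regions marked with comments and whitespace -->
<!-- ===== sidebar ===== -->
<div>
  <!-- search box -->
  <div>...</div>
</div>

<!-- After minification, if attributes are used as markers instead: -->
<div id="sidebar" class="major-area"><div id="sidebar-search" class="subarea">...</div></div>
```

Searching the delivered source for `id="sidebar"` still finds the region, because attributes are part of the markup and are not removed the way comments and whitespace are.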

> I found a script that fails validation. The validator with text input reported "Element gcse:searchbox-only not allowed as child of element div in this context." and "Content model for element div:"/"Flow content."

I think the problem might rather be that <gcse:searchbox-only> is non-standard than something being wrong with <div>. Maybe someone more knowledgeable than I can give more feedback on this.

> (On another line, the validator also said, "Element name gcse:searchbox-only cannot be represented as XML 1.0.")

Here’s an explanation for that warning; look for the answer by Michael[tm] Smith: http://comments.gmane.org/gmane.org.w3c.validator/13353

> That element is supplied by Google, so, if I want a Google Custom Search Engine box on my website, I probably can't change it.

Google actually has a note on the following site that promises an HTML5-compatible solution. Does that work for you? https://developers.google.com/custom-search/docs/element#html5

> Introducing or changing a line break is not my intention. I don't remember seeing a line break due to a div element on my pages, but maybe that's a function of my layouts.

Yes, you won’t notice the line-breaking property when you use <div> at places where line breaks are expected anyway. But you’ll see them if you put a <div> where you don’t want a break. I’ve made a short test file that illustrates it: https://prlbr.de/2016/05/inlinediv.html

But <span> could be used in that case without causing line breaks.

> Thus, if you apply minification but the page is rendering improperly, you want to check the file as uploaded, thus the minified file, which, if it was organized using comments and whitespace which have been stripped out, would now be harder to read and diagnose.

On a side note, have you checked the developer tools offered by some browsers? Firefox and Safari (and probably other browsers as well) not only offer an option to see the source code, but also the generated DOM tree in a neatly organized way. That can help a lot with diagnosing errors. Here’s information on Firefox’s page inspector: https://developer.mozilla.org/en-US/docs/Tools/Page_Inspector

Nick-Levinson commented 8 years ago

I hadn't thought of using id or class in an element-independent way for page organization; I guess I would add a little more code to ease searchability and that could work. While gcse:searchbox-only is nonstandard, having an HTML element that can cope with whatever is inside the element would help. I don't include XML unless someone supplies necessary code that uses a non-HTML language; I was noting the instance only for possible validation context. I already use a version of Google's advice for a div element; my id value is different and I don't use the other attributes, but testing Google's version made no difference to validation (still invalid as not flow content). Your div test is good; I don't use divs inline and you're right and spans would work there (I'd prefer to use the same thing for both uses). On the utility of the DOM and DOM tools, I take your word but I haven't studied DOM or gotten used to using it and I wonder how many other Web designers also don't use it.

prlbr commented 8 years ago

> While gcse:searchbox-only is nonstandard, having an HTML element that can cope with whatever is inside the element would help.

Browsers display the search box within the <div> as you would expect, I think. That means browsers cope with this. The validator reporting an error is not a problem in my opinion, but the desired behaviour when encountering invalid code, which <gcse:searchbox-only> apparently is.

Maybe the validator’s error message could be improved so that it is clearer that the error is caused by <gcse:searchbox-only> and not by <div> being not universal enough?

> I already use a version of Google's advice for a div element; my id value is different and I don't use the other attributes, but testing Google's version made no difference to validation (still invalid as not flow content).

I’m sure that Google’s HTML5-valid alternative code is indeed valid HTML. Note that Google’s <div class="gcse-searchbox"> code replaces <gcse:searchbox-only>, it doesn’t enclose it.
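To make the difference between replacing and enclosing concrete, here is a hedged sketch. The class value is copied from the sentence above and the `id` values are hypothetical; the exact class name Google expects, and the loader script that activates the search box, should be taken from Google's documentation and are left out here:

```html
<!-- Enclosing: the non-standard element is still present, so validation still fails -->
<div id="search-area-old">
  <gcse:searchbox-only></gcse:searchbox-only>
</div>

<!-- Replacing: the non-standard element is gone and only a div with a class remains -->
<div id="search-area-new">
  <div class="gcse-searchbox"></div>
</div>
```

In the second form every element is an ordinary div, so the content model of the enclosing div is satisfied.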

Nick-Levinson commented 8 years ago

The page renders properly, including the search box; I could delete most of my div elements without affecting page rendering, as only a few are needed for intended rendering. This issue is about clarity in source code organization, useful for finding code and diagnosing problems.

Validation is not perfect in either direction. Some browsers don't render HTML5 fully and properly and validation won't predict those problems. And sometimes nonstandard coding that fails validation is necessary, such as br elements in some li list items because in certain layouts I get peculiar harder-to-read results if I don't have br elements, but the br elements in that context fail validation. However, in general, validation predicts proper rendering.

It would be odd to ask W3C to change their validator to ignore a criterion in W3C's HTML, in this case that a div element is only for certain kinds of content. It's probably not a good idea to change the validator that way and W3C probably won't agree. It might be feasible to reprogram the validator to let users select criteria to apply, much as a word processor's grammar checker may let a user choose which syntactical criteria to apply. That kind of reprogramming would be a lot of work, but if it's done then one could make an exception for div elements for content type. Selectivity for many or most criteria would be useful, but to exclude just one criterion, even with a legend saying so, would not be useful.

Google's HTML5 code need not have passed validation to be recommended or prescribed by Google. It only had to cause the search functions to work.

I'm not clear how a div element can replace its content, when the content is present.

prlbr commented 8 years ago

> This issue is about clarity in source code organization, useful for finding code and diagnosing problems.

My problem is that I don’t see the issue. The two examples for the current situation not being sufficient don’t convince me (<script> in a <div> is valid and <gcse:searchbox-only> being reported as error seems to be desired behaviour). So I don’t understand for which use cases we need to change HTML according to your proposal.

Maybe I’m somewhat blind here. I’d be happy if someone else chimed in to give feedback from another perspective.

> It would be odd to ask W3C to change their validator to ignore a criterion in W3C's HTML, in this case that a div element is only for certain kinds of content.

I agree. My suggestion wasn’t to ignore an error, but to make the error message easier to understand. You are correct that <div>s are only for certain kinds of content, i.e. flow content, but flow content actually encompasses nearly everything that is allowed in a document’s body. So it is almost universal. http://w3c.github.io/html/dom.html#kinds-of-content-flow-content

The working group could consider custom elements to be added to the list of flow content – like the alternative WHATWG standard does – but I think those current special Google tags wouldn’t qualify as valid custom element names either. https://html.spec.whatwg.org/multipage/dom.html#flow-content https://html.spec.whatwg.org/multipage/scripting.html#valid-custom-element-name
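A brief sketch of the naming point, based on my reading of the WHATWG definition linked above (a valid custom element name starts with a lowercase ASCII letter, contains a hyphen, and does not contain a colon). The name `site-searchbox` is hypothetical and is not something Google provides:

```html
<!-- Not a valid custom element name under that definition: contains a colon -->
<gcse:searchbox-only></gcse:searchbox-only>

<!-- Hypothetical name that would fit the pattern: lowercase, hyphenated, no colon -->
<site-searchbox></site-searchbox>
```

Under the WHATWG flow content list, only the second kind of name could count as flow content inside a div.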

The following is rather off-topic, but I hope it helps, @Nick-Levinson:

> And sometimes nonstandard coding that fails validation is necessary, such as br elements in some li list items because in certain layouts I get peculiar harder-to-read results if I don't have br elements, but the br elements in that context fail validation.

The <br> elements would not cause validation errors if you put them inside instead of in between <li>s (that means <li>…</li><br><li>…</li> is invalid, but <li>…<br></li><li>…</li> is syntactically valid).

However, as you say the trouble is with layout, the most appropriate solution might be to use no <br> at all and instead use styling on the <li>s, for example <li style="margin-bottom:1em">.

Another option would be to work with paragraphs, e.g. <li><p>…</p><p>…</p></li><li><p>…</p></li>. Which solution is most appropriate depends on what you’re using the list for.
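The variants mentioned above, written out in full with placeholder list text. Only the first is invalid, because a ul may contain li elements (and script-supporting elements) but not br:

```html
<!-- Invalid: br is a child of ul, sitting between two list items -->
<ul>
  <li>One</li><br>
  <li>Two</li>
</ul>

<!-- Valid: br is inside the first list item -->
<ul>
  <li>One<br></li>
  <li>Two</li>
</ul>

<!-- Valid, no br: spacing comes from styling the list item -->
<ul>
  <li style="margin-bottom:1em">One</li>
  <li>Two</li>
</ul>

<!-- Valid, no br: paragraphs inside the list items -->
<ul>
  <li><p>One</p><p>More about one</p></li>
  <li><p>Two</p></li>
</ul>
```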

> I'm not clear how a div element can replace its content, when the content is present.

What I wanted to express is that for applying Google’s HTML5 compatible code you have to remove <gcse:searchbox-only> from your page’s source code and work with the <div> version of their searchbox only. But of course you can also keep <gcse:searchbox-only> and just ignore the validation error.

Nick-Levinson commented 8 years ago

The li problem occurred when two consecutive li items had very short content for each (I think 3-6 characters each); a browser then rendered both on one line, and I'm told that was proper rendering, but I wanted the list items to be visually separate. I tried adding the 1em margin-bottom into a stylesheet without the br element or with the br element before the li closing tag and the problem persisted, a problem that was solved when the br element came after the li closing tag with or without the margin-bottom style. I also tried enclosing all the text that was inside the li element within a p element within the li element with no br element anywhere in the ul element but the p didn't help. There may be other style factors involved, such as the use of columns to flow the list, so replication may be complicated, but thanks for the suggestions. And, on-topic or nearly so, at any rate, there probably are other cases in which something fails validation but is necessary to achieve an acceptable and desired effect in rendering.

The validator's statement "Element gcse:searchbox-only not allowed as child of element div in this context." may have said more than necessary for a validation failure, since HTML5 does not recognize an element with that name for any context or in relation to any other element. If it did say more than necessary, that could be a clarity problem. I'm not disputing that the element itself should fail HTML validation regardless of how used.

Maybe the stuff that a div element is not supposed to be used for can be allowed? Apparently that would be okay if another spec said so, so I'm not clear why the restriction is needed at all in HTML even if no other such spec can be identified.

LJWatson commented 7 years ago

It seems like this issue can be closed? Ping @stevefaulkner for opinion.

stevefaulkner commented 7 years ago

@LJWatson yes, needs incubation.

chaals commented 7 years ago

See also https://discourse.wicg.io/t/relaxing-author-requirements-to-allow-arbitrary-content-in-span/1765 which I think addresses the same issue - except proposes using span instead of div.