Closed jonathantneal closed 7 years ago
what about the importance of meta charset? isn't that necessary to prevent cross-site scripting, amongst other things? also i don't see the html element in use in google's example with a lang="" attribute, which is necessary in my opinion. not saying this is wrong, just confused about what exactly the point is...
Google’s example is so minimal it raises the questions you ask. Let me know if these answers help.
On charsets and xss: It’s off topic of <head> and <body>, but again Google’s example is so minimal. My two cents: some people add it with a <meta> tag, some people add it with headers, and some people add both. As long as the charset is there and it’s consistent with the content, awesome.
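As a rough sketch of the <meta> route (illustrative markup, not taken from Google's example), the charset can be declared even when <head> and <body> are omitted, since the parser places the metadata in the generated <head>. The header route would be a response header such as Content-Type: text/html; charset=utf-8. The title and paragraph text here are made up for the example.

```html
<!DOCTYPE html>
<!-- Charset declared in markup; the parser puts this inside the generated <head>. -->
<meta charset="utf-8">
<title>Charset declared in markup</title>
<p>Content starts here, so the generated body opens at this point.
```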
On lang: Google could have provided a second example where they use attributes like <html lang="en"> or <body class="no-js">, but I think they kept their example simple to make a specific point about elements developers and seo types blindly consider “sacred”.
Does that help clarify the point the Google styleguide was making?
My concern with encouraging omitting the <html> tag is that people will be even less likely to set the lang attribute, which should always be set. At our bootcamp we make sure students get in the habit of setting this.
Yes, you can set lang on any element, but I've never seen anyone using this "optimization" pattern do so. My two cents are that the reason browsers add these elements to the DOM is because they need to be there. So put them there. There are many things that are probably worth considering optimization-wise before shaving a negligible amount off HTML weight.
@jpdevries, you are not talking about my recommendation. You are reading into Google’s example. Is there a better way I could be more clear that I am not talking about languages, charsets, or the <html> element?
There's a spec for html5 which specifies functionality: https://www.w3.org/TR/html5/document-metadata.html#the-head-element
A head element's start tag may be omitted if the element is empty, or if the first thing inside the head element is an element. A head element's end tag may be omitted if the head element is not immediately followed by a space character or a comment.
There are more browsers than just Chrome and more search engines than Google, especially when you move away from the English speaking world.
That said, if the takeaway here is that html / head / body can be removed, what's the benefit? Slightly smaller file size?
Final point: stripping un-needed tags is probably something that can be done as minification. The same way css/js is compiled, generated html can be minified to strip unneeded tags.
My approach would be to write the full html/head/body tags and then leave tag stripping to a plugin in WP.
Thanks, @ivankristianto! I hope you saw that I mentioned this behavior applies to all browsers; UC Browser, IE6, etc. The benefit is about avoiding the gotchas of <head>’s weakness. Not many people check the organization of their metadata, so if you find it useful to know that <h1><p>SEO</p></h1> is not a heading, you might also want to know the things I’ve shared about <head>, too. If nothing else, this will now be a search result for future generations.
@jonathantneal I think you are being clear. I was referring to concerns with Google's example. One thing that might be worth examining, I think:
Create an empty HTML document and open your dev tools. Next, log document.head, document.body, or document.documentElement. As you’ll see, these are generated tags.
Web Developers write tags. Browsers make elements based off tags (or automagically as you suggest). So in that scenario, there is no body or head tag… is there? The browser is creating elements even though there is no corresponding tag in the document. So maybe this would make more sense?
As you’ll see, these are generated elements.
pretty sure the rendering engines go back and add them in for graceful degradation, as in correcting mistakes of the past. i think i get this now, and i can't say it is wrong, but to echo @jpdevries' points, too much is already overlooked and ignored by the web community as a whole. accessibility is broken practically everywhere, and this example is a jumping off point to breaking it even more, as well as breaking internationalization, which we haven't even gotten to yet. last week i saw a bootcamp creator tweet out how his students asked him why he didn't use these two elements and his response is they don't matter. not only does that only apply in certain circumstances, it's simply perpetuating this ridiculous cycle of who cares about markup and styles that is so prevalent. i know this is ranty, and i do apologize because this isn't my repo or project.
i will say that i do disagree with that example being kept short to prove to google engineer's points. google is the living embodiment of html/css is an afterthought team. typically, they do have a few good teams, but across the board, google's html/css is pretty bad. take this example, for example ;)
@jpdevries, yessir, and I hope this discussion has been at the very least educational. @jalbertbowden, I think you make some great points. I like your rant! I do hope you’ve seen that <head> and <body> are optional, and I hope you’ve seen some of the issues you can run into when expecting <head> to work a certain way, but you are so right that we don’t want to encourage a culture of unnecessary minimalism, and we wouldn’t want developers taking this advice and throwing out any elements they don’t think matter. Google’s example leans that way, but I’m a sucker for citations because I have no clout of my own! IMHO, I’d rather devs accidentally write <button role="button" type="button"/> than <div onclick/>. I’ll let the 10up devs determine what is best, whether it should be closed, or if you’d like me to write up anything the team is comfortable with. Thanks everyone!
Closing this out as I think we've looped through the full discussion on this topic. Feel free to re-open / comment if there's more to add or a relevant PR to link to.
Thanks @jpdevries, @jalbertbowden and @jonathantneal for the productive discussion.
This is why being explicit is so important, and why lenient interpreters don't actually do anyone any favors.
Between PHP, JavaScript, and browser rendering engines, people are accidentally encouraged to write the loosest and sloppiest version of something that works at least once, instead of writing the best version of something that works well for everyone always.
Technically, you can build a house without electricity, and maybe that is actually what you want, but no one will come visit you.
Do not omit these optional tags. 💜
This is why being explicit is so important
Hmm? Why? You’ve forgotten your fact, and skipped to your conclusion. In fact, your entire post is a conclusion.
EDIT: I’m glad to review your new evidence. Please be careful with writing conclusions without evidence, is all.
I haven't forgotten anything, and your reply comes across as quite rude and uncooperative, but I'll reply assuming that isn't the case.
This is a non-issue, in search of a problem, in search of debate, which will result in no change. If you're looking for reasons to do something, you'll keep inventing them.
These tags are not optional. Rather, they are so mandatory that browsers have built-in support for when people make mistakes and forget them. This is what computers are currently good at, and what people are not good at understanding about computers.
The question of whether developers writing code for the web should omit those tags is like asking for permission to do a bad job, to which the answer should always be "no."
The fact this is even questionable is its own fact in my favor. If browsers didn't build in support for forgetful developers, and instead white-screened thanks to missing mandatory tags, that explicitness would have clearly communicated how important those tags are.
Other opinionated non-fact best practices:
(And if you still have a problem with my lack of whatever, ping me privately. You reached out for public discussion – if you don't like what you get, ask for clarification. If your next reply is anything like your last one, I'm kindly not going to engage any further.)
I understand this discussion is about <head> and <body> and not necessarily <html lang>. Not trying to derail, but I just want to share this because it was helpful for me to understand the importance of the lang attribute. The problem with this:
<!-- Recommended -->
<!DOCTYPE html>
<title>Saving money, saving bytes</title>
<p>Qed.
Is this https://github.com/TryGhost/Casper/issues/286#issuecomment-281950302
For i18n and screen reader support the lang attribute should always be set, and it is my understanding that when browsers add a phantom <html> they don't set lang.
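To make the concern above concrete, here is a minimal sketch (not taken from the thread) of keeping lang while still omitting <head> and <body>. It assumes the page is in English; the title and paragraph reuse the example quoted above, and the logging script is a hypothetical addition, only there for checking the result in dev tools.

```html
<!DOCTYPE html>
<!-- The <html> start tag is kept only to carry lang; <head> and <body> are still omitted. -->
<html lang="en">
<title>Saving money, saving bytes</title>
<p>Qed.
<script>
  // Reads the language declared on the root element.
  // Without the <html lang="en"> line above, this logs an empty string,
  // because the parser-generated <html> element has no lang attribute.
  console.log(document.documentElement.lang);
</script>
```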
it is my understanding that when browsers add a phantom <html> they don't set lang
This is correct. It's also up to browser vendors to decide what magic they perform when a lang attribute is discovered, in addition to what the client website can perform with CSS/JS:
See: https://www.w3.org/International/questions/qa-lang-why
If test.html includes only this:
<div>TODO everything</div>
All current versions of Chrome, Safari, and Firefox will actually render:
<html><head></head><body><div>TODO everything</div></body></html>
Interesting that none of those three browsers force the <title> tag if it's forgotten.
Another aside, the title tag can be visibly styled. W3 states:
It is not normally displayed in the text of a document itself.
But we are free to display it (along with head) with CSS if we really wanted to be weird.
<style type="text/css">head, title { display: block; color: red; text-align: center; }</style>
<title>RED</title>
<div>TODO everything</div>
Note that styling only the title tag will still leave it hidden, as the head tag being hidden seems to override the title even if !important is used.
TL;DR - We can do a lot of junky stuff with these elements. 💯
Love the discussion! Thanks for posting @jonathantneal. I think this is something to keep an eye on, but I don't really see this being useful in our best practices.
There is a distinct lack of guidance regarding <head> and <body> tags on the web.
TL;DR: the <head> tag should probably be avoided. The <body> tag can be optional unless paired with an attribute.
Browsers already create <html>, <head>, <body>
(I’m not recommending you omit <html>)
HTML parsers in all browsers (Blink in Chrome Canary, Trident in Internet Explorer 6, you name it) push content into a generated <head> until content is encountered. Content is any element the parser considers to be inline or block, which includes the <body> itself. From there, it pushes everything into the generated <body> tag. In effect, a developer does not create these tags, they merely replace them. We abuse this for side-effects sometimes.
Try this. Create an empty HTML document and open your dev tools. Next, log document.head, document.body, or document.documentElement. As you’ll see, these are generated tags. Check IE6 if you dare. Same deal.
Browsers toss out <head> for anything content-like
And anything non-content is <head>, too. Change that HTML document to this markup and open dev tools. Look at the parent of <title>:
It gets worse. Change that HTML document to this markup, open dev tools, and check the parent of <h1>:
If you’re like me, you might be like “Wait, but I never even... WHY HTML WHY?!?”
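One way to try this kind of experiment is sketched below. This is an illustration, not the markup from the original post: the heading text, title text, and logging script are all made up for the example. It places a content element inside a written <head> and logs where the parser actually put things.

```html
<!DOCTYPE html>
<head>
  <meta charset="utf-8">
  <!-- A content element inside the written <head>: the parser closes the head here. -->
  <h1>Not metadata</h1>
  <!-- This comes after content, so it no longer lands in the head. -->
  <title>Where did I land?</title>
</head>
<body>
  <script>
    // Log the parent of each element as the parser built the tree.
    // Per the HTML parsing rules, both are expected to log "BODY".
    console.log(document.querySelector('h1').parentNode.nodeName);
    console.log(document.querySelector('title').parentNode.nodeName);
  </script>
</body>
```

Moving the <title> above the <h1> should put it back under HEAD, which is the "keep metadata before content" point.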
Browsers correctly organize metadata unless we say otherwise
Keep metadata before content. The browser will consider metadata part of the generated <head>. Metadata before content gets sorted as <head> and can be skipped over. This makes life easier on the parser, the renderer, and even external crawlers or validators. If you open a real or generated <body> tag too early, your rendering engine needs to start checking the computed style of metadata elements.
SEO folks have noticed this:
From personal experience, this is not so rare. I usually see it when someone wants to hack at seo or misplaces an analytics script in the <head> that generates an element.
Google says “Wipe them out. All of them.”
Google cares about markup, so they actually have some pretty simple advice:
Don’t think they mean <head> or <body>? Dude, look at their example, they are way more hardcore:
Forget everything you thought you knew about <head> and <body>
If you’re like me, this is all awesome to discover, kinda helpful to know, but really hard to accept. Google humorously acknowledges this immediately after that last example:
Thanks for reading! Open to questions and/or let me know if you’d like help drafting up something more formal.