10up Engineering Best Practices
https://10up.github.io/Engineering-Best-Practices/
MIT License

Guidance on <head> and <body> #194

Closed jonathantneal closed 7 years ago

jonathantneal commented 7 years ago

There is a distinctive lack of guidance regarding <head> and <body> tags on the web.

TL;DR: the <head> tag should probably be avoided. The <body> tag can be omitted unless you need to set an attribute on it.

Browsers already create <html>, <head>, <body> (I’m not recommending you omit <html>)

HTML parsers in all browsers (Blink in Chrome Canary, Trident in Internet Explorer 6, you name it) push elements into a generated <head> until content is encountered. Content is any element the parser considers to be inline or block, which includes the <body> itself. From there, it pushes everything into the generated <body> tag. In effect, a developer does not create these tags; they merely replace them. We abuse this for side-effects sometimes.

Try this. Create an empty HTML document and open your dev tools. Next, log document.head, document.body, or document.documentElement. As you’ll see, these are generated tags. Check IE6 if you dare. Same deal.

Browsers toss out <head> for anything content-like

And anything non-content is <head>, too. Change that HTML document to this markup and open dev tools. Look at the parent of <title>:

<head></head><title>Test</title><body></body>

It gets worse. Change that HTML document to this markup, open dev tools, and check the parent of <h1>:

<head><h1 hidden>SEO Title</h1><title>Title</title></head>

If you’re like me, you might be like “Wait, but I never even... WHY HTML WHY?!?”

Browsers correctly organize metadata unless we say otherwise

Keep metadata before content. The browser will consider metadata part of the generated <head>. Metadata before content gets sorted as <head> and can be skipped over. This makes life easier on the parser, the renderer, and even external crawlers or validators. If you open a real or generated <body> tag too early, your rendering engine needs to start checking the computed style of metadata elements.

SEO folks have noticed this:

A rare case, but certainly one that can happen, is when coding errors cause the head section to end before it should. — http://searchengineland.com/canonical-tags-easy-right-whats-worst-happen-274635

From personal experience, this is not so rare. I usually see it when someone wants to hack at seo or misplaces an analytics script in the <head> that generates an element.
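The sorting behavior described above can be sketched as a small model. This is a deliberately simplified illustration, not the HTML tree-construction algorithm: it only looks at start tags, ignores text, comments, and nesting, and the function name `sortIntoHeadAndBody` and the metadata list are assumptions made for the example.

```javascript
// Simplified model of how a parser sorts elements into <head> and <body>.
// Assumption: only start tags matter; text, comments, and nesting are ignored.

// Elements treated as metadata while the parser is still filling <head>.
const METADATA = new Set([
  'base', 'link', 'meta', 'noscript', 'script', 'style', 'template', 'title',
]);

// Tags that never become children of <head> or <body> in this model.
const STRUCTURAL = new Set(['html', 'head']);

function sortIntoHeadAndBody(html) {
  const result = { head: [], body: [] };
  let inBody = false;
  // Matches start tags only, e.g. "<title>" or "<h1 hidden>".
  const startTag = /<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  for (const [, rawName] of html.matchAll(startTag)) {
    const name = rawName.toLowerCase();
    if (STRUCTURAL.has(name)) continue;
    if (name === 'body') { inBody = true; continue; }
    // The first content element closes <head> for good, even without </head>.
    if (!inBody && !METADATA.has(name)) inBody = true;
    (inBody ? result.body : result.head).push(name);
  }
  return result;
}

// <title> after an empty <head> still lands in head.
console.log(sortIntoHeadAndBody('<head></head><title>Test</title><body></body>'));
// head: ['title'], body: []

// An <h1> inside <head> drags everything after it into body.
console.log(sortIntoHeadAndBody('<head><h1 hidden>SEO Title</h1><title>Title</title></head>'));
// head: [], body: ['h1', 'title']
```

The second call is the case that bites people: one content-like element early in the markup and every metadata element after it is sorted as body content.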

Google says “Wipe them out. All of them.”

Google cares about markup, so they actually have some pretty simple advice:

For file size optimization and scannability purposes, consider omitting optional tags. — https://google.github.io/styleguide/htmlcssguide.html#Optional_Tags

Don’t think they mean <head> or <body>? Dude, look at their example — they are way more hardcore:

<!-- Not recommended -->
<!DOCTYPE html>
<html>
  <head>
    <title>Spending money, spending bytes</title>
  </head>
  <body>
    <p>Sic.</p>
  </body>
</html>
<!-- Recommended -->
<!DOCTYPE html>
<title>Saving money, saving bytes</title>
<p>Qed.

Forget everything you thought you knew about <head> and <body>

If you’re like me, this is all awesome to discover, kinda helpful to know, but really hard to accept. Google humorously acknowledges this immediately after that last example:

This approach may require a grace period to be established as a wider guideline as it’s significantly different from what web developers are typically taught. — https://google.github.io/styleguide/htmlcssguide.html#Optional_Tags

Thanks for reading! Open to questions and/or let me know if you’d like help drafting up something more formal.

jalbertbowden commented 7 years ago

what about the importance of meta charset? isn't that necessary to prevent cross-site scripting, amongst other things? also i don't see the html element in use in google's example with lang="" attribute which is necessary in my opinion. not saying this is wrong, just confused about what exactly the point is...

jonathantneal commented 7 years ago

Google’s example is so minimal it raises the questions you ask. Let me know if these answers help.

On charsets and xss: It’s off the topic of <head> and <body>, but again, Google’s example is so minimal that it leaves the charset out. My two cents: some people add it with a <meta> tag, some people add it with headers, and some people add both. As long as the charset is there and it’s consistent with the content, awesome.

On lang: Google could have provided a second example where they use attributes like <html lang="en"> or <body class="no-js"> but I think they kept their example simple to make a specific point about elements developers and seo types blindly consider “sacred”.

Does that help clarify the point the Google styleguide was making?

jpdevries commented 7 years ago

My concern with encouraging omitting the <html> tag is that people will be even less likely to set the lang attribute, which should always be set. At our bootcamp we make sure students get in the habit of setting this.

Yes you can set lang on any element, but I've never seen anyone using this "optimization" pattern do so. My two cents are that the reason browsers add these elements to the DOM is because they need to be there. So put them there. There are many things that are probably worth considering optimization wise before shaving a negligible amount off HTML weight.

jonathantneal commented 7 years ago

@jpdevries, you are not talking about my recommendation. You are reading into Google’s example. Is there a better way I could be more clear that I am not talking about languages, charsets, or the <html> element?

ivankruchkoff commented 7 years ago

There's a spec for html5 which specifies functionality: https://www.w3.org/TR/html5/document-metadata.html#the-head-element

A head element's start tag may be omitted if the element is empty, or if the first thing inside the head element is an element. A head element's end tag may be omitted if the head element is not immediately followed by a space character or a comment.

There are more browsers than just Chrome and more search engines than Google, especially when you move away from the English speaking world.

That said, if the takeaway here is that html / head / body can be removed, what's the benefit? Slightly smaller file size?

Final point, stripping un-needed tags is probably something that can be done as minification, the same way css/js is compiled, generated html can be minified to strip unneeded tags.

My approach would be to write the full html/head/body tags and then leave tag stripping to a plugin in WP.
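As a rough sketch of what such a build-time step could look like: the helper names `dropTag` and `stripOptionalHeadAndBody` below are hypothetical and do not come from any real minifier or WordPress plugin. The sketch assumes well-formed input, only touches tags without attributes, and keeps a tag when removing it would likely change the DOM (a comment follows, or <body> starts with an element the parser would otherwise sort into <head>). It also drops the whitespace after each removed tag and does not guard against tag-like text inside scripts or comments.

```javascript
// Hypothetical helper: removes the first match of `tag` plus the whitespace
// after it, unless the markup that follows matches `keepIfNext`.
function dropTag(html, tag, keepIfNext) {
  const match = tag.exec(html);
  if (!match) return html;
  const before = html.slice(0, match.index);
  const after = html.slice(match.index + match[0].length).replace(/^\s+/, '');
  return keepIfNext.test(after) ? html : before + after;
}

// A comment right after the tag means the tag is not safely omittable.
const COMMENT = /^<!--/;
// <body> must also stay when its first child would be pulled into <head>.
const COMMENT_OR_METADATA =
  /^<(?:!--|(?:meta|link|script|style|template|noscript)\b)/i;

function stripOptionalHeadAndBody(html) {
  let out = html;
  // Tags with attributes (e.g. <body class="home">) never match and are kept.
  out = dropTag(out, /<head>/i, COMMENT);
  out = dropTag(out, /<\/head>/i, COMMENT);
  out = dropTag(out, /<body>/i, COMMENT_OR_METADATA);
  out = dropTag(out, /<\/body>/i, COMMENT);
  return out;
}

const page = [
  '<!DOCTYPE html>',
  '<html lang="en">',
  '<head>',
  '<title>Test</title>',
  '</head>',
  '<body>',
  '<p>Hello</p>',
  '</body>',
  '</html>',
].join('\n');

console.log(stripOptionalHeadAndBody(page));
// <!DOCTYPE html>
// <html lang="en">
// <title>Test</title>
// <p>Hello</p>
// </html>
```

Authors keep writing the explicit tags, <html lang="en"> survives untouched, and any byte savings happen on generated output only.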

jonathantneal commented 7 years ago

Thanks, @ivankristianto! I hope you saw that I mentioned this behavior applies to all browsers; UC Browser, IE6, etc. The benefit is about avoiding the gotchas of <head>’s weakness. Not many people check the organization of their metadata, so if you find it useful to know that <h1><p>SEO</p></h1> is not a heading, you might also want to know the things I’ve shared about <head>, too. If nothing else, this will now be a search result for future generations.

jpdevries commented 7 years ago

@jonathantneal I think you are being clear. I was referring to concerns with Google's example. One thing that might be worth examining, I think:

Create an empty HTML document and open your dev tools. Next, log document.head, document.body, or document.documentElement. As you’ll see, these are generated tags.

Web Developers write tags. Browsers make elements based off tags (or automagically as you suggest). So in that scenario, there is no body or head tag… is there? The browser is creating elements even though there is no corresponding tag in the document. So maybe this would make more sense?

As you’ll see, these are generated elements.

jalbertbowden commented 7 years ago

pretty sure the rendering engines go back and add them in for graceful degradation, as in correcting mistakes of the past. i think i get this now, and i can't say it is wrong, but to echo @jpdevries points, too much is already overlooked and ignored by the web community as a whole. accessibility is broken practically everywhere, and this example is a jumping off point to breaking it even more, as well as breaking internationalization, which we haven't even gotten to yet. last week i saw a bootcamp creator tweet out how his students asked him why he didn't use these two elements and his response is they don't matter. not only does that only apply in certain circumstances, it's simply perpetuating this ridiculous cycle of "who cares about markup and styles" that is so prevalent. i know this is ranty, and i do apologize because this isn't my repo or project.

i will say that i do disagree with that example being kept short to prove the google engineers' point. google is the living embodiment of the "html/css is an afterthought" team. typically, they do have a few good teams, but across the board, google's html/css is pretty bad. take this example, for example ;)

jonathantneal commented 7 years ago

@jpdevries, yessir, and I hope this discussion has been at the very least educational. @jalbertbowden, I think you make some great points. I like your rant! I do hope you’ve seen that <head> and <body> are optional, and I hope you’ve seen some of the issues you can run into when expecting <head> to work a certain way, but you are so right that we don’t want to encourage a culture of unnecessary minimalism, and we wouldn’t want developers taking this advice and throwing out any elements they don’t think matter. Google’s example leans that way, but I’m a sucker for citations because I have no clout of my own! IMHO, I’d rather devs accidentally write <button role="button" type="button"/> than <div onclick/>. I’ll let the 10up devs determine what is best, whether it should be closed, or if you’d like me to write up anything the team is comfortable with. Thanks everyone!

ivankruchkoff commented 7 years ago

Closing this out as I think we've looped through the full discussion on this topic. Feel free to re-open / comment if there's more to add or a relevant PR to link to.

Thanks @jpdevries, @jalbertbowden and @jonathantneal for the productive discussion.

JJJ commented 7 years ago

This is why being explicit is so important, and why lenient interpreters don't actually do anyone any favors.

Between PHP, JavaScript, and browser rendering engines, people are accidentally encouraged to write the loosest and sloppiest version of something that works at least once, instead of writing the best version of something that works well for everyone always.

Technically, you can build a house without electricity, and maybe that is actually what you want, but no one will come visit you.

Do not omit these optional tags. 💜

jonathantneal commented 7 years ago

This is why being explicit is so important

Hmm? Why? You’ve forgotten your fact, and skipped to your conclusion. In fact, your entire post is a conclusion.

EDIT: I’m glad to review your new evidence. Please be careful with writing conclusions without evidence, is all.

JJJ commented 7 years ago

I haven't forgotten anything, and your reply comes across as quite rude and uncooperative, but I'll reply assuming that isn't the case.

This is a non-issue, in search of a problem, in search of debate, which will result in no change. If you're looking for reasons to do something, you'll keep inventing them.

These tags are not optional. Rather, they are so mandatory that browsers have built-in support for when people make mistakes and forget them. This is what computers are currently good at, and what people are not good at understanding about computers.

The question of whether developers writing code for the web should omit those tags is like asking for permission to do a bad job, to which the answer should always be "no."

The fact this is even questionable is its own fact in my favor. If browsers didn't build in support for forgetful developers, and instead white-screened thanks to missing mandatory tags, that explicitness would have clearly communicated how important those tags are.

Other opinionated non-fact best practices:

(And if you still have a problem with my lack of whatever, ping me privately. You reached out for public discussion – if you don't like what you get, ask for clarification. If your next reply is anything like your last one, I'm kindly not going to engage any further.)

jpdevries commented 7 years ago

I understand this discussion is about <head> and <body> and not necessarily <html lang>. Not trying to derail but I just want to share this because it was helpful for me to understand the importance of the lang attribute. The problem with this:

<!-- Recommended -->
<!DOCTYPE html>
<title>Saving money, saving bytes</title>
<p>Qed.

Is this https://github.com/TryGhost/Casper/issues/286#issuecomment-281950302

For i18n and screen reader support the lang attribute should always be set, and it is my understanding that when browsers add a phantom <html> they don't set lang.

JJJ commented 7 years ago

it is my understanding that when browsers add a phantom <html> they don't set lang

This is correct. It's also up to browser vendors to decide what magic they perform when a lang attribute is discovered, in addition to what the client website can perform with CSS/JS:

See: https://www.w3.org/International/questions/qa-lang-why

If test.html includes only this:

<div>TODO everything</div>

All current versions of Chrome, Safari, and Firefox will actually render:

<html><head></head><body><div>TODO everything</div></body></html>

Interesting that none of those three browsers force the <title> tag if it's forgotten.
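One way to check this from the dev tools console, as a minimal sketch: `describeDocument` is a name made up for this example, while `DOMParser` is the standard browser API. The guard at the bottom only exists so the snippet does nothing outside a browser.

```javascript
// Summarizes what the parser generated for a document.
function describeDocument(doc) {
  const root = doc.documentElement;
  return {
    markup: root.outerHTML,
    lang: root.getAttribute('lang'), // null when the attribute is absent
    hasTitle: doc.querySelector('title') !== null,
  };
}

// DOMParser only exists in browsers, so this part is skipped elsewhere.
if (typeof DOMParser !== 'undefined') {
  const doc = new DOMParser().parseFromString(
    '<div>TODO everything</div>',
    'text/html'
  );
  console.log(describeDocument(doc));
  // markup: '<html><head></head><body><div>TODO everything</div></body></html>'
  // lang: null
  // hasTitle: false
}
```

The generated <html> carries no lang attribute and no <title> is created, so both have to come from the author.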


Another aside, the title tag can be visibly styled. W3 states:

It is not normally displayed in the text of a document itself.

But we are free to display it (along with head) with CSS if we really wanted to be weird.

<style type="text/css">head, title { display: block; color: red; text-align: center; }</style>
<title>RED</title>
<div>TODO everything</div>

Note that styling only the title tag will still leave it hidden, as the head tag being hidden seems to override the title even if !important is used.


TL;DR - We can do a lot of junky stuff with these elements. 💯

tlovett1 commented 7 years ago

Love the discussion! Thanks for posting @jonathantneal. I think this is something to keep an eye on but don't really see this being useful in our best practices.