whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/

Clarification on the DOM being a tree #9189

Open MrHBS opened 1 year ago

MrHBS commented 1 year ago

Section 1.10 of the spec mentions the DOM as a tree:

HTML user agents (e.g., web browsers) then parse this markup, turning it into a DOM (Document Object Model) tree. A DOM tree is an in-memory representation of a document.

DOM trees contain several kinds of nodes, in particular a DocumentType node, Element nodes, Text nodes, Comment nodes, and in some cases ProcessingInstruction nodes.

However, the DOM spec describes the DOM as an API for accessing and manipulating documents, not as a tree.

bathos commented 1 year ago

[HTML] HTML user agents [...] parse this markup, turning it into a DOM [...] tree.

[DOM] In its original sense, "The DOM" is an API for accessing and manipulating documents [...]. Each such document is represented as a node tree.

If parsing produces “a DOM tree,” it doesn’t follow that “DOM means tree,” and the DOM passage hints that “DOM” is used casually to refer to a range of things (“in its original sense” implies there are other senses), so as far as I can tell these passages are not contradictory. Note that both of these descriptions occur in non-normative introductory text which likely seeks to avoid overwhelming the reader with formalism; that’s what the rest of the specs are for.

MrHBS commented 1 year ago

@bathos Isn’t document tree the correct term here?

bathos commented 1 year ago

As far as I can tell, there’s no formalized term x for which “user agents parse this markup, turning it into x” would specifically be correct here. It’s a non-normative introductory overview trying to convey a high-level gist of the processing model in terms of intention rather than formalism. But perhaps “DOM document tree” or “DOM node tree” would be clearer regardless?

(I think the word DOM is important there, though, because that’s the term many devs already use for what the specs would refer to as “node trees”, and this passage is geared towards a more general audience.)

MrHBS commented 1 year ago

Thank you very much Bathos. I think it is clear now.

MrHBS commented 1 year ago

So I suppose you can say “A DOM tree is an in-memory document tree, produced by parsing an HTML document, that represents that document.”

annevk commented 1 year ago

FWIW, I think it's worth changing the phrasing from "DOM trees" to "node trees" and xref the DOM Standard.

MrHBS commented 1 year ago

@annevk 1- Don't you think we should be more accurate here and say “document tree”? After all, this is the type of node tree you get from parsing HTML, if my understanding is correct.

2- I think Bathos made a very good point when he said the term “DOM tree” is popular among devs. I think we should still define this term somewhere (maybe in the DOM spec?)

annevk commented 1 year ago

Some instances could maybe use document tree. It's not more accurate per se, it just asserts that the root is a document, which isn't the case for everything HTML does.
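
As a rough illustration of that distinction (a sketch only, not taken from either spec): the DOM Standard defines a document tree as a node tree whose root is a document. The snippet below uses Python's `xml.dom.minidom`, which is an XML DOM implementation rather than an HTML parser, so it stands in for a browser here purely as an assumption. The helper names `tree_root` and `in_document_tree` are hypothetical.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def tree_root(node):
    """Walk parentNode links up to the root of the node's tree."""
    while node.parentNode is not None:
        node = node.parentNode
    return node


def in_document_tree(node):
    """True when the root of the node's tree is a Document node."""
    return tree_root(node).nodeType == Node.DOCUMENT_NODE


# Parsing a whole document yields a node tree rooted at a Document.
doc = parseString("<html><body><p>hi</p></body></html>")
paragraph = doc.getElementsByTagName("p")[0]

# A fragment is also a node tree, but its root is a DocumentFragment.
fragment = doc.createDocumentFragment()
span = doc.createElement("span")
fragment.appendChild(span)

print(in_document_tree(paragraph))  # True
print(in_document_tree(span))       # False
```

Both `paragraph` and `span` sit in node trees, but only the first is in a document tree, which is why "node tree" is the more general term.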

There are a lot of things that are popular that are not necessarily correct. Some we say something about, e.g., "HTML5", but I'm not convinced DOM tree is up there. DOM as a term is used for a lot of things, so avoiding it and using more precise terms seems better.