Both hickory.core/as-hiccup and hickory.core/as-hickory use the DOM wholeText property to extract the text value of a Text node. However, wholeText does not return just the text of that node: it returns the concatenation of all Text nodes logically adjacent to it.
This leads to unexpected results, particularly when a parsed document is modified before it is converted to hiccup or hickory. Here is the MDN example for wholeText, translated to ClojureScript:
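;; Parse the markup into a DOM document (browser ClojureScript, hickory.core loaded).
(def doc (hickory.core/parse "<p>Thru-hiking is great! <strong>No insipid election coverage</strong> However, <a href=\"http://en.wikipedia.org/wiki/Absentee_ballot\">casting a ballot</a> is tricky.</p>"))
;; Take the first <p> element of the parsed document.
(def para (.item (.getElementsByTagName doc "p") 0))
;; Remove the <strong> element (child index 1), leaving two adjacent Text nodes.
(.removeChild para (.item (.-childNodes para) 1))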
After the strong element has been removed from the paragraph, (as-hiccup para) returns:
[:p {} "Thru-hiking is great! However, " "Thru-hiking is great! However, "
[:a {:href "http://en.wikipedia.org/wiki/Absentee_ballot"} "casting a ballot"] " is tricky."]
Notice the duplicated text. Removing the strong element leaves the two Text nodes it used to separate directly adjacent to each other, so wholeText returns the same concatenated string for each of them, and that string is emitted twice.
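The duplication can be observed directly on the child nodes. The following is a minimal sketch, assuming a browser ClojureScript REPL in which para is the modified paragraph from above; child-nodes and text-node? are hypothetical helpers introduced only for this illustration.

```clojure
;; Hypothetical helper: turn the NodeList of children into a seq.
(defn child-nodes [node]
  (let [nodes (.-childNodes node)]
    (for [i (range (.-length nodes))]
      (.item nodes i))))

;; Hypothetical helper: Text nodes have nodeType 3 (Node.TEXT_NODE).
(defn text-node? [node]
  (= 3 (.-nodeType node)))

;; For every Text child of para, print the node's own data (nodeValue)
;; next to what wholeText reports for it.
(doseq [t (filter text-node? (child-nodes para))]
  (println {:node-value (.-nodeValue t)
            :whole-text (.-wholeText t)}))
```

For the first two Text nodes, :node-value should differ while :whole-text is the same string, which matches the duplicated entry in the hiccup output.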
A fix is to call goog.dom/getRawTextContent in place of the wholeText property accessors in hickory.core. For a Text node it returns only that node's own value, without merging in adjacent Text nodes.
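As a rough sketch of the proposed change, assuming the Closure Library is available as goog.dom: the namespace example.text and the function node-text are hypothetical names used only for illustration, and hickory's actual conversion code is not reproduced here.

```clojure
(ns example.text
  (:require [goog.dom :as gdom]))

;; Hypothetical helper, not part of hickory: returns the text of a single
;; Text node without concatenating its adjacent Text siblings.
(defn node-text [text-node]
  (gdom/getRawTextContent text-node))

;; Usage, assuming para is the modified paragraph from the example above:
;; child 0 is the Text node that preceded the removed <strong> element.
(node-text (.item (.-childNodes para) 0))
```

With para modified as above, this should return only the text of the first Text node rather than the merged string that wholeText reports.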