AdamNiederer / elquery

Read and manipulate HTML in emacs
GNU General Public License v3.0

Add support for elquery-innertext #6

andreas-marschke closed 4 years ago

andreas-marschke commented 6 years ago

Adds a recursive function that retrieves all text from a given node and its children.

This is akin to HTMLElement.innerText in the HTML Living Standard.
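
To illustrate the kind of recursion described here, below is a minimal, self-contained sketch. It does not use elquery itself: the function name `my/innertext-sketch` is hypothetical, and the node shape (a plist with `:text` and `:children` keys) is an assumption made for the example, not a statement about elquery's internals.

```elisp
;; Minimal sketch: nodes are hand-built plists, not real elquery parse output.
(require 'seq)     ; seq-remove
(require 'subr-x)  ; string-join and string-empty-p on older Emacs versions

(defun my/innertext-sketch (node)
  "Return the text of NODE and all its descendants, joined by spaces.
NODE is assumed to be a plist with :text and :children keys."
  (let ((own (plist-get node :text))
        (below (mapcar #'my/innertext-sketch (plist-get node :children))))
    ;; Drop nil and empty pieces so textless nodes add no stray spaces.
    (string-join (seq-remove #'string-empty-p
                             (delq nil (cons own below)))
                 " ")))

;; Hypothetical tree for <p>Hello<b>world</b></p>.
(my/innertext-sketch
 '(:el "p" :text nil
   :children ((:el nil :text "Hello" :children nil)
              (:el "b" :text nil
               :children ((:el nil :text "world" :children nil))))))
;; => "Hello world"
```

Joining with a single space is a choice made for the sketch; a real implementation may prefer to preserve the document's own whitespace.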

AdamNiederer commented 6 years ago

Thanks for the patch; this function looks useful! This should be good to merge after the conflicts are resolved and the parentheses are properly formatted.

alphapapa commented 6 years ago

This looks cool. May I suggest this alternate implementation (untested but I think it's correct)?

(require 'cl-lib)  ; cl-loop
(require 'subr-x)  ; string-join

(defun elquery-innertext (node)
  "Return the text content of NODE and its children.
This function recurses down the tree rooted at NODE and
concatenates all :text property values, separated by spaces."
  (cl-loop for child in (elquery-children node)
           ;; Recurse into the child itself, not into its list of children.
           if (elquery-children child)
           collect (elquery-innertext child) into result
           ;; A leaf may have no :text; fall back to the empty string.
           else collect (or (plist-get child :text) "") into result
           finally return (string-join result " ")))

AdamNiederer commented 4 years ago

Superseded by ef2961ccbc11054bbcd2e551c0d1c777362b5c69; elquery-full-text should do the same job as this function. If you notice any discrepancies between this function and elquery-full-text, feel free to open an issue :)