Shinmera / lquery

A Common Lisp library to allow jQuery-like HTML/DOM manipulation.
https://shinmera.github.io/lquery
zlib License

lQuery doesn't return text of elements in complex HTML documents #20

Closed fehmud closed 1 month ago

fehmud commented 1 month ago

lQuery doesn't return the text of elements in complex HTML documents when given the CSS selectors that browsers generate. jQuery in the JavaScript console doesn't work either, whilst CLSS does. Am I doing anything wrong?

For example, below is the code that I'm running on the HTML of lQuery's homepage - saved as "lQuery.html" - to get the documentation of FUNCTION ( TEXT NODE &OPTIONAL (TEXT NIL T-S-P) ). All calls to lQuery return #(), whilst calls to CLSS succeed.

Thanks for your attention.


(ql:quickload :lquery)

(defvar *html* (lquery:$ (initialize "lQuery.html")))

;; Using CSS Selector returned by Firefox
(lquery:$ *html*
          "#FUNCTION\\ LQUERY-FUNCS\\:TEXT > div:nth-child(2) > pre:nth-child(1)"
          (text))
;; => #()

;; Using selector returned by Chrome
(lquery:$ *html*
          "#FUNCTION\\ LQUERY-FUNCS\\:TEXT > div > pre"
          (text))
;; => #()

(setq *html* (plump:parse (uiop:read-file-string "lQuery.html")))

(plump:serialize (clss:select "#FUNCTION\\ LQUERY-FUNCS\\:TEXT > div:nth-child(2) > pre:nth-child(1)"
                              *html*))
;; prints: <pre>Get the combined text contents of each element [...]
;; => NIL

(plump:serialize (clss:select "#FUNCTION\\ LQUERY-FUNCS\\:TEXT > div > pre"
                              *html*))
;; prints: <pre>Get the combined text contents of each element [...]
;; => NIL

Shinmera commented 1 month ago

I can't reproduce. Works fine for me.

(lquery:$ *html* "#FUNCTION\\ LQUERY-FUNCS\\:TEXT > div:nth-child(2) > pre:nth-child(1)" (text))
; => #("Get the combined text contents of each element, including their descendants. If text is set, all text nodes are removed and a new text node is appended to the end of the node. If text is NIL, all direct text nodes are removed from the node. If text is not a string, it is transformed into one by PRINC-TO-STRING.")

Shinmera commented 1 month ago

Your mistake is passing a string to initialize. In that case it's assumed to be a string of HTML, not a path. Use a pathname, and it'll do what you expect.
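
For anyone landing here later, the difference can be sketched as follows. This is a minimal illustration, assuming the page was saved as "lQuery.html" in the directory that `*default-pathname-defaults*` points to; the variable name `*wrong*` is only for illustration and is not part of lQuery.

```lisp
(ql:quickload :lquery)

;; A string argument is treated as HTML markup, not as a file name.
;; The resulting document contains only the literal text "lQuery.html",
;; so no selector can match anything in it.
(defvar *wrong* (lquery:$ (initialize "lQuery.html")))

;; A pathname argument is read from disk and then parsed.
;; Assumption: the relative pathname resolves against
;; *default-pathname-defaults*; use an absolute pathname if unsure.
(defvar *html* (lquery:$ (initialize #P"lQuery.html")))

;; The selector from the original report, now run against the real page.
(lquery:$ *html*
          "#FUNCTION\\ LQUERY-FUNCS\\:TEXT > div > pre"
          (text))
```

If the file name is only available as a string at run time, converting it first with `(pathname name)` or `(uiop:parse-native-namestring name)` should have the same effect as the `#P` literal.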