clj-commons / hickory

HTML as data

Counterparts of child and descendant for first element in the chain #40

Open vspinu opened 8 years ago

vspinu commented 8 years ago

I have asked for this in #23 but I am afraid has-child doesn't cut it, so I am re-iterating it as a separate issue.

I would like to be able to get the first element in a chain of descendants. So if the elements are td > tr > input, the following would give me the td element:

(hs/parent 
   (hs/tag :td)
   (hs/tag :tr)
   (hs/tag :input))

Similarly hs/ancestor should be the counterpart of hs/descendant.

In this regard the API for same-level siblings is complete, as given by follow <> precede and follow-adjacent <> precede-adjacent.

I know about has-child, but the above simple snippet becomes:

(hs/and
  (hs/tag :td)
  (hs/and
    (hs/tag :tr)
    (hs/has-child 
      (hs/tag :input))))

which even in this simple case looks pretty daunting.

davidsantiago commented 8 years ago

I'm having a little trouble understanding your issue. But, the counterpart of descendant is has-descendant. The counterpart of child is has-child. All of the selectors are named to be relative to the node you're trying to select. I don't see why (has-child (tag :tr) (tag :input)) doesn't do what you want in that last example.

(I believe you're having an issue, of course; I'm just trying to explain why I'm not understanding what you're asking.)

vspinu commented 8 years ago

"I don't see why (has-child (tag :tr) (tag :input)) doesn't do what you want in that last example."

My last example had one has-child missing.

I guess my main issue is that you need to include (hs/and (hs/has-child for every level of hierarchy. Compare:

(hs/and
  (hs/tag :td)
  (hs/has-child
    (hs/and
      (hs/tag :tr)
      (hs/has-child 
        (hs/tag :input)))))

vs

(hs/parent
  (hs/tag :td) 
  (hs/tag :tr)
  (hs/tag :input))

I don't see has-child as a natural counterpart of child, because it accepts one selector while child accepts many. child is like a convenient threading macro, but has-child is not. A currently missing function, has-parent, would be a natural counterpart of has-child.

port19x commented 1 year ago

has-child is indeed unergonomic and no fun to use!

Mertzenich commented 7 months ago

If I have time, I hope to implement functional variants of parent and ancestor (I'll submit a PR if I end up getting around to it). In the meantime, here is a macro I whipped up to produce the has-child/has-descendant chains described above by @vspinu. It takes the function you want to apply, such as has-child, followed by your selectors:

(defmacro has-*
  "Nests `selectors` under `sel` (e.g. hickory.select/has-child), outermost first."
  [sel & selectors]
  (if (empty? selectors)
    ;; No selectors: expand to an empty `and`.
    '(hickory.select/and)
    (let [rev (reverse selectors)]
      ;; Start from the innermost selector and wrap outward, one level per step.
      (loop [selectors (rest rev)
             output (list 'hickory.select/and (first rev))]
        (if (empty? selectors)
          output
          (recur (rest selectors)
                 (list 'hickory.select/and (first selectors)
                       (list sel output))))))))

Example usage:

(macroexpand '(has-* hickory.select/has-child))
;; => (hickory.select/and)
;; Should match anything.
(macroexpand '(has-* hickory.select/has-child (sel/tag :td)))
;; => (hickory.select/and (sel/tag :td))
(macroexpand '(has-* hickory.select/has-child (sel/tag :td) (sel/tag :tr)))
;; => (hickory.select/and
;;     (sel/tag :td)
;;     (hickory.select/has-child (hickory.select/and (sel/tag :tr))))
(macroexpand '(has-* hickory.select/has-child (sel/tag :td) (sel/tag :tr) (sel/tag :input)))
;; => (hickory.select/and
;;     (sel/tag :td)
;;     (hickory.select/has-child
;;      (hickory.select/and
;;       (sel/tag :tr)
;;       (hickory.select/has-child (hickory.select/and (sel/tag :input))))))

Mertzenich commented 7 months ago

I believe I have a functional solution; I hope to test it sometime this weekend or early next week. It works much like the examples above, except that this time the functions are combined recursively. I also need to dig into the code base a bit more, since I worry that my solution may be reinventing the wheel. I'll post more details as soon as I can.
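
For readers following the thread, here is a minimal sketch of what a function-based variant could look like. It is not the solution mentioned above, which was not posted here. The names has-chain, parent, and ancestor are hypothetical and are not part of hickory; only hickory.select/and, has-child, has-descendant, tag, and select, plus hickory.core/parse and as-hickory, are assumed from the library.

```clojure
(require '[hickory.core :as h]
         '[hickory.select :as s])

(defn has-chain
  "Hypothetical helper: builds one selector from `selectors`, listed outermost
  first. Each selector must also satisfy `has-fn` (for example s/has-child)
  applied to the selector built from the levels below it."
  [has-fn & selectors]
  (if (empty? selectors)
    ;; No selectors: fall back to an empty `and`, as the macro above does.
    (s/and)
    (let [[innermost & outer] (reverse selectors)]
      ;; Wrap from the innermost selector outward.
      (reduce (fn [inner selector]
                (s/and selector (has-fn inner)))
              innermost
              outer))))

;; Hypothetical counterparts of child and descendant, as requested in this issue.
(def parent (partial has-chain s/has-child))
(def ancestor (partial has-chain s/has-descendant))

;; Usage: select the div that has a p child which in turn has a span child.
(def tree (h/as-hickory (h/parse "<div><p><span>x</span></p></div>")))
(def hits (s/select (parent (s/tag :div) (s/tag :p) (s/tag :span)) tree))
```

Because has-chain is an ordinary function, the resulting selectors can be passed around and composed at runtime, which the macro version does not allow.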