jhy / jsoup

jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.
https://jsoup.org
MIT License

`:not(:has(` and `:has(:not(` are different #2166

Closed: kvtb closed 4 months ago

kvtb commented 4 months ago

`:not(:has(` and `:has(:not(` work differently.

Actually, only `:not(:has(` does what is expected, while `:has(:not(` works exactly like `:has(`, ignoring the negation introduced by `:not(`.

jhy commented 4 months ago

Can you give an example of each case and show what you expect vs what you get?

I'm trying both cases here: `:not(:has)` and `:has(:not)`, and they appear OK there.

Source is here if you want to review & debug: https://github.com/jhy/jsoup/blob/970403cfb442d8c835c83e8f85b9384eb2f34390/src/main/java/org/jsoup/select/StructuralEvaluator.java#L129 and https://github.com/jhy/jsoup/blob/970403cfb442d8c835c83e8f85b9384eb2f34390/src/main/java/org/jsoup/select/StructuralEvaluator.java#L53

kvtb commented 4 months ago

// Dependency: org.jsoup:jsoup:1.17.2
import org.jsoup.Jsoup

object Test {
  def main(args: Array[String]): Unit = {
    val doc1 = Jsoup.parse("""<div id="jwplayer-1"> <div> <div class="jw-preview"/> </div> </div>""")

    println("1:" + doc1.select("""[id="jwplayer-1"]:not(:has([class*="jw-preview"]))"""))
    println("2:" + doc1.select("""[id="jwplayer-1"]:has(:not([class*="jw-preview"]))"""))
  }
}
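
For reference, under standard CSS semantics the two forms are not expected to be equivalent: `:not(:has(X))` means "no descendant matches X", while `:has(:not(X))` means "at least one descendant does not match X". Below is a minimal plain-Scala sketch of that difference on the same tree shape as the markup above. It is a model only, not jsoup's implementation; `Node`, `HasNotModel`, and the helpers `classContains`, `not`, and `has` are hypothetical names made up for illustration.

```scala
// A plain-Scala model of the two selectors; it does not use jsoup.
final case class Node(id: String = "", cssClass: String = "", children: List[Node] = Nil) {
  // All descendants, excluding the node itself.
  def descendants: List[Node] = children.flatMap(c => c :: c.descendants)
}

object HasNotModel {
  type Pred = Node => Boolean

  // Models [class*="..."]: the class attribute contains the substring.
  def classContains(s: String): Pred = n => n.cssClass.contains(s)

  // Models :not(p): the node itself does not match p.
  def not(p: Pred): Pred = n => !p(n)

  // Models :has(p): at least one descendant matches p.
  def has(p: Pred): Pred = n => n.descendants.exists(p)

  // Mirrors the reported markup: div#jwplayer-1 > div (no class) > div.jw-preview
  val outer: Node = Node(id = "jwplayer-1", children = List(
    Node(children = List(Node(cssClass = "jw-preview")))
  ))

  val preview: Pred = classContains("jw-preview")

  def main(args: Array[String]): Unit = {
    println("not(has(preview)): " + not(has(preview))(outer))
    println("has(not(preview)): " + has(not(preview))(outer))
    println("has(preview):      " + has(preview)(outer))
  }
}
```

On this tree the model gives `false`, `true`, `true`. The middle div has no class, so it is a descendant that does not match the class test and satisfies `:has(:not(...))`, just as the `jw-preview` div satisfies `:has(...)`. Whether this fully accounts for the jsoup 1.17.2 output would need to be confirmed by running the original snippet.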