skrapeit / skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
https://docs.skrape.it
MIT License
790 stars, 57 forks

[BUG] Children of a DocElement from a relaxed Doc should also be relaxed. #223

Closed (danisty closed this issue 1 year ago)

danisty commented 1 year ago

Describe the bug

Getting the children of a Doc/DocElement calls the public constructor of DocElement, which sets relaxed to false. As a result, every child throws an exception when a query selection fails, even if the Doc it came from is relaxed.

The same happens for siblings, allElements and parents.

Code Sample

import it.skrape.core.htmlDocument

val doc = htmlDocument("<div>skrape<b>it</b></div>")
doc.relaxed = true

// Lookup on an element found via the relaxed Doc: does not throw
doc.findFirst("b").findFirst("a")
// Same lookup on an element obtained through children: throws
// it.skrape.selects.ElementNotFoundException: Could not find element "a"
doc.children[0].findFirst("a")

https://github.com/skrapeit/skrape.it/blob/95c326f7eee0899c707f6f9bf2367a68fd80e502/html-parser/src/main/kotlin/it/skrape/selects/DomTreeElement.kt#L48-L50
https://github.com/skrapeit/skrape.it/blob/95c326f7eee0899c707f6f9bf2367a68fd80e502/html-parser/src/main/kotlin/it/skrape/selects/DocElement.kt#L8-L12

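Expected behavior

Elements returned by children, siblings, allElements and parents should inherit the relaxed property of the Doc/DocElement they were obtained from.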

danisty commented 1 year ago

I'd like to reopen this issue, since the merged PR doesn't fix it. The relaxed property should be passed to the DocElement constructor in each of the accessors mentioned above: children, siblings, allElements and parents.
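
To illustrate the idea of passing the flag through, here is a minimal, self-contained sketch. It does not use skrape.it's real classes: Element, rawChildren and withRelaxed are hypothetical stand-ins, the selector is reduced to a plain tag-name match, and returning an empty element in relaxed mode is an assumption made for the example. Only the pattern matters: the accessor re-wraps each child with the parent's relaxed value instead of relying on the constructor default.

```kotlin
// Hypothetical stand-in for the library's exception type.
class ElementNotFoundException(selector: String) :
    RuntimeException("Could not find element \"$selector\"")

// Hypothetical, simplified element; NOT skrape.it's DocElement.
class Element(
    val tag: String,
    private val rawChildren: List<Element> = emptyList(),
    val relaxed: Boolean = false,
) {
    // Copy of this element that carries the given relaxed flag.
    private fun withRelaxed(flag: Boolean) = Element(tag, rawChildren, flag)

    // Children inherit the parent's flag rather than the default (false).
    val children: List<Element>
        get() = rawChildren.map { it.withRelaxed(relaxed) }

    // All descendants in depth-first order, each inheriting the flag.
    private fun descendants(): List<Element> =
        children.flatMap { listOf(it) + it.descendants() }

    // Simplified lookup: matches on tag name only.
    // Relaxed: returns an empty element on a miss. Strict: throws.
    fun findFirst(selector: String): Element =
        descendants().firstOrNull { it.tag == selector }
            ?: if (relaxed) Element(tag = "", relaxed = true)
            else throw ElementNotFoundException(selector)
}

fun main() {
    val doc = Element("div", listOf(Element("b")), relaxed = true)
    val child = doc.children[0]
    println(child.relaxed)                       // flag was inherited
    println(child.findFirst("a").tag.isEmpty())  // miss yields an empty element
}
```

The same re-wrapping would apply to siblings, allElements and parents; in this sketch only children is modelled.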

christian-draeger commented 1 year ago

OK, thanks for pointing this out 👍