skrapeit / skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
https://docs.skrape.it
MIT License

[FEATURE] Add nullable expressions #236

Closed xVemu closed 10 months ago

xVemu commented 11 months ago

Is your feature request related to a problem? Please describe.
Kotlin has nullable types, so it's weird that the findFirst function throws ElementNotFoundException instead of returning null. The same applies to findAll: why can't it return an empty list when no element is found?

Describe the solution you'd like
Add a function such as findFirstOrNull, and make findAll return an empty list when no elements are found.

Describe alternatives you've considered
Currently, I'm using this as a workaround:

    private fun DocElement.findFirstOrNull(selector: String): DocElement? = try {
        findFirst(selector)
    } catch (e: ElementNotFoundException) {
        null
    }
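
The same try/catch pattern could cover the findAll half of the request. The sketch below is dependency-free so it can run on its own: `ElementNotFoundException` here is a local stand-in for the skrape.it exception, and `lookup`, `emptyIfNotFound`, and `nullIfNotFound` are hypothetical names introduced only for illustration, not part of the library. It assumes, as described above, that a lookup throws when nothing matches.

```kotlin
// Local stand-in for skrape.it's ElementNotFoundException (assumption: keeps the sketch dependency-free).
class ElementNotFoundException(selector: String) :
    Exception("Could not find element \"$selector\"")

// Hypothetical lookup that throws when nothing matches, mimicking the behaviour described above.
fun lookup(elements: List<String>, selector: String): List<String> {
    val matches = elements.filter { it == selector }
    if (matches.isEmpty()) throw ElementNotFoundException(selector)
    return matches
}

// Runs the lookup and returns an empty list instead of throwing.
inline fun <T> emptyIfNotFound(block: () -> List<T>): List<T> = try {
    block()
} catch (e: ElementNotFoundException) {
    emptyList()
}

// Runs the lookup and returns null instead of throwing.
inline fun <T> nullIfNotFound(block: () -> T): T? = try {
    block()
} catch (e: ElementNotFoundException) {
    null
}

fun main() {
    val tags = listOf("div", "p", "div")
    println(emptyIfNotFound { lookup(tags, "div") })
    println(emptyIfNotFound { lookup(tags, "xdddddd") })
    println(nullIfNotFound { lookup(tags, "xdddddd").first() })
}
```

With the real library, the same idea would presumably be written as an extension on DocElement that wraps findAll, mirroring the findFirstOrNull workaround above.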

christian-draeger commented 11 months ago

Please check the relaxed parsing option, described under point 4 here: https://docs.skrape.it/docs/dsl/extracting-data-from-websites

xVemu commented 11 months ago

It still throws an error in this scenario:

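    import it.skrape.core.htmlDocument
    import it.skrape.fetcher.AsyncFetcher
    import it.skrape.fetcher.response
    import it.skrape.fetcher.skrape
    import kotlinx.coroutines.Dispatchers
    import kotlinx.coroutines.runBlocking
    import kotlinx.coroutines.withContext

    fun main() = runBlocking {
        getLesson()
    }

    suspend fun getLesson() = withContext(Dispatchers.IO) {
        skrape(AsyncFetcher) {
            request {
                url = "https://google.com"
            }
            response {
                htmlDocument {
                    // Relaxed mode is enabled on the document...
                    relaxed = true
                    findFirst("body") {
                        children.drop(1).forEach { row ->
                            // ...but this nested findAll still throws when nothing matches.
                            print(row.findAll("xdddddd"))
                        }
                    }
                }
            }
        }
    }

christian-draeger commented 10 months ago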

OK, I see. This is related to the following issue and pull request:

https://github.com/skrapeit/skrape.it/issues/223
https://github.com/skrapeit/skrape.it/pull/227

This will be closed as a duplicate.