This is a proposal for an open API that allows routing any HTTP traffic through your own custom client implementation.

Any external client has to implement the (remodeled) HttpFetcher interface. In particular, this interface is now generic over the request class:

interface Fetcher<T> {
    fun fetch(request: T): Result
    val requestBuilder: T
}
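To make the contract concrete, here is a minimal sketch of what an implementation could look like. `EchoRequest` and `EchoFetcher` are made-up names for illustration, and `Result` is only a stand-in for skrape{it}'s real result class, which has more fields. Whether `requestBuilder` should return a fresh instance on every access is an assumption of this sketch, not something the proposal prescribes.

```kotlin
// Stand-in for skrape{it}'s result type (assumption: the real class is richer).
data class Result(val responseBody: String, val statusCode: Int)

interface Fetcher<T> {
    fun fetch(request: T): Result
    val requestBuilder: T
}

// Hypothetical request type, owned entirely by the client author.
data class EchoRequest(var url: String = "http://localhost", var method: String = "GET")

// Hypothetical fetcher that performs no I/O and simply echoes the request back.
class EchoFetcher : Fetcher<EchoRequest> {
    override fun fetch(request: EchoRequest): Result =
        Result(responseBody = "${request.method} ${request.url}", statusCode = 200)

    // Implemented as a getter so every caller starts from a fresh default request.
    override val requestBuilder: EchoRequest
        get() = EchoRequest()
}

fun main() {
    val fetcher = EchoFetcher()
    val request = fetcher.requestBuilder.apply { url = "http://example.com" }
    println(fetcher.fetch(request).responseBody)
}
```

The fetcher never needs to know about any shared request abstraction; it only has to produce a default `T` and accept a `T` back.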

With so many different paradigms for building requests, skrape{it} would quickly paint itself into a corner when trying to "unify" them all under one common "request" interface. Instead, we make a virtue out of diversity and let the user decide.

The typical skrape DSL will change as follows:

data class MyOwnFancyRequest(var amazingUrl: String, var astonishingMethod: HttpWowMethod, var whoNeedsProxiesAnyways: DumbProxy? = null)

class MyCoolCustomFetcherAdapter(val foo: FooConfig, val bar: BarSettings): Fetcher<MyOwnFancyRequest> // implementation here...

val fetcher: Fetcher<MyOwnFancyRequest> = MyCoolCustomFetcherAdapter(foo, bar)

val scraped = skrape(fetcher) {
    request {
        // you're operating in the scope of "MyOwnFancyRequest" here
        amazingUrl = "omg://wow.lol.xd"
    }

    extract {
        // same as before
    }
}

Essentially, this reduces the logic down to "dear fetcher, please give me a default request (val requestBuilder) that I may or may not modify in the DSL" and "dear fetcher, take this (optionally modified) request and execute it".
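The two "dear fetcher" steps can be sketched with lambdas with receivers. This is only an illustration of the idea, not the actual implementation in this PR: `Scraper`, `UrlRequest` and `FakeFetcher` are hypothetical names, `Result` is a simplified stand-in, and the sketch assumes the request type is mutable.

```kotlin
// Simplified stand-in for skrape{it}'s result type.
data class Result(val responseBody: String)

interface Fetcher<T> {
    fun fetch(request: T): Result
    val requestBuilder: T
}

// Hypothetical DSL scope; the real internals may look different.
class Scraper<T>(private val fetcher: Fetcher<T>) {
    // "Dear fetcher, please give me a default request."
    private val preparedRequest: T = fetcher.requestBuilder

    // The caller may or may not modify it; T is the lambda receiver,
    // which is what lets the compiler infer the request type.
    fun request(init: T.() -> Unit) {
        preparedRequest.init()
    }

    // "Dear fetcher, take this (optionally modified) request and execute it."
    fun <R> extract(extractor: Result.() -> R): R =
        fetcher.fetch(preparedRequest).extractor()
}

fun <T, R> skrape(fetcher: Fetcher<T>, init: Scraper<T>.() -> R): R =
    Scraper(fetcher).init()

// Hypothetical request and fetcher, used only to exercise the DSL.
data class UrlRequest(var url: String = "http://localhost")

class FakeFetcher : Fetcher<UrlRequest> {
    override val requestBuilder: UrlRequest get() = UrlRequest()
    override fun fetch(request: UrlRequest): Result = Result("fetched ${request.url}")
}

fun main() {
    val body = skrape(FakeFetcher()) {
        request { url = "http://example.com" }
        extract { responseBody }
    }
    println(body)
}
```

Because `request` takes `T.() -> Unit`, the block is type-checked against whatever request class the fetcher declares, which is why the configuration calls have to live inside `request {}`.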

The existing BrowserFetcher and HttpFetcher implementations have been adapted to the suggested structure.

BREAKING CHANGES

mode obviously doesn't exist anymore.
All request configuration calls have to be wrapped in a request {} block (otherwise it would be impossible to infer the request type of the fetcher at the top level of skrape {}).
Every skrape {} call has to be passed its own client every time.

ToDo
Try to optimise the existing implementations based on this new infrastructure. Maybe distinguish between client-global configuration (sslRelaxed, proxy) and per-request information (url, http verb, etc.). This way, we don't have to spin up a new client for every request.
Create separate Maven artifacts for the custom implementations
Update readme (uaaah....)
More tests??
Coroutines? fun fetch(request: T) could be a suspend function, but then we would force the user to execute the top-level skrape {} DSL from within a coroutine context.
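For the first ToDo item, one possible shape of the client-global versus per-request split is sketched below. All names here (`ClientConfig`, `SimpleRequest`, `FakeClient`, `ReusingFetcher`) are hypothetical, and `FakeClient` merely stands in for an expensive HTTP client; the point is only that the client is built once from the global settings and then reused.

```kotlin
// Simplified stand-in for skrape{it}'s result type.
data class Result(val responseBody: String)

interface Fetcher<T> {
    fun fetch(request: T): Result
    val requestBuilder: T
}

// Hypothetical client-wide settings, fixed when the fetcher is constructed.
data class ClientConfig(val sslRelaxed: Boolean = false, val proxy: String? = null)

// Hypothetical per-request data, modified inside the request {} block.
data class SimpleRequest(var url: String = "http://localhost", var method: String = "GET")

// Stand-in for an expensive-to-create HTTP client.
class FakeClient(private val config: ClientConfig) {
    fun execute(request: SimpleRequest): String =
        "${request.method} ${request.url} (sslRelaxed=${config.sslRelaxed})"
}

class ReusingFetcher(config: ClientConfig) : Fetcher<SimpleRequest> {
    // Counts client constructions so the reuse is observable.
    var clientsCreated = 0
        private set

    // Built lazily on first use and shared by all subsequent fetch() calls.
    private val client: FakeClient by lazy {
        clientsCreated++
        FakeClient(config)
    }

    override val requestBuilder: SimpleRequest get() = SimpleRequest()

    override fun fetch(request: SimpleRequest): Result = Result(client.execute(request))
}

fun main() {
    val fetcher = ReusingFetcher(ClientConfig(sslRelaxed = true))
    fetcher.fetch(fetcher.requestBuilder.apply { url = "http://a.example" })
    fetcher.fetch(fetcher.requestBuilder.apply { url = "http://b.example"; method = "POST" })
    println("clients created: ${fetcher.clientsCreated}")
}
```

With this split, the global settings go into the fetcher's constructor, much like `foo` and `bar` in the adapter example above, while the request type carries only what changes from call to call.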

Any thoughts appreciated! :D

Add support for integration of custom clients #96