skrapeit / skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
https://docs.skrape.it
MIT License

[QUESTION] cannot access website using BrowserFetcher #185

Closed by Dawid-Witkowski 2 years ago

Dawid-Witkowski commented 2 years ago

Describe what you want to achieve

I'm trying to scrape a certain website, say "https://www.aliexpress.com/", and I'm using this bit of code:

    skrape(BrowserFetcher) {
        request {
            url = "https://www.aliexpress.com/"
        }
        response {
            htmlDocument {
                Log.i("hello", this.wholeText)
            }
        }
    }

It runs in the main activity, inside the "onCreate" function (just for testing purposes for now; I thought this info might be useful in some way).

This results in:

    java.lang.NoSuchFieldError: No static field INSTANCE of type Lorg/apache/http/conn/ssl/AllowAllHostnameVerifier; in class Lorg/apache/http/conn/ssl/AllowAllHostnameVerifier; or its superclasses (declaration of 'org.apache.http.conn.ssl.AllowAllHostnameVerifier' appears in /system/framework/framework.jar!classes4.dex)
        at org.apache.http.conn.ssl.SSLConnectionSocketFactory.(SSLConnectionSocketFactory.java:151)
        at com.gargoylesoftware.htmlunit.httpclient.HtmlUnitSSLConnectionSocketFactory.buildSSLSocketFactory(HtmlUnitSSLConnectionSocketFactory.java:89)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.configureHttpsScheme(HttpWebConnection.java:670)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.createHttpClientBuilder(HttpWebConnection.java:586)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.getHttpClientBuilder(HttpWebConnection.java:542)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:172)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1596)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1518)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:493)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:413)
        at it.skrape.fetcher.BrowserFetcher.fetch(BrowserFetcher.kt:19)
        at it.skrape.fetcher.BrowserFetcher.fetch(BrowserFetcher.kt:10)
        at it.skrape.fetcher.FetcherConverter.fetch(Scraper.kt:30)
        at it.skrape.fetcher.Scraper.scrape(Scraper.kt:17)
        at it.skrape.fetcher.ScraperKt.response(Scraper.kt:87)
        at wingeddev.example.hello.MainActivity$onCreate$1.invokeSuspend(MainActivity.kt:28)
        at wingeddev.example.hello.MainActivity$onCreate$1.invoke(Unknown Source:8)
        at wingeddev.example.hello.MainActivity$onCreate$1.invoke(Unknown Source:4)
        at it.skrape.fetcher.ScraperKt$skrape$1.invokeSuspend(Scraper.kt:43)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
        at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:274)
        at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
        at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source:1)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
        at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source:1)
        at it.skrape.fetcher.ScraperKt.skrape(Scraper.kt:42)
        at wingeddev.example.hello.MainActivity.onCreate(MainActivity.kt:24)
        at android.app.Activity.performCreate(Activity.java:7994)
        at android.app.Activity.performCreate(Activity.java:7978)
        at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1309)
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3422)
        at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3601)
        at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:85)
        at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2066)
        at android.os.Handler.dispatchMessage(Handler.java:106)
        at android.os.Looper.loop(Looper.java:223)
        at android.app.ActivityThread.main(ActivityThread.java:7656)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:592)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:947)

However, if I use HttpFetcher it doesn't crash. Is there any way I can fix this bug? (I need the JS to be rendered.)

Dawid-Witkowski commented 2 years ago

Also, my build.gradle looks like this:

    android {
        ...
        packagingOptions {
            exclude 'META-INF/DEPENDENCIES'
        }
    }

    dependencies {
        ...
        implementation "it.skrape:skrapeit:1.2.0"
        implementation "com.squareup.okhttp3:okhttp:4.9.0"
    }

Dawid-Witkowski commented 2 years ago

Came back to this after 8 days. I just had to change implementation "it.skrape:skrapeit:1.2.0" to implementation "it.skrape:skrapeit:1.2.1" and everything works like a charm :neutral_face:
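
For anyone landing here with the same error, the change described above would look roughly like this in the dependencies block of build.gradle. This is only a sketch: it assumes the rest of the build file from the earlier comment stays unchanged, and the only difference is the skrape.it version.

```groovy
dependencies {
    // ... other dependencies left as they were

    // skrape.it bumped from 1.2.0 to 1.2.1, as reported in the comment above
    implementation "it.skrape:skrapeit:1.2.1"
    implementation "com.squareup.okhttp3:okhttp:4.9.0"
}
```

After editing the version, a Gradle sync and a clean rebuild are usually needed before the new artifact is picked up.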