skrapeit / skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
https://docs.skrape.it
MIT License

[QUESTION] how to implement library in android studio project? #145

Closed D3r3-k closed 3 years ago

D3r3-k commented 3 years ago

I read your documentation and added version 1.1.1, but it didn't work. I then tried 1.0.0, thinking it would be more stable, and it didn't work either. Maybe I am setting it up incorrectly, or I am missing a library that it needs.

In my build.gradle (app) I have this:

plugins {
    id 'com.android.application'
    id 'kotlin-android'
}

android {
    compileSdkVersion 30
    buildToolsVersion "30.0.3"

    defaultConfig {
        applicationId "com.drefkai.skrapf"
        minSdkVersion 23
        targetSdkVersion 30
        versionCode 1
        versionName "1.0"

        testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
    }

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
        }
    }
    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_8
        targetCompatibility JavaVersion.VERSION_1_8
    }

    kotlinOptions {
        jvmTarget = "1.8"
    }

    packagingOptions {
        exclude 'META-INF/DEPENDENCIES'
    }

}

dependencies {

    implementation "org.jetbrains.kotlin:kotlin-stdlib:$kotlin_version"
    implementation 'androidx.core:core-ktx:1.5.0'
    implementation 'androidx.appcompat:appcompat:1.3.0'
    implementation 'com.google.android.material:material:1.3.0'
    implementation 'androidx.constraintlayout:constraintlayout:2.0.4'
    testImplementation 'junit:junit:4.13.2'
    androidTestImplementation 'androidx.test.ext:junit:1.1.2'
    androidTestImplementation 'androidx.test.espresso:espresso-core:3.3.0'

    implementation("it.skrape:skrapeit:1.0.0")
}

It is a new, blank project with nothing else in it. The manifest declares the internet permission, and in my activity I was testing the library against the documentation page itself:

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        val title = skrape(BrowserFetcher) {
            request {
                url = "https://docs.skrape.it/docs/"
            }
            extract {
                htmlDocument {
                    h1 {
                        withClass = "reset-3c756112--pageTitle-33dc39a3"
                    }
                }
            }
        }
        print(title)
    }

Since the documentation does not have an example for Android Studio applications, I wrote it like this, because skrape requires a fetcher argument one way or another: skrape(<HttpFetcher|BrowserFetcher|AsyncFetcher>)

The examples I saw in the documentation only show:

skrape {
    url = "https://github.com/skrapeit"
    extract {
        MyScrapedData(
            userName = element(".h-card .p-nickname").text(),
            repositoryNames = elements("span.repo").map { it.text() }
        )
    }
}

When I run the code in onCreate, I get this:

E/AndroidRuntime: FATAL EXCEPTION: main
    Process: com.drefkai.skrapf, PID: 14170
    java.lang.RuntimeException: Unable to start activity ComponentInfo{com.drefkai.skrapf/com.drefkai.skrapf.MainActivity}: android.os.NetworkOnMainThreadException
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3449)
        at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3601)
        at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:85)
        at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2066)
        at android.os.Handler.dispatchMessage(Handler.java:106)
        at android.os.Looper.loop(Looper.java:223)
        at android.app.ActivityThread.main(ActivityThread.java:7656)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:592)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:947)
     Caused by: android.os.NetworkOnMainThreadException
        at android.os.StrictMode$AndroidBlockGuardPolicy.onNetwork(StrictMode.java:1605)
        at java.net.Inet6AddressImpl.lookupHostByName(Inet6AddressImpl.java:115)
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Inet6AddressImpl.java:103)
        at java.net.InetAddress.getAllByName(InetAddress.java:1152)
        at okhttp3.Dns$Companion$DnsSystem.lookup(Dns.kt:49)
        at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.kt:164)
        at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.kt:129)
        at okhttp3.internal.connection.RouteSelector.next(RouteSelector.kt:71)
        at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:205)
        at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.kt:106)
        at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.kt:74)
        at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:255)
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
        at okhttp3.internal.connection.RealCall.execute(RealCall.kt:154)
        at io.github.rybalkinsd.kohttp.dsl.HttpGetDslKt.httpGet(HttpGetDsl.kt:46)
        at it.skrape.fetcher.HttpFetcher.configuredClient(HttpFetcher.kt:67)
        at it.skrape.fetcher.HttpFetcher.fetch(HttpFetcher.kt:22)
        at it.skrape.fetcher.HttpFetcher.fetch(HttpFetcher.kt:18)
        at it.skrape.fetcher.FetcherConverter.fetch(Scraper.kt:30)
        at it.skrape.fetcher.Scraper.scrape(Scraper.kt:17)
        at it.skrape.fetcher.ScraperKt.extract(Scraper.kt:77)
        at com.drefkai.skrapf.MainActivity$onCreate$title$1.invokeSuspend(MainActivity.kt:21)
        at com.drefkai.skrapf.MainActivity$onCreate$title$1.invoke(Unknown Source:8)
        at com.drefkai.skrapf.MainActivity$onCreate$title$1.invoke(Unknown Source:4)
        at it.skrape.fetcher.ScraperKt$skrape$1.invokeSuspend(Scraper.kt:43)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
        at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:274)
        at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:84)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
        at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source:1)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
        at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source:1)
        at it.skrape.fetcher.ScraperKt.skrape(Scraper.kt:42)
        at com.drefkai.skrapf.MainActivity.onCreate(MainActivity.kt:17)
        at android.app.Activity.performCreate(Activity.java:8000)
        at android.app.Activity.performCreate(Activity.java:7984)
        at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1309)
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3422)
            ... 11 more
I/Process: Sending signal. PID: 14170 SIG: 9

Could you give me some more guidance on how to use the library in my application?

Thanks

christian-draeger commented 3 years ago

Hey, this is a problem in your app code. As the stack trace shows, you are making a network call on the main (UI) thread, in the main activity's onCreate. This is bad practice on Android, because it blocks the main thread and freezes the app for as long as the call takes; if it takes too long, your app gets killed. That is why newer Android versions no longer allow this by default.

You could deactivate the restriction as described in this Stack Overflow answer, but as explained above, that would be very bad practice and can lead to unexpected behavior.

The clean solution is to execute the network call in a background thread. To achieve this, I would suggest using a ViewModel and doing the network call in a coroutine.
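A minimal sketch of moving a blocking call off the calling thread with a coroutine, assuming kotlinx.coroutines is on the classpath (the stack trace above suggests it already arrives as a transitive dependency). The names `runOffMainThread` and `fakeScrape` are hypothetical and not part of skrape.it; `fakeScrape` only stands in for the blocking skrape call from the question so the example runs without a network.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical helper (not part of skrape.it): runs a blocking call on the
// IO dispatcher, so the thread that calls this function is not the one
// doing the network work.
suspend fun <T> runOffMainThread(blockingCall: () -> T): T =
    withContext(Dispatchers.IO) { blockingCall() }

// Stand-in for the blocking skrape call from the question. It only reports
// the name of the thread it ran on, which makes the thread switch visible.
fun fakeScrape(): String = Thread.currentThread().name

fun main(): Unit = runBlocking {
    val callerThread = Thread.currentThread().name
    val workerThread = runOffMainThread { fakeScrape() }
    println("caller: $callerThread, scrape ran on: $workerThread")
}
```

In an Android app the same helper would typically be called from `viewModelScope.launch { ... }` inside a ViewModel (available through the androidx lifecycle-viewmodel-ktx artifact), and the result would be written to UI state only after the helper returns. `runBlocking` is used here purely to make the sketch runnable from `main`; calling it on the Android main thread would block the UI again.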

Another common pitfall is forgetting to declare the internet permission. To declare it, add <uses-permission android:name="android.permission.INTERNET"/> to your AndroidManifest.xml.

I will try to add a little Android app to the project as documentation by example within the next few days.

D3r3-k commented 3 years ago

I understand, I see that the error was mine; I will try writing the code again. Yes, I do have the internet permission, I had noticed that. Sorry to bother you, but I have one more question: for any project, is it recommended to use the same SDK version as the emulator, or does it not matter much?

If possible, could you explain this for both Kotlin and Java? Thanks for the information, I would really appreciate those examples.

christian-draeger commented 3 years ago

Hey @DrefKai, I added a working Android example: https://github.com/skrapeit/skrape.it/blob/master/examples/android/README.md

Hope it helps. If you have further questions regarding the library, just let me know.

D3r3-k commented 3 years ago

Thank you so much! I will try it out for practice. I appreciate the examples!