skrapeit / skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
https://docs.skrape.it
MIT License

[BUG] BrowserFetcher not working on Android #180

Closed p4ulor closed 2 years ago

p4ulor commented 2 years ago

Hey, I'm new to this. Can you help me get the HTML of a whole page and, if possible, also help me parse it into objects?

I basically want to get all the nutritional information in these tables. pic

And I also need to make sure 100g is selected pic2

Here is my code. But it's not working: I get the error "No static field INSTANCE...". I'm using your code @here.

christian-draeger commented 2 years ago

Sure. Could you provide the url you want to scrape from? Then I will try to build a little demo :)

EDIT: found the URL in the screenshot. Will try tomorrow, or on Monday at the latest.

christian-draeger commented 2 years ago

OK, I just checked. Since the values you want to extract rely on JavaScript, you will need to use BrowserFetcher. Since the dropdown needs a click to switch the numbers to per 100g, that part is not possible with skrape{it} for now, because it is only an HTML parser. But I have plans to make this possible in the near future.
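
Until the dropdown can be clicked, one possible workaround is to scrape the default serving and rescale the numbers afterwards. This is only a minimal sketch: it assumes you know the weight in grams of the serving the page shows by default, and `perHundredGrams` and `servingGrams` are hypothetical names, not part of skrape{it}. The numbers in `main` are made up for illustration.

```kotlin
// Hypothetical helper, not part of skrape{it}: rescales a value measured for
// one serving to the equivalent value per 100 g.
fun perHundredGrams(amount: Double, servingGrams: Double): Double {
    require(servingGrams > 0) { "servingGrams must be positive" }
    return amount * 100.0 / servingGrams
}

fun main() {
    // Made-up numbers: 30.0 g of fat in a 150 g serving.
    println(perHundredGrams(30.0, 150.0)) // prints 20.0
}
```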

Apart from that, I could imagine a solution like this, which gives you a list of pairs with the category name as the first value and a list of entries as the second:

data class Entry(
    val name: String,
    val amount: String,
    val unit: String,
    val percentDv: String,
)

fun getNuts() = skrape(BrowserFetcher) {
    request {
        url = "https://nutritiondata.self.com/facts/nut-and-seed-products/3086/2"
        timeout = 20_000
    }
    response {
        htmlDocument {
            "#NutritionInformationSlide .m-t13" {
                findAll {
                    map {
                        it.div {
                            withAttribute = "align" to "center"
                            findFirst {
                                text
                            }
                        } to it.div {
                            withClass = "clearer"
                            findAll {
                                // map, not associate: the lambda returns an Entry, not a Pair
                                map { entry ->
                                    Entry(
                                        name = entry.div {
                                            withClass = "nf1"
                                            0 { text }
                                        },
                                        amount = entry.div {
                                            withClass = "nf2"
                                            0 { text }
                                        },
                                        unit = entry.div {
                                            withClass = "nf3"
                                            0 { text }
                                        },
                                        percentDv = entry.div {
                                            withClass = "nf4"
                                            0 { text }
                                        }
                                    )
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

fun main() {
    getNuts().forEach(::println)
}

which prints:


(Calorie Information, [Entry(name=Calories, amount=842, unit=(3525 kJ), percentDv=42%), Entry(name=From Carbohydrate, amount=118, unit=(494 kJ), percentDv=), Entry(name=From Fat, amount=614, unit=(2571 kJ), percentDv=), Entry(name=From Protein, amount=110, unit=(461 kJ), percentDv=), Entry(name=From Alcohol, amount=0.0, unit=(0.0 kJ), percentDv=)])
(Carbohydrates, [Entry(name=Total Carbohydrate, amount=28.9, unit=g, percentDv=10%), Entry(name=Dietary Fiber, amount=15.1, unit=g, percentDv=60%), Entry(name=Starch, amount=1.5, unit=g, percentDv=), Entry(name=Sugars, amount=7.2, unit=g, percentDv=), Entry(name=Sucrose, amount=6931, unit=mg, percentDv=), Entry(name=Glucose, amount=58.0, unit=mg, percentDv=), Entry(name=Fructose, amount=0.0, unit=mg, percentDv=), Entry(name=Lactose, amount=0.0, unit=mg, percentDv=), Entry(name=Maltose, amount=203, unit=mg, percentDv=), Entry(name=Galactose, amount=~, unit=, percentDv=)])
(Fats & Fatty Acids, [Entry(name=Total Fat, amount=73.4, unit=g, percentDv=113%), Entry(name=Saturated Fat, amount=5.6, unit=g, percentDv=28%), Entry(name=4:00, amount=0.0, unit=mg, percentDv=), Entry(name=6:00, amount=0.0, unit=mg, percentDv=), Entry(name=8:00, amount=0.0, unit=mg, percentDv=), Entry(name=10:00, amount=0.0, unit=mg, percentDv=), Entry(name=12:00, amount=0.0, unit=mg, percentDv=), Entry(name=13:00, amount=0.0, unit=mg, percentDv=), Entry(name=14:00, amount=0.0, unit=mg, percentDv=), Entry(name=15:00, amount=0.0, unit=mg, percentDv=), Entry(name=16:00, amount=4720, unit=mg, percentDv=), Entry(name=17:00, amount=0.0, unit=mg, percentDv=), Entry(name=18:00, amount=921, unit=mg, percentDv=), Entry(name=19:00, amount=~, unit=, percentDv=), Entry(name=20:00, amount=0.0, unit=mg, percentDv=), Entry(name=22:00, amount=0.0, unit=mg, percentDv=), Entry(name=24:00:00, amount=0.0, unit=mg, percentDv=), Entry(name=Monounsaturated Fat, amount=46.8, unit=g, percentDv=), Entry(name=14:01, amount=0.0, unit=mg, percentDv=), Entry(name=15:01, amount=~, unit=, percentDv=), Entry(name=16:1 undifferentiated, amount=349, unit=mg, percentDv=), Entry(name=16:1 c, amount=~, unit=, percentDv=), Entry(name=16:1 t, amount=~, unit=, percentDv=), Entry(name=17:01, amount=~, unit=, percentDv=), Entry(name=18:1 undifferentiated, amount=46467, unit=mg, percentDv=), Entry(name=18:1 c, amount=~, unit=, percentDv=), Entry(name=18:1 t, amount=~, unit=, percentDv=), Entry(name=20:01, amount=0.0, unit=mg, percentDv=), Entry(name=22:1 undifferentiated, amount=0.0, unit=mg, percentDv=), Entry(name=22:1 c, amount=~, unit=, percentDv=), Entry(name=22:1 t, amount=~, unit=, percentDv=), Entry(name=24:1 c, amount=0.0, unit=mg, percentDv=), Entry(name=Polyunsaturated Fat, amount=17.5, unit=g, percentDv=), Entry(name=16:2 undifferentiated, amount=~, unit=, percentDv=), Entry(name=18:2 undifferentiated, amount=17477, unit=mg, percentDv=), Entry(name=18:2 n-6 c,c, amount=~, unit=, percentDv=), 
Entry(name=18:2 c,t, amount=~, unit=, percentDv=), Entry(name=18:2 t,c, amount=~, unit=, percentDv=), Entry(name=18:2 t,t, amount=~, unit=, percentDv=), Entry(name=18:2 i, amount=~, unit=, percentDv=), Entry(name=18:2 t not further defined, amount=~, unit=, percentDv=), Entry(name=18:03, amount=0.0, unit=mg, percentDv=), Entry(name=18:3 n-3, c,c,c, amount=~, unit=, percentDv=), Entry(name=18:3 n-6, c,c,c, amount=~, unit=, percentDv=), Entry(name=18:4 undifferentiated, amount=0.0, unit=mg, percentDv=), Entry(name=20:2 n-6 c,c, amount=0.0, unit=mg, percentDv=), Entry(name=20:3 undifferentiated, amount=0.0, unit=mg, percentDv=), Entry(name=20:3 n-3, amount=~, unit=, percentDv=), Entry(name=20:3 n-6, amount=~, unit=, percentDv=), Entry(name=20:4 undifferentiated, amount=0.0, unit=mg, percentDv=), Entry(name=20:4 n-3, amount=~, unit=, percentDv=), Entry(name=20:4 n-6, amount=~, unit=, percentDv=), Entry(name=20:5 n-3, amount=0.0, unit=mg, percentDv=), Entry(name=22:02, amount=~, unit=, percentDv=), Entry(name=22:5 n-3, amount=0.0, unit=mg, percentDv=), Entry(name=22:6 n-3, amount=0.0, unit=mg, percentDv=), Entry(name=Total trans fatty acids, amount=~, unit=, percentDv=), Entry(name=Total trans-monoenoic fatty acids, amount=~, unit=, percentDv=), Entry(name=Total trans-polyenoic fatty acids, amount=~, unit=, percentDv=), Entry(name=Total Omega-3 fatty acids, amount=~, unit=, percentDv=), Entry(name=Total Omega-6 fatty acids, amount=17477, unit=mg, percentDv=)])
(Protein & Amino Acids, [Entry(name=Protein, amount=31.8, unit=g, percentDv=64%), Entry(name=Tryptophan, amount=287, unit=mg, percentDv=), Entry(name=Threonine, amount=1015, unit=mg, percentDv=), Entry(name=Isoleucine, amount=1035, unit=mg, percentDv=), Entry(name=Leucine, amount=2200, unit=mg, percentDv=), Entry(name=Lysine, amount=899, unit=mg, percentDv=), Entry(name=Methionine, amount=281, unit=mg, percentDv=), Entry(name=Cystine, amount=422, unit=mg, percentDv=), Entry(name=Phenylalanine, amount=1718, unit=mg, percentDv=), Entry(name=Tyrosine, amount=793, unit=mg, percentDv=), Entry(name=Valine, amount=1196, unit=mg, percentDv=), Entry(name=Arginine, amount=3692, unit=mg, percentDv=), Entry(name=Histidine, amount=886, unit=mg, percentDv=), Entry(name=Alanine, amount=1498, unit=mg, percentDv=), Entry(name=Aspartic acid, amount=4090, unit=mg, percentDv=), Entry(name=Glutamic acid, amount=7739, unit=mg, percentDv=), Entry(name=Glycine, amount=2197, unit=mg, percentDv=), Entry(name=Proline, amount=1450, unit=mg, percentDv=), Entry(name=Serine, amount=1504, unit=mg, percentDv=), Entry(name=Hydroxyproline, amount=~, unit=, percentDv=)])
(Vitamins, [Entry(name=Vitamin A, amount=10.2, unit=IU, percentDv=0%), Entry(name=Retinol, amount=0.0, unit=mcg, percentDv=), Entry(name=Retinol Activity Equivalent, amount=0.0, unit=mcg, percentDv=), Entry(name=Alpha Carotene, amount=0.0, unit=mcg, percentDv=), Entry(name=Beta Carotene, amount=5.8, unit=mcg, percentDv=), Entry(name=Beta Cryptoxanthin, amount=0.0, unit=mcg, percentDv=), Entry(name=Lycopene, amount=0.0, unit=mcg, percentDv=), Entry(name=Lutein+Zeaxanthin, amount=1.5, unit=mcg, percentDv=), Entry(name=Vitamin C, amount=0.0, unit=mg, percentDv=0%), Entry(name=Vitamin D, amount=~, unit=, percentDv=~), Entry(name=Vitamin E (Alpha Tocopherol), amount=35.8, unit=mg, percentDv=179%), Entry(name=Beta Tocopherol, amount=0.6, unit=mg, percentDv=), Entry(name=Gamma Tocopherol, amount=1.2, unit=mg, percentDv=), Entry(name=Delta Tocopherol, amount=0.4, unit=mg, percentDv=), Entry(name=Vitamin K, amount=0.0, unit=mcg, percentDv=0%), Entry(name=Thiamin, amount=0.3, unit=mg, percentDv=19%), Entry(name=Riboflavin, amount=0.8, unit=mg, percentDv=48%), Entry(name=Niacin, amount=5.3, unit=mg, percentDv=27%), Entry(name=Vitamin B6, amount=0.2, unit=mg, percentDv=9%), Entry(name=Folate, amount=43.5, unit=mcg, percentDv=11%), Entry(name=Food Folate, amount=43.5, unit=mcg, percentDv=), Entry(name=Folic Acid, amount=0.0, unit=mcg, percentDv=), Entry(name=Dietary Folate Equivalents, amount=43.5, unit=mcg, percentDv=), Entry(name=Vitamin B12, amount=0.0, unit=mcg, percentDv=0%), Entry(name=Pantothenic Acid, amount=0.5, unit=mg, percentDv=5%), Entry(name=Choline, amount=75.5, unit=mg, percentDv=), Entry(name=Betaine, amount=~, unit=, percentDv=)])
(Minerals, [Entry(name=Calcium, amount=313, unit=mg, percentDv=31%), Entry(name=Iron, amount=5.4, unit=mg, percentDv=30%), Entry(name=Magnesium, amount=399, unit=mg, percentDv=100%), Entry(name=Phosphorus, amount=696, unit=mg, percentDv=70%), Entry(name=Potassium, amount=996, unit=mg, percentDv=28%), Entry(name=Sodium, amount=40.6, unit=mg, percentDv=2%), Entry(name=Zinc, amount=4.5, unit=mg, percentDv=30%), Entry(name=Copper, amount=1.7, unit=mg, percentDv=85%), Entry(name=Manganese, amount=3.2, unit=mg, percentDv=162%), Entry(name=Selenium, amount=4.1, unit=mcg, percentDv=6%), Entry(name=Fluoride, amount=~, unit=, percentDv=)])
(Sterols, [Entry(name=Cholesterol, amount=0.0, unit=mg, percentDv=0%), Entry(name=Phytosterols, amount=168, unit=mg, percentDv=), Entry(name=Campesterol, amount=8.7, unit=mg, percentDv=), Entry(name=Stigmasterol, amount=1.5, unit=mg, percentDv=), Entry(name=Beta-sitosterol, amount=158, unit=mg, percentDv=)])
(Other, [Entry(name=Alcohol, amount=0.0, unit=g, percentDv=), Entry(name=Water, amount=6.5, unit=g, percentDv=), Entry(name=Ash, amount=4.4, unit=g, percentDv=), Entry(name=Caffeine, amount=0.0, unit=mg, percentDv=), Entry(name=Theobromine, amount=0.0, unit=mg, percentDv=)])

christian-draeger commented 2 years ago

A slightly refactored version that breaks things down into functions:

data class Entry(
    val name: String,
    val amount: String,
    val unit: String,
    val percentDv: String,
)

private val DocElement.categoryName: String
    get() = div {
        withAttribute = "align" to "center"
        findFirst {
            text
        }
    }

private fun DocElement.textOf(className: String) = div {
    withClass = className
    0 { text }
}

private val DocElement.entries: List<Entry>
    get() = div {
        withClass = "clearer"
        findAll {
            map { entry ->
                Entry(
                    name = entry.textOf("nf1"),
                    amount = entry.textOf("nf2"),
                    unit = entry.textOf("nf3"),
                    percentDv = entry.textOf("nf4"),
                )
            }
        }
    }

private fun getNuts(): Map<String, List<Entry>> = skrape(BrowserFetcher) {
    request {
        url = "https://nutritiondata.self.com/facts/nut-and-seed-products/3086/2"
        timeout = 20_000
    }
    response {
        htmlDocument {
            "#NutritionInformationSlide .m-t13" {
                findAll {
                    associate {
                        it.categoryName to it.entries
                    }
                }
            }
        }
    }
}

fun main() {
    getNuts().forEach(::println)
}
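
The scraped values are plain strings, and the output above shows "~" wherever the site has no value. If numbers are needed, a post-processing step along these lines might help. It is only a sketch: `amountOrNull` and `numericOnly` are hypothetical helper names, and `Entry` is repeated here only so the snippet is self-contained.

```kotlin
// Same shape as the Entry class above, repeated to keep the snippet self-contained.
data class Entry(val name: String, val amount: String, val unit: String, val percentDv: String)

// Hypothetical helper: the site prints "~" for a missing value, so
// toDoubleOrNull() maps those entries (and any other non-numeric text) to null.
fun Entry.amountOrNull(): Double? = amount.trim().toDoubleOrNull()

// Keeps, per category, only the entries that have a numeric amount.
fun Map<String, List<Entry>>.numericOnly(): Map<String, List<Pair<String, Double>>> =
    mapValues { (_, entries) ->
        entries.mapNotNull { e -> e.amountOrNull()?.let { e.name to it } }
    }

fun main() {
    // Two entries taken from the output above.
    val sample = mapOf(
        "Carbohydrates" to listOf(
            Entry("Starch", "1.5", "g", ""),
            Entry("Galactose", "~", "", ""),
        )
    )
    println(sample.numericOnly()) // prints {Carbohydrates=[(Starch, 1.5)]}
}
```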

p4ulor commented 2 years ago

Wow! Thank you so much! I will analyze your code and try it out. I'm still pretty new to programming and to website scraping, and this personal project is very important to me. You helped me a lot! Have a nice week 💯 👍

p4ulor commented 2 years ago

I get a similar error to #163. What can I do?

E/AndroidRuntime: FATAL EXCEPTION: main
    Process: paulor.nutritiontrackerkotlin, PID: 17545
    java.lang.NoSuchFieldError: No static field INSTANCE of type Lorg/apache/http/conn/ssl/AllowAllHostnameVerifier; in class Lorg/apache/http/conn/ssl/AllowAllHostnameVerifier; or its superclasses (declaration of 'org.apache.http.conn.ssl.AllowAllHostnameVerifier' appears in /system/framework/framework.jar!classes2.dex)
        at org.apache.http.conn.ssl.SSLConnectionSocketFactory.<clinit>(SSLConnectionSocketFactory.java:151)
        at com.gargoylesoftware.htmlunit.httpclient.HtmlUnitSSLConnectionSocketFactory.buildSSLSocketFactory(HtmlUnitSSLConnectionSocketFactory.java:89)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.configureHttpsScheme(HttpWebConnection.java:670)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.createHttpClientBuilder(HttpWebConnection.java:586)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.getHttpClientBuilder(HttpWebConnection.java:542)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:172)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1596)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1518)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:493)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:413)
        at it.skrape.fetcher.BrowserFetcher.fetch(BrowserFetcher.kt:19)
        at it.skrape.fetcher.BrowserFetcher.fetch(BrowserFetcher.kt:10)
        at it.skrape.fetcher.FetcherConverter.fetch(Scraper.kt:30)
        at it.skrape.fetcher.Scraper.scrape(Scraper.kt:17)
        at it.skrape.fetcher.ScraperKt.response(Scraper.kt:87)
        at paulor.nutritiontrackerkotlin.NutritionTrackerFoodPullerKt$getNuts$1.invokeSuspend(NutritionTrackerFoodPuller.kt:51)
        at paulor.nutritiontrackerkotlin.NutritionTrackerFoodPullerKt$getNuts$1.invoke(Unknown Source:8)
        at paulor.nutritiontrackerkotlin.NutritionTrackerFoodPullerKt$getNuts$1.invoke(Unknown Source:4)
        at it.skrape.fetcher.ScraperKt$skrape$1.invokeSuspend(Scraper.kt:43)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
        at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:274)
        at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
        at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source:1)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
        at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source:1)
        at it.skrape.fetcher.ScraperKt.skrape(Scraper.kt:42)
        at paulor.nutritiontrackerkotlin.NutritionTrackerFoodPullerKt.getNuts(NutritionTrackerFoodPuller.kt:46)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel.getFood(MainActivity.kt:95)
        at paulor.nutritiontrackerkotlin.fragments.HomeFragment.onCreateView$lambda-1(HomeFragment.kt:26)
        at paulor.nutritiontrackerkotlin.fragments.HomeFragment.$r8$lambda$HtnxmDEYPhyQx_WnpruCX166m2A(Unknown Source:0)
        at paulor.nutritiontrackerkotlin.fragments.HomeFragment$$ExternalSyntheticLambda0.onClick(Unknown Source:2)
        at android.view.View.performClick(View.java:6294)
        at com.google.android.material.button.MaterialButton.performClick(MaterialButton.java:1131)
        at android.view.View$PerformClick.run(View.java:24770)
        at android.os.Handler.handleCallback(Handler.java:790)
        at android.os.Handler.dispatchMessage(Handler.java:99)
        at android.os.Looper.loop(Looper.java:164)
        at android.app.ActivityThread.main(ActivityThread.java:6494)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:438)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:807)

christian-draeger commented 2 years ago

This is a problem with the HtmlUnit library running on Android (skrape{it} uses it internally to implement the BrowserFetcher).
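
The stack trace says the declaration of `AllowAllHostnameVerifier` comes from `/system/framework/framework.jar`, so the class Android loads is apparently its own bundled copy, which seems to lack the `INSTANCE` field the HTTP client code expects. A small reflection check can confirm which variant the runtime actually sees. This is a diagnostic sketch only; `hasPublicStaticField` is a hypothetical helper, not part of skrape{it} or HtmlUnit.

```kotlin
import java.lang.reflect.Modifier

// Hypothetical diagnostic helper: returns true if the class that the runtime
// actually loads under `className` exposes a public static field `fieldName`.
fun hasPublicStaticField(className: String, fieldName: String): Boolean =
    try {
        val field = Class.forName(className).getField(fieldName)
        Modifier.isStatic(field.modifiers)
    } catch (e: ClassNotFoundException) {
        false // the class is not on the classpath at all
    } catch (e: NoSuchFieldException) {
        false // the class was found, but in a version without that field
    }

fun main() {
    // Class and field names are taken from the stack trace above.
    println(hasPublicStaticField("org.apache.http.conn.ssl.AllowAllHostnameVerifier", "INSTANCE"))
}
```

If this prints false on the device, the crash is explained by the class that gets loaded there rather than by the scraping code itself.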

p4ulor commented 2 years ago

@christian-draeger this person was able to solve it. You said that skrape{it} uses HtmlUnit, so maybe using the new snapshot he mentioned would fix it: https://github.com/HtmlUnit/htmlunit/issues/444#issuecomment-1045034649

christian-draeger commented 2 years ago

I will try tomorrow. If all tests succeed, we can ask @rbri to make it an actual release, since for now it's just a snapshot release, which means the implementation of this version could change at any time. Therefore we should wait for an official HtmlUnit release that includes the fix. I already asked here what the status of the snapshot is and whether we can expect an official release soon :)

But as I said, I will try to verify whether the snapshot version fixes our problems here in general :)

Looking forward to being able to use the BrowserFetcher on Android soon 🎉

rbri commented 2 years ago

Yes, please give me a sign. I will update the readme and make a release if it works for you.

christian-draeger commented 2 years ago

https://github.com/HtmlUnit/htmlunit/issues/444#issuecomment-1045975262

p4ulor commented 2 years ago

OK, so I added implementation("it.skrape:skrapeit:0-SNAPSHOT"). The { isChanging = true } part wasn't working; the build said it didn't recognize it. It built successfully. On compiling, I got the error "2 files found with path 'mozilla/public-suffix-list.txt'.", so I added:

packagingOptions {
    exclude 'mozilla/public-suffix-list.txt'
}

But then I got this error: NutritionTracker\app\build\intermediates\merged_java_res\debug\base.jar: The process cannot access the file because it is being used by another process.

Then I invalidated the caches and restarted, and it worked. Then I got the network-on-main-thread exception, so I added the suspend keyword and such, and then I got this error:

Error loading JavaScript from [https://a.mobify.com/nutritiondata/a.js].
    java.io.IOException: Unable to download JavaScript from 'https://a.mobify.com/nutritiondata/a.js' (status 404).
        at com.gargoylesoftware.htmlunit.html.HtmlPage.loadJavaScriptFromUrl(HtmlPage.java:1098)
        at com.gargoylesoftware.htmlunit.html.HtmlPage.loadExternalJavaScriptFile(HtmlPage.java:1017)
        at com.gargoylesoftware.htmlunit.html.ScriptElementSupport.executeScriptIfNeeded(ScriptElementSupport.java:196)
        at com.gargoylesoftware.htmlunit.html.ScriptElementSupport$1.execute(ScriptElementSupport.java:120)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doProcessPostponedActions(JavaScriptEngine.java:1004)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:951)
        at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:582)
        at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:481)
        at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.callSecured(HtmlUnitContextFactory.java:349)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:834)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:810)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:801)
        at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScript(HtmlPage.java:957)
        at com.gargoylesoftware.htmlunit.html.ScriptElementSupport.executeInlineScriptIfNeeded(ScriptElementSupport.java:379)
        at com.gargoylesoftware.htmlunit.html.ScriptElementSupport.executeScriptIfNeeded(ScriptElementSupport.java:230)
        at com.gargoylesoftware.htmlunit.html.ScriptElementSupport$1.execute(ScriptElementSupport.java:120)
        at com.gargoylesoftware.htmlunit.html.ScriptElementSupport.onAllChildrenAddedToPage(ScriptElementSupport.java:143)
        at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:191)
        at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder.endElement(HtmlUnitNekoDOMBuilder.java:559)
        at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source:35)
        at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder.endElement(HtmlUnitNekoDOMBuilder.java:511)
        at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1247)
        at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1172)
        at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement(DefaultFilter.java:219)
        at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement(NamespaceBinder.java:312)
        at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3189)
        at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2114)
        at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument(HTMLScanner.java:937)
        at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:443)
        at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:394)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source:5)
        at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoDOMBuilder.parse(HtmlUnitNekoDOMBuilder.java:758)
        at com.gargoylesoftware.htmlunit.html.parser.neko.HtmlUnitNekoHtmlParser.parse(HtmlUnitNekoHtmlParser.java:204)
        at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:298)
        at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:218)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:686)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:588)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:506)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:413)
        at it.skrape.fetcher.BrowserFetcher.fetch(BrowserFetcher.kt:19)

But I think that was because the website was slow: it disappeared when I increased the timeout to 40s, or it was a coincidence, I don't know. But even before the 40s, I was getting the error below. #NutritionInformationSlide .m-t13 is the selector for the key tag of the 8 nutritional tables. I pushed to my repository so you can check it out.

I've been running the code in several ways and testing different things. The error below now appears to be gone, and the current error is the one above.

E/AndroidRuntime: FATAL EXCEPTION: DefaultDispatcher-worker-1
    Process: paulor.nutritiontrackerkotlin, PID: 10318
    it.skrape.selects.ElementNotFoundException: Could not find element "#NutritionInformationSlide .m-t13"
        at it.skrape.selects.DomTreeElement.applySelector$html_parser(DomTreeElement.kt:93)
        at it.skrape.selects.CssSelector.applySelector$html_parser(CssSelector.kt:22)
        at it.skrape.selects.CssSelectable.findAll(CssSelectable.kt:36)
        at it.skrape.selects.CssSelectable.findAll(CssSelectable.kt:79)
        at it.skrape.selects.CssSelectable.findAll$default(CssSelectable.kt:78)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel$getNuts$2$2$1$1.invoke(MainActivity.kt:156)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel$getNuts$2$2$1$1.invoke(MainActivity.kt:155)
        at it.skrape.selects.CssSelectable.selection(CssSelectable.kt:15)
        at it.skrape.selects.CssSelectable.invoke(CssSelectable.kt:23)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel$getNuts$2$2$1.invoke(MainActivity.kt:155)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel$getNuts$2$2$1.invoke(MainActivity.kt:154)
        at it.skrape.core.ParserKt.htmlDocument(Parser.kt:120)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel$getNuts$2$2.invoke(MainActivity.kt:154)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel$getNuts$2$2.invoke(MainActivity.kt:153)
        at it.skrape.fetcher.ScraperKt.response(Scraper.kt:87)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel$getNuts$2.invokeSuspend(MainActivity.kt:153)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel$getNuts$2.invoke(Unknown Source:8)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel$getNuts$2.invoke(Unknown Source:4)
        at it.skrape.fetcher.ScraperKt$skrape$1.invokeSuspend(Scraper.kt:43)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
        at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:274)
        at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
        at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source:1)
        at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
        at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source:1)
        at it.skrape.fetcher.ScraperKt.skrape(Scraper.kt:42)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel.getNuts(MainActivity.kt:148)
        at paulor.nutritiontrackerkotlin.MainActivityViewModel$getFood$1.invokeSuspend(MainActivity.kt:169)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
        at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)

I also tried using just "clearer m-t13" as the CSS selector instead of "#NutritionInformationSlide .m-t13", and that also didn't work.
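
One thing worth noting about that second attempt: in CSS, a bare word is a tag name and a space means "descendant", so "clearer m-t13" looks for an `<m-t13>` element inside a `<clearer>` element. Class names need a leading dot, and chaining them without a space requires all of them on the same element. A tiny sketch of building such a selector follows; `classSelector` is a hypothetical helper, and whether the page actually has elements carrying both classes is an assumption that would need checking against the rendered HTML.

```kotlin
// Hypothetical helper: builds a CSS selector matching elements that carry
// ALL of the given classes, e.g. ".clearer.m-t13".
fun classSelector(vararg classNames: String): String =
    classNames.joinToString(separator = "") { ".$it" }

fun main() {
    println(classSelector("clearer", "m-t13")) // prints .clearer.m-t13
}
```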

christian-draeger commented 2 years ago

https://github.com/HtmlUnit/htmlunit/issues/444#issuecomment-1045975262

OK, but that is great news, because it means that fetching and rendering the site works in general.

Now it's probably just about finding the correct CSS selectors. I can try to help on Monday at the latest.

I am waiting for https://github.com/HtmlUnit/htmlunit-android/issues/1 to produce an official release including the fix that makes BrowserFetcher work on Android.