skrapeit / skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
https://docs.skrape.it
MIT License

[FEATURE] Access to parent element #132

Closed: NusretOzates closed this issue 3 years ago

NusretOzates commented 3 years ago

Is your feature request related to a problem? Please describe.
I am trying to extract the main content of an HTML file. To do that, I need to find all "p" tags and their parent elements. Currently this is not possible, because the "element" property of the DocElement class is private.

Describe the solution you'd like
Making the "element" property non-private could solve the problem, but I guess that would break the style of this library, because "element" is a jsoup object.

Thanks for the library, using it feels so cool! 👍

christian-draeger commented 3 years ago

Hey, sorry for the late response. I will check today whether this is already possible somehow. If not, I will add a feature to access parents :)

NusretOzates commented 3 years ago

Thanks a lot :)

christian-draeger commented 3 years ago

Hey @NusretOzates, I added some stuff to get the parent, children and siblings of an element. Usage can be seen here: 855560b. I will update the readme soon, and once I have managed to solve #123 it will be published to Maven Central. Thanks for the input and for pointing out this missing feature :)

FYI: you should be able to call element on any DocElement to convert it back to a jsoup Element. But you will then lose the DSL, or have to wrap it in a DocElement again. The code will not look as smooth as pure skrape{it} DSL, but it should work :)
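For illustration, here is a minimal sketch of the jsoup side of that workaround: once you have a jsoup Element (for example via the element property mentioned above), its parent is available through jsoup's own parent() call. The sketch uses plain jsoup only, so it makes no assumptions about the skrape{it} DSL. The helper name parentsOfParagraphs and the sample HTML are hypothetical and not part of either library.

```kotlin
import org.jsoup.Jsoup
import org.jsoup.nodes.Element

// Hypothetical helper (not part of skrape{it} or jsoup):
// collects the distinct parent elements of every <p> tag in the document.
fun parentsOfParagraphs(html: String): List<Element> =
    Jsoup.parse(html)
        .select("p")                 // all <p> elements in document order
        .mapNotNull { it.parent() }  // jsoup's Element.parent(); skip elements without a parent
        .distinct()                  // two <p> siblings share one parent, keep it once

fun main() {
    // Made-up sample markup for the example.
    val html = """
        <html><body>
          <div id="main"><p>first</p><p>second</p></div>
          <aside id="side"><p>third</p></aside>
        </body></html>
    """.trimIndent()

    // Print tag name and id of each parent that contains at least one <p>.
    parentsOfParagraphs(html).forEach { parent ->
        println("${parent.tagName()}#${parent.id()}")
    }
}
```

Whether to stay at the jsoup level like this or wrap the result in a DocElement again depends on how much of the DSL you still need afterwards.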

NusretOzates commented 3 years ago

Looks great thanks a lot!