ashbythorpe / selenider

Concise, Lazy and Reliable Wrapper for 'chromote' and 'selenium'
https://ashbythorpe.github.io/selenider/

View selenider_elements gives unhelpful warning #24

Open artmg opened 4 months ago

artmg commented 4 months ago

In RStudio, when I try to view a selenider_element, it behaves as expected. When I try to view a collection of selenider_elements returned by ss(), it does not: View() emits the warnings shown below.

> rstudioapi::versionInfo()$version
[1] ‘2024.4.2.764’
> Sys.getenv("R_PLATFORM")
[1] "aarch64-apple-darwin23.4.0"
> installed.packages()["selenider","Version"]
[1] "0.4.0"
> open_url(url)

> img_element <- s(xpath = '//*[@id="document"]/article/section[1]/div[3]/div[1]/img')
> img_elements <- ss(xpath = '//*[@id="document"]/article/section[1]/div[3]//img')
> class(img_element)
[1] "selenider_element"
> View(img_element)

> class(img_elements)
[1] "selenider_elements" "list"              
> View(img_elements)
Warning messages:
1: In `__OBJECT__`[["element"]] :
  `i` must be a whole number, not the string "element".
2: In `__OBJECT__`[["driver_id"]] :
  `i` must be a whole number, not the string "driver_id".
3: In `__OBJECT__`[["driver"]] :
  `i` must be a whole number, not the string "driver".
4: In `__OBJECT__`[["session"]] :
  `i` must be a whole number, not the string "session".


By contrast, another object collection, xml2's xml_nodeset, is shown in the viewer as I would expect:

> content <- read_html(s(xpath = '//*[@id="document"]/article/section[1]/div[3]'))
> imgs <- content |> 
+     rvest::html_elements("img")
> 
> class(imgs)
[1] "xml_nodeset"
> View(imgs)


I'm only just learning how S3 objects are created, and I'm still getting to grips with object collections in R, so there's no point in me diving into the package code myself. Any ideas?

ashbythorpe commented 4 months ago

This might take a little while to explain.

First of all, a selenider_elements object is not a list; it's an object that acts like a list. This is because all selenider elements are lazy: they store the directions to an element rather than the element itself (see the README for why we do this). When we print or use the element(s), they are collected from the page.

This means we can't store collections as a list since, among other reasons, we don't actually know how many elements there are. At one moment, there could be five elements that match your selection, but at another moment a sixth could be added to the page.

For this reason, we have a special selenider_elements object that stores an unknown number of elements. In most situations, you can just pretend this is a list. For example, you can subset it ([[ and [), which will return another lazy element or element collection. However, this means that the underlying structure of the object is not the plain list of elements you might expect.
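
To make the idea concrete, here is a minimal base R sketch of a lazy collection. It is a toy, not selenider's actual implementation: the class name lazy_collection, the constructor new_lazy_collection() and the field find are all hypothetical. The object stores only a function that looks the elements up, and `[[` accepts whole numbers only, which is roughly the kind of check the warnings in the report point to when something indexes the object by name.

```r
# Hypothetical toy class, NOT selenider's real code.
# The object stores directions (a function), not the elements themselves.
new_lazy_collection <- function(find) {
  structure(list(find = find), class = "lazy_collection")
}

# The length is recomputed on every call, so it can change over time.
length.lazy_collection <- function(x) {
  length(unclass(x)$find())
}

# Positional subsetting only: indexing by name is rejected.
`[[.lazy_collection` <- function(x, i) {
  if (!is.numeric(i)) {
    stop("`i` must be a whole number, not ", class(i)[1], ".")
  }
  unclass(x)$find()[[i]]
}

# Stand-in for a web page whose contents can change.
page <- c("a.png", "b.png")
imgs <- new_lazy_collection(function() page)

length(imgs)   # 2
imgs[[1]]      # "a.png"

page <- c(page, "c.png")  # the "page" gains an element
length(imgs)   # 3, because the lookup is re-run

# Indexing by field name fails, even though a field called "find" exists.
res <- tryCatch(imgs[["find"]], error = function(e) conditionMessage(e))
```

A tool that walks the object as if it were a named list would trip over the last case, while ordinary positional use keeps working.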

Finally, the warning from View() is something I want to fix when I get the chance.

Hopefully this makes sense? Let me know if you don't understand anything.

Also, about xml_nodeset: the reason this one actually works is that it essentially is a list. Since it works on static HTML/XML, there's no danger of the number of elements changing or anything like that. The only reason it has a different class is so that methods like xml_find_all() work on element collections as well as elements. Generally, you shouldn't expect S3 objects that act like lists to necessarily be lists under the hood.
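
As a rough illustration of the list-backed case, the sketch below builds a toy S3 class on top of an ordinary list and inspects it with base R tools. The class name toy_nodeset and the sample strings are made up for this example. It only mirrors the description above and is not xml2's code.

```r
# Hypothetical list-backed class: the elements themselves are stored.
nodes <- structure(
  list("<img src='a.png'>", "<img src='b.png'>"),
  class = "toy_nodeset"
)

is.list(nodes)           # TRUE: the underlying type is a list
length(unclass(nodes))   # 2: stripping the class leaves the elements
nodes[[2]]               # "<img src='b.png'>", via the default `[[`

# str() on the unclassed object shows what is really stored.
str(unclass(nodes))
```

Running unclass() and str() on an unfamiliar S3 object is a quick way to check whether it holds its elements directly or only the information needed to find them.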

artmg commented 4 months ago

Thank you for your quick response and for explaining why selenider_elements is a class of dynamic length. That now makes sense, and I will account for it in my calling code. I have retitled this issue to focus on the small part you want to change, and I will think about suggestions for clarifying the nature of the class in your package docs.

artmg commented 4 months ago

I've added one mention, but before I PR doc_24 I'll see if I can find any other places to capture what you explained above.