inbo / camtrapdp

R package to read and manipulate Camera Trap Data Packages (Camtrap DP)
https://inbo.github.io/camtrapdp/
Other
3 stars 0 forks source link

Should `read_camtrapdp()` also read custom/non tabular data to `x$data`? #90

Open PietrH opened 5 days ago

PietrH commented 5 days ago

While working on the print method I came across the fact that there are less x$data objects than x$resources for example_dataset():

example_dataset() currently includes one non standard resource: individuals, currently this resource is not read in by read_camtrapdp() at all.

This is certainly possible because it's a simple json array, very quick and dirty:

library(camtrapdp)
get_custom_resources <- function(x) {
  resources <- frictionless::resources(x)
  custom_resources <- resources[!resources %in% names(purrr::pluck(x, "data"))]
  x$resources[purrr::map_lgl(
    purrr::pluck(x, "resources"),
    ~ purrr::pluck(.x, "name") == custom_resources
  )] %>%
    purrr::map(~ purrr::pluck(.x, "data")) %>%
    purrr::set_names(custom_resources)
}
get_custom_resources(example_dataset())
#> $individuals
#> $individuals[[1]]
#> $individuals[[1]]$id
#> [1] 1
#> 
#> $individuals[[1]]$individualName
#> [1] "Reinaert"
#> 
#> $individuals[[1]]$scientificName
#> [1] "Vulpes vulpes"

Created on 2024-07-03 with reprex v2.1.0

We'd also have to account for custom tabular data I suppose?

Is this something we want to support? It certainly feels convenient, but might be a hassle to maintain because it seems quite open ended. Opinions?

peterdesmet commented 4 days ago

I would't read custom resources:

  1. Talking with the InsectAI people, they might want to include several (big) extra resources, which could really bog down read_camtrapd() if they were all loaded all the time. I think we should stick with the camtrapdp official resources.
  2. Extra resources can be loaded by the user on request with frictionless::read_resource()
PietrH commented 4 days ago

Alrighty! I would suggest making note of this behaviour (we don't read everything, only default tables) in the documentation of read_camtrapdp()

peterdesmet commented 4 days ago

Ok, please do. 👍

PietrH commented 4 days ago

@damianooldoni We discussed the read_camtrapdp() behaviour here

PietrH commented 4 days ago

We were talking about it over lunch, is reading the other resources in a lazy way, or via promises perhaps?

peterdesmet commented 4 days ago

I still consider reading these out of scope. The user of camtrapdp doesn't know data are attached to x$data, so having data attached there isn't necessarily useful. I would inform of the additional resources in print() (working on updating the PR) and indicate that those can be loaded with frictionless::read_resource().

PietrH commented 4 days ago

frictionless::read_resource() currently only supports tabular resources

peterdesmet commented 4 days ago

Indeed, thus not something we control (and should implement in) camtrapdp. Informing additional resources exist is sufficient.