AtlasOfLivingAustralia / galah-R

Query living atlases from R
https://galah.ala.org.au

[BUG] galah_select(group = "assertions") causes error #137

Closed stewartmacdonald closed 1 year ago

stewartmacdonald commented 2 years ago

Describe the bug: In an effort to debug this issue, I seem to have stuffed something up, because I can no longer download any records with galah. This bug report is therefore written from memory.

Using the example code works fine:

result <- galah_call() |>
    galah_identify("Litoria") |>
    galah_filter(year >= 2020, cl22 == "Tasmania") |>
    galah_select(basisOfRecord, group = "basic") |>
    atlas_occurrences()

But changing this to group = "assertions" causes an error about the $ operator being invalid for atomic vectors.

galah version 1.4.0

To reproduce: Steps to reproduce the behaviour:

> library('galah')
> galah_config(email = "email@example.com")

> result <- galah_call() |>
    galah_identify("Litoria") |>
    galah_filter(year >= 2020, cl22 == "Tasmania") |>
    galah_select(basisOfRecord, group = "assertions") |>
    atlas_occurrences()
Error: $ operator is invalid for atomic vectors
In addition: Warning message:
We didn't detect a field to search for.
ℹ Try entering text to search for matching fields.
ℹ To see all valid fields, use `show_all_fields()`. 

Expected behaviour: I expect to get a list of records along with their associated assertion data.

Additional context: This bug appears to come from the preset_cols() helper function in galah_select.R. Changing this line:

"assertions" = search_fields(type = "assertions")$id)

to:

"assertions" = show_all_fields(type = "assertions")$id)

seems to fix the issue. I have submitted pull request https://github.com/AtlasOfLivingAustralia/galah/pull/135 that fixes this bug and expands on the documentation, but I have no idea what I'm doing.

daxkellie commented 1 year ago

It looks like there is still an issue with downloads when using galah_select(group = "assertions").

As an example, this query returns an error:

library(galah)
library(magrittr)

galah_config(email = "your_email_here")

galah_call() %>% 
  galah_identify("animalia") %>% 
  galah_identify("https://biodiversity.org.au/afd/taxa/3cbb537e-ab39-4d85-864e-76cd6b6d6572", search = FALSE) %>% 
  galah_filter(basisOfRecord == "PRESERVED_SPECIMEN", 
               year == 2022) %>%  # Limiting to 2022 for now
  galah_select(group = "assertions") %>%  
  atlas_occurrences() 
#> Calling the API failed for `atlas_occurrences`.
#> ℹ This might mean that the selected system is down. Double check that your query is correct.
#> ℹ If you continue to see this message, please email support@ala.org.au.
#> # A tibble: 0 × 0

Created on 2022-10-31 with reprex v2.0.2

mjwestgate commented 1 year ago

This looks like a bug in the code for passing assertions to atlas_occurrences, or more specifically, inside the subfunction build_assertion_columns (stored in utilities_internal.R), which looks like this:

build_assertion_columns <- function(col_df) {
  if (nrow(col_df) == 0) {
    return("none")
    # all assertions have been selected
  } else if (nrow(col_df) == 107) {
    return("includeall")
  }
  paste0(col_df$name, collapse = ",")
}

This function checks whether ‘all’ assertions are requested by testing whether 107 rows have been passed. Since that was written, the ALA has updated its assertions, and there are now 116. As a result, the check for 107 rows never matches, and atlas_occurrences concatenates all assertion names into the field arg instead of simply passing &qa=includeall, which is the preferred approach.
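To illustrate the failure mode described above, here is a minimal sketch, not the fix that was shipped. It assumes a hypothetical variant of the function, build_assertion_columns_sketch, that receives the number of available assertions from the caller rather than hard-coding it. The data frame of assertion names is made up for the example.

```r
# Hypothetical variant of build_assertion_columns(): the total number of
# available assertions is supplied by the caller instead of being hard-coded.
build_assertion_columns_sketch <- function(col_df, n_available) {
  if (nrow(col_df) == 0) {
    # no assertions were selected
    return("none")
  }
  if (nrow(col_df) == n_available) {
    # every available assertion was selected
    return("includeall")
  }
  # otherwise, pass the selected assertion names as a comma-separated string
  paste0(col_df$name, collapse = ",")
}

# Toy data standing in for the real assertion list (names are made up)
all_assertions <- data.frame(name = paste0("ASSERTION_", 1:116))

# All 116 rows selected: the check adapts to the current total
build_assertion_columns_sketch(all_assertions, n_available = nrow(all_assertions))

# A subset selected: names are concatenated
build_assertion_columns_sketch(all_assertions[1:2, , drop = FALSE], n_available = 116)

# Nothing selected
build_assertion_columns_sketch(all_assertions[0, , drop = FALSE], n_available = 116)
```

With a hard-coded 107, the first call would fall through to the paste0() branch and produce one long string of 116 names, which matches the behaviour reported above.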

A better solution is to record which groups are passed by galah_select (possibly using a call attribute?) and detect that within atlas_occurrences.
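One way the attribute idea could look is sketched below. This is only a rough illustration under stated assumptions: select_sketch and assertion_arg_sketch are hypothetical stand-ins, not galah functions, and the attribute name "group" is an assumption chosen for the example.

```r
# Hypothetical stand-in for galah_select(): returns the selected assertion
# names and records the requested groups in a "group" attribute.
select_sketch <- function(assertion_names = character(), group = character()) {
  out <- data.frame(name = assertion_names)
  attr(out, "group") <- group
  out
}

# Hypothetical stand-in for the check inside atlas_occurrences(): it looks at
# the recorded groups instead of counting rows.
assertion_arg_sketch <- function(select_df) {
  if ("assertions" %in% attr(select_df, "group")) {
    # the whole assertions group was requested
    return("includeall")
  }
  if (nrow(select_df) == 0) {
    return("none")
  }
  paste0(select_df$name, collapse = ",")
}

# Whole group requested: no dependence on how many assertions exist
assertion_arg_sketch(select_sketch(group = "assertions"))

# Individual assertions requested (names are made up)
assertion_arg_sketch(select_sketch(assertion_names = c("ASSERTION_1", "ASSERTION_2")))

# Nothing requested
assertion_arg_sketch(select_sketch())
```

A design caveat: custom attributes on a data frame can be dropped when it is subset or rebuilt, so any real implementation would need to make sure the attribute survives between the two functions.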

mjwestgate commented 1 year ago

Fixed as of version 1.5.1.