AtlasOfLivingAustralia / galah-R

Query living atlases from R
https://galah.ala.org.au

No option exists to check whether a value exists within a specified field #131

Closed mjwestgate closed 1 year ago

mjwestgate commented 2 years ago

In v1.4.0, we have show_all_fields() to show which fields are valid, and search_fields() to run a search within the output of show_all_fields() (via grepl). show_all_fields() is also used internally when galah_config(run_checks = TRUE) is set, to ensure that user-supplied field names exist in the specified atlas. However, there is no equivalent for values within a field.

Specifically, search_field_values() is conceptually closer to show_all_fields() than to search_fields(), because it returns all possible values for the specified field. The user must then build their own search over the resulting tibble to check whether a given value is present.
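To illustrate the manual step described above, here is a minimal sketch in base R. It does not call galah: `field_values` is a stand-in for whatever search_field_values() returns, the `category` column name is an assumption, and `value_exists()` is a hypothetical helper, not part of the package.

```r
# Stand-in for the result of search_field_values("cl22").
# The real call needs network access; the column names here are assumed.
field_values <- data.frame(
  field = "cl22",
  category = c("New South Wales", "Tasmania", "Victoria"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: case-insensitive check that a value is present
# in the (assumed) `category` column.
value_exists <- function(values, query) {
  any(tolower(values$category) == tolower(query))
}

value_exists(field_values, "tasmania")
value_exists(field_values, "Atlantis")
```

Every user currently has to write something like this themselves, which is the gap this issue describes.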

Potentially improved behavior is:

A further benefit of this approach is that we could support value checking when galah_config(run_checks = TRUE) using show_all_values, in the same way as we currently use show_all_fields.

As an aside, default behaviour for atlas_counts is that setting limit = NULL returns all values, but this doesn't work for search_field_values.

mjwestgate commented 2 years ago

Decision 2022-02-08 to action this, with a caveat that the whole search_ and show_all_ syntax is a little clumsy and may need further thought.

mjwestgate commented 1 year ago

On reflection, there is a further problem here. show_all_values is actually nested within show_all_fields, because show_all_values requires a field argument to return a sensible result. But this nesting is obscured by giving both functions the same prefix. A similar problem occurs with show_all_profiles and search_profile_attributes. Species lists aren't implemented yet, but have the same issue again.

A better choice would be to separate out these 'nested' lookup functions into a new set of functions. Preliminary choice is:

daxkellie commented 1 year ago

Further discussion has led to a possible solution of show_values() and search_values(), with the above functions set as internal to these two value look-up functions. These can be piped from a search result of an accepted information type. For example:

search_all(fields, "cl22") |> show_values()
search_all(fields, "cl22") |> search_values("tasmania")
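As a rough illustration of the piped design, the sketch below mimics the second call without touching galah. It assumes the value search is a case-insensitive grepl match, as search_fields() is described as doing earlier in this thread. `cl22_values` is stand-in data, and `sketch_search_values()` is a hypothetical name, not the package's implementation.

```r
# Stand-in for the values of a single field; not real galah output.
cl22_values <- data.frame(
  field = "cl22",
  category = c("New South Wales", "Tasmania", "Victoria"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for search_values(): keep the rows whose value
# matches the query, ignoring case (assumed grepl-based matching).
sketch_search_values <- function(values, query) {
  values[grepl(query, values$category, ignore.case = TRUE), , drop = FALSE]
}

# The native pipe (R >= 4.1) passes the left-hand side as the first argument.
matches <- cl22_values |> sketch_search_values("tasmania")
nrow(matches) > 0
```

A non-empty result answers the original question of whether a value exists within the specified field.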

mjwestgate commented 1 year ago

Fixed with the addition of show_values() and search_values().