bedatadriven / activityinfo-R

ActivityInfo R Language Client
https://www.activityinfo.org/support/docs/R/

Function to retrieve same columns as presented in the user interface #58

Closed akbertram closed 1 year ago

akbertram commented 1 year ago

The user interface uses a specific set of heuristics to produce a flat table from a form with references. In many cases, those using R want to refer to the same set of columns and column names.

We need a way to query this form specifically, perhaps a separate function.

Ryo-N7 commented 1 year ago

should this be included in the new getRecords() as an optional argument?

tbh i would rather have this quick fix in queryTable() first as this issue is more of an immediate need than the big evolutions that we're doing for getRecords()

nickdickinson commented 1 year ago

Ryo, which quick fix are you referring to for queryTable? #55? Or something else like the reference columns?

Ryo-N7 commented 1 year ago

this quick fix as in the one labelled in this issue >> have a function that grabs the table, with column names (including references) exactly as we see them in the User Interface

Ryo-N7 commented 1 year ago

as shown by @nickdickinson today at our meeting, this is nearly done. just have to take out the [ ] and replace . with spaces so column names are exactly as the UI
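
For reference, that cleanup can be sketched in base R. The column names below are made up for illustration and the real API output may look different; `toUiNames()` is a hypothetical helper, not part of the package.

```r
# Hypothetical column names, only for illustration.
api_names <- c("Regional.ME.Staff.[ID]", "CSO.Name", "When.were.you.born?")

# Hypothetical helper: drop square brackets, then turn dots into spaces.
toUiNames <- function(x) {
  x <- gsub("[", "", x, fixed = TRUE)   # remove literal "["
  x <- gsub("]", "", x, fixed = TRUE)   # remove literal "]"
  x <- gsub(".", " ", x, fixed = TRUE)  # literal dots become spaces
  trimws(x)
}

ui_names <- toUiNames(api_names)
print(ui_names)
```

Using `fixed = TRUE` keeps `.` and `[` from being read as regular-expression syntax.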

nickdickinson commented 1 year ago

So far: [two images]

There is the extra ID column for the reference field. Is that an issue? If so, I'll have to tell it to exclude those fields when getting the table exactly as the UI.

Also, I do not know how to request the record ids in case you would want to know those. It seems those are given as an @id column when giving an empty query for the columns, but otherwise that @id format does not work to request the record id...

Ryo-N7 commented 1 year ago

I forgot what you showed me but this does mean i can pass "CSO Name" and other arguments TO queryTable() and be able to grab these columns from the API, right?

ex. in form "blah", Regional ME Staff is a reference field pointing to the form "Regional ME Staff", the key field is set as ID. Therefore, in form "blah" in the UI, it shows up as "Regional ME Staff ID". I need to be able to query for "Regional ME Staff ID" and still get the proper field even though in the API an actual field named "Regional ME Staff ID" doesn't exist (it's just a representation of the "ID" field in "Regional ME Staff" in the UI).

queryTable(formId = "blah", columns = "Regional ME Staff ID") (or whatever)

nickdickinson commented 1 year ago

In the latest dev version 4.33

queryTable(form = fmTree) will give the columns as in my comment above and like the UI with the addition of the reference ID column. No need to specify them.

queryTable(form = fmTree$root) with the form ID will provide the API default with the record IDs.

I've created a new helper function uiColumns(x, select) to help you select using what you see in the UI:

queryTable(form = fmTree, columns = uiColumns(fmTree, select = "When were you born?"))

This is a quick workaround. I think the better solution in the long-run is the tidy-select pattern with lazy data frame. Something like:

getRecords(form, asUI = TRUE) %>% select() %>% pull()

nickdickinson commented 1 year ago

v4.33 is ready for testing @Ryo-N7. I've changed uiColumns() to prettyColumns() so that is the main difference at the moment. I'm working now on a new branch on the lazy df implementation. I've also reverted the behavior of queryTable to defaults but you can still configure it to not make.names and to use a tibble if you want. I will initially be using queryTable() anyway to implement pull() for getRecords().

queryTable(form = fmTree, columns = prettyColumns(fmTree, select = "When were you born?"))

formSchemaFromData() is the function to create a new Schema. Here's an example:

formSchemaFromData(
  x = tibble(
    a = 1:5,
    b = factor(paste0(1:5, "_stuff")),
    a_logical_column = 1:5 == 4,
    date_col = seq(as.Date("2021-07-06"), as.Date("2021-07-10"), by = 1)
  ),
  databaseId = "Some database",
  label = "My new form schema!!",
  keyColumns = "b",
  logicalAsSingleSelect = FALSE
)

Ryo-N7 commented 1 year ago

great, thanks I'll get to work then

Ryo-N7 commented 1 year ago

found a small typo: Line 321 in forms.R

stopifnot("logicalAsSingelSelect must be TRUE or FALSE" = is.logical(logicalAsSingleSelect))
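
For context, a minimal sketch of how that check behaves once the spelling is fixed, assuming R 4.0.0 or later, where `stopifnot()` uses an argument's name as the error message. `checkFlag()` is a hypothetical wrapper for illustration, not a function in forms.R.

```r
# Hypothetical wrapper around the corrected check.
checkFlag <- function(logicalAsSingleSelect) {
  # The name on the left of "=" is what the user sees when the check fails.
  stopifnot(
    "logicalAsSingleSelect must be TRUE or FALSE" = is.logical(logicalAsSingleSelect)
  )
  invisible(TRUE)
}

# A non-logical value triggers the error; capture its message.
msg <- tryCatch(checkFlag("yes"), error = function(e) conditionMessage(e))
print(msg)
```

The typo only affects the text of the error message, not the check itself.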

Ryo-N7 commented 1 year ago

i'm getting some errors when i try to use addForm() with the form schema object i created with formSchemaFromData():

larlar <- tibble(a = 1:5, b = factor(paste0(1:5, "_stuff")), a_logical_column = 1:5==4, date_col = (seq(as.Date("2021-07-06"),as.Date("2021-07-10"),by = 1)))

larlarschm <- formSchemaFromData(
  x = larlar, 
  databaseId = "c34mde8ldvep9q515", label = "My new form schema!!", 
  keyColumns = "b", 
  logicalAsSingleSelect = FALSE) 

upform_res1 <- addForm(larlarschm$schema)

[image: formupdate]

nickdickinson commented 1 year ago

Thanks for testing @Ryo-N7. I think I was in a bit of a rush so I'll try this out and add a test or two to the branch you are on.

nickdickinson commented 1 year ago

@Ryo-N7 I've updated branch v4.33 with the fixes to formSchemaFromData() which now has a working test in place. There were a few quirks here and there (for example no support for factors in importTable) but I think I've addressed all I could find. Hopefully you can find some time to try it out again. FYI, I also merged a very basic implementation of getRecords(). Right now it works to look at the table and collect() it. But much of the scaffolding is there. By the end of the week, we should have a simple version of selecting, slicing and sorting in place. Possibly some filters but still in the activityinfo style.

Ryo-N7 commented 1 year ago

hey @nickdickinson getting this error on install:

Error: object 'tidyselect_data_has_predicates' is not exported by 'namespace:tidyselect'

[image: installerr]

nickdickinson commented 1 year ago

That's weird, I cannot replicate it.... maybe try to update tidyselect?
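
One way to check this before reinstalling, sketched in base R: ask whether the installed tidyselect actually exports the object named in the error. `hasExport()` is a hypothetical helper written for this example, and no minimum tidyselect version is assumed here.

```r
# Hypothetical helper: TRUE only if `pkg` is installed and exports `name`.
hasExport <- function(pkg, name) {
  requireNamespace(pkg, quietly = TRUE) &&
    name %in% getNamespaceExports(pkg)
}

# If the export is missing, the installed tidyselect is probably too old.
if (!hasExport("tidyselect", "tidyselect_data_has_predicates")) {
  message("tidyselect looks missing or outdated; try install.packages(\"tidyselect\")")
}
```

After updating, restart the R session before reinstalling the client so the new tidyselect is the one that gets loaded.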

Ryo-N7 commented 1 year ago

ah great, that did it ... thanks!

Ryo-N7 commented 1 year ago

yup , can confirm it works for the simple use case:

[images: newf-succ2, newf-succ]

will now spend some time playing around with it a bit more using more complicated spreadsheets // data.frames

nickdickinson commented 1 year ago

queryTable now supports queryTable(fmTree, columns = prettyColumns(fmTree, select=c("Column1", "Column2")))

If you want to getRecords(fmTree, style = prettyColumnStyle()) then you can use the dplyr select verb for renaming, etc. and then collect(), which may be a bit more concise.

Ryo-N7 commented 1 year ago

hey, i think switching over to the new columnStyle() functions sort of muddled things a bit:

this is what the labels look like in the UI for a table i want to grab: [image: actual-ui]


this:

  res_all <- getRecords(form = table_id, style = prettyColumnStyle()) %>% 
    collect()

gives me this:

[image: res-all]

and this:

  res_labels <- getRecords(form = table_id, style = allColumnStyle(columnNames = "label")) %>% 
    collect()

gives me this:

[image: res-labels]

finally, this:

  res_ids <- getRecords(form = table_id, style = allColumnStyle(columnNames = "id")) %>%
    collect()

gives me this: [image: res-ids]

2nd one is supposed to be what it should be but the . between the words // replacing the blank spaces -- have been added back in so i can't match by the exact UI column name which doesn't have the .

unless i need to add that make.names argument but that's only in queryTable() and i can't find it in the new getRecords() + columnStyle() functions
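
The dots look like what base R's `make.names()` produces, which `data.frame()` applies by default. Whether getRecords() goes through that same path is an assumption, not something confirmed in this thread. A small base-R sketch with made-up labels and values:

```r
# Made-up UI labels, only for illustration.
ui_labels <- c("participant name", "date of birth")

# make.names() replaces each space with a dot.
syntactic <- make.names(ui_labels)

record <- setNames(list("A. Example", "2000-01-01"), ui_labels)
checked <- data.frame(record)                       # default check.names = TRUE
as_is   <- data.frame(record, check.names = FALSE)  # labels kept verbatim

# Possible stopgap after collect(): turn dots back into spaces.
# Lossy if a label legitimately contains a dot.
restored <- checked
names(restored) <- gsub(".", " ", names(restored), fixed = TRUE)

print(names(checked))
print(names(as_is))
print(names(restored))
```

The rename at the end is only a workaround on the collected data frame; it does not change what the package returns.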

nickdickinson commented 1 year ago

I will need to create a test case to see what is happening. It looks like registry_participant is becoming participant_name... Is one the label of the reference table and the other the reference field label?

The second and first one are actually identical just a . has been put in instead of a space I think.

Ryo-N7 commented 1 year ago

yeah registry_participant participant_name and the DOB one are both reference fields, coming from the registry_participant table. the key fields set are the participant_name and DOB.

the label of the field in the form itself is also participant_name so maybe that's why it's getting confused?

[image: part-name-lab]

in the parent form: [image: partname-parent]

nickdickinson commented 1 year ago

Thanks, that is super helpful. I'll work out a test case and fix that.

nickdickinson commented 1 year ago

@akbertram @Ryo-N7 Let's talk about this issue. I've not been able to figure out when the user web interface chooses to use the form field label and when it chooses to use the referenced form label. It seems to use the first more often but sometimes when I add a referenced field, it then uses the form field label. What is the logic?