Open nickdickinson opened 1 month ago
@jamiewhths Forgot to mention this. I worked out a couple of options for an implementation of what we discussed.
Option 1 would be easy to implement, BUT only if we stick to our existing varNames() implementation, which limits expansion into cycles. For example, [supervisor].[employee].[supervisor] would not work because we do not revisit forms during variable expansion. If we parse the variable names and do a small rewrite of varNames(), we could expand arbitrarily into cycles, but that would take more effort to code initially.
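To make the "parse the variable names" idea concrete, here is a minimal sketch in base R. Everything in it is an assumption for illustration: `parsePath()`, `resolvePath()` and the toy `schema` are hypothetical and not part of the package, and the real form schema is certainly richer than a list of named vectors. The point is only that walking the path segment by segment lets us revisit a form, which is what the cyclic case needs.

```r
# Hypothetical helper: split "[a].[b].[c]" into c("a", "b", "c").
parsePath <- function(path) {
  matches <- regmatches(path, gregexpr("\\[[^]]+\\]", path))[[1]]
  substr(matches, 2, nchar(matches) - 1)
}

# Toy schema (an assumption): each form maps field names to the form they
# reference, or NA when the field is not a reference field.
schema <- list(
  Employee   = c(supervisor = "Supervisor", name = NA),
  Supervisor = c(employee = "Employee", name = NA)
)

# Hypothetical resolver: follow each segment from the root form.
# Forms may be visited more than once, so cycles are allowed.
resolvePath <- function(schema, rootForm, path) {
  segments <- parsePath(path)
  if (length(segments) == 0) stop("No [segment] found in path: ", path)
  form <- rootForm
  for (i in seq_along(segments)) {
    seg <- segments[[i]]
    fields <- schema[[form]]
    if (!seg %in% names(fields)) {
      stop("Form '", form, "' has no field '", seg, "'")
    }
    if (i < length(segments)) {
      target <- fields[[seg]]
      if (is.na(target)) stop("Field '", seg, "' is not a reference field")
      form <- target
    }
  }
  # The last segment is a field on the form we ended up in.
  list(form = form, field = segments[[length(segments)]])
}

resolved <- resolvePath(schema, "Employee", "[supervisor].[employee].[supervisor]")
```

Here the path leaves Employee, passes through Supervisor and comes back to Employee, so `resolved` points at the `supervisor` field on the Employee form. A misspelled segment fails early with an error naming the form and the field.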
Is there a way for us to do this with the help of the server? For example, send the '[Inhabitant].[Mother].[Full name]' to the server and get back the field id if it is correct? For a whole list of columns?
Option 2 would allow arbitrary expansion into cycles. A bit more effort as we'll be implementing the dplyr verbs but more standardized from an R / data science perspective.
Want to place this in a 4.39 milestone?
For the convenience of the user and to reduce the load on the server, provide an explicit grammar that lets the user expand reference fields, parent records and sub-form records. The current automatic expansion would be used to implement these, but the load on the server would be reduced by returning only what users explicitly ask to see.
There are two potential approaches that can be implemented.
Approach 1: ActivityInfo label/code based selection of columns.
This would be easiest for ActivityInfo users who dabble in scripts. It would be a wrapper for getRecords(), select() and rename() that allows immediately selecting variables in ActivityInfo style, with the label and/or the code or id of each column, and choosing the resulting names.

In pseudo R to get all inhabitants of households with a reference to a Person form for all person fields:
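The snippet that belonged here did not survive, so what follows is only a rough sketch of the idea, not the proposed API. It assumes the records have already been fetched into a flat data frame whose column names are ActivityInfo-style labels; `selectColumns()` and the toy `records` data are hypothetical. Base R stands in for select() and rename() so the example runs without extra packages.

```r
# Toy stand-in (an assumption) for an expanded result of getRecords().
records <- data.frame(
  "[Household].[Code]"   = c("H1", "H1", "H2"),
  "[Person].[Full name]" = c("Ana", "Ben", "Cy"),
  "[Person].[Age]"       = c(34, 7, 51),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical wrapper: pick columns by label and optionally rename them.
# `columns` is a character vector of labels; any names on it become the
# new column names, unnamed entries keep their label.
selectColumns <- function(records, columns) {
  unknown <- setdiff(columns, names(records))
  if (length(unknown) > 0) {
    stop("Unknown column(s): ", paste(unknown, collapse = ", "))
  }
  out <- records[, unname(columns), drop = FALSE]
  newNames <- names(columns)
  if (is.null(newNames)) newNames <- rep("", length(columns))
  unnamed <- newNames == ""
  newNames[unnamed] <- unname(columns)[unnamed]
  names(out) <- newNames
  out
}

inhabitants <- selectColumns(
  records,
  c(household = "[Household].[Code]", name = "[Person].[Full name]")
)
```

With this shape the user only ever types labels they can see in the ActivityInfo UI, and the result has exactly the columns they asked for under the names they chose.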
Approach 2: Tidyverse verbs
This is the best from an R developer / data science perspective. We would implement unnest verbs. These verbs include unnest_wider() for reference fields and parent records, unnest_longer() for sub-form records, potentially unnest_auto() to automatically choose the most appropriate one, and potentially hoist() to be more specific in selection. The example below shows how one can use the tidy select functions to powerfully select exactly which variables are needed.

In pseudo R to get all inhabitants of households with a reference to a Person form for all person fields:
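The original snippet is also missing here, so as a stand-in this is how the existing tidyr verbs behave on a plain nested tibble. This is not ActivityInfo code: the `households` data and its column names are made up, and the proposal would be to offer the same verbs on ActivityInfo records rather than on list-columns. It does show the intended mapping: a reference field behaves like a named list to widen, and a sub-form behaves like a list of records to lengthen.

```r
library(tidyr)

# Made-up data: `village` plays the role of a reference field,
# `inhabitants` plays the role of a sub-form with several records.
households <- tibble(
  code = c("H1", "H2"),
  village = list(
    list(name = "Aba", district = "North"),
    list(name = "Boro", district = "South")
  ),
  inhabitants = list(
    list(list(name = "Ana", age = 34), list(name = "Ben", age = 7)),
    list(list(name = "Cy", age = 51))
  )
)

# Reference field: one new column per referenced field.
wide <- unnest_wider(households, village, names_sep = "_")

# Sub-form: one row per sub-form record, then widen each record.
long <- unnest_longer(wide, inhabitants)
people <- unnest_wider(long, inhabitants, names_sep = "_")

# hoist(): pull out just one referenced field instead of all of them.
hoisted <- hoist(households, village, district = "district")
```

`people` ends up with one row per inhabitant and the household and village columns repeated alongside. From there the usual tidy select helpers, for example `dplyr::select(people, code, starts_with("inhabitants_"))`, narrow the result to exactly the variables needed.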
Principles:
Combined
Both approaches are compatible and could be chained. It is mainly about exposing the most useful API.