GuiFabre opened this issue 1 year ago
Documents updated from the test for CaG on the Canpath server, in the folder #To test avant-apres#. It blocks on the same variable, I think.
Line 83 in the DPE has an error: in input_variables, the variable SY09Bis_SQ01 does not exist, so the code fails.
Update: the errors are caught, so the code can keep processing. It does not say what the error is, but it passes through it.
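For illustration, a minimal sketch (not the package's actual code) of the behaviour described above: each variable is processed inside a `tryCatch`, so a failure on one variable (e.g. the missing SY09Bis_SQ01) is swallowed and the loop continues, but the error message is never reported. The `process_variable` helper and the variable list are hypothetical.

```r
library(purrr)

# hypothetical per-variable processing step
process_variable <- function(var_name) {
  if (var_name == "SY09Bis_SQ01") stop("variable not found in input_variables")
  paste("processed", var_name)
}

# errors are caught so processing keeps going, but the message is discarded;
# printing conditionMessage(e) in the handler would make the failure visible
results <- map(
  c("SY01", "SY09Bis_SQ01", "SY10"),
  ~ tryCatch(process_variable(.x), error = function(e) NA)
)
```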
(Following the discussion on variables that were "impossible" but that failed.) This is resolved after updating to last Friday's version; it works well right now.
batch 1 of improvements:
@a-trottier
Adding the column -> OK
Separate by type of harmo_rule -> must be left to the user end. Some might separate by dataset, others by dataschema variable, others by rule, or by harmo status (to check the ones that have failed). If you want to separate them (a short sketch of both options follows):
OPTION 1: use `avant_apres_output %>% group_split(harmo_rule)`
or filter your data processing elements on a specific rule, or do an lapply, sapply, or walk across the grouped data processing elements.
OPTION 2: add an additional parameter split_by, which can be either NULL or "dataset" by default (because the input is a list of datasets). That allows flexibility. If we go with option 2, we must decide whether each of the tibbles has the same structure or not.
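A minimal sketch of both options, assuming `avant_apres_output` is the tibble returned by the before/after comparison and `harmo_rule` one of its columns (names taken from this discussion); the `avant_apres()` wrapper and its `split_by` argument are hypothetical, not an existing API.

```r
library(dplyr)

## OPTION 1: the user splits the output themselves
by_rule <- avant_apres_output %>% group_split(harmo_rule)

## ...or works on the grouped data processing elements directly
avant_apres_output %>%
  group_by(harmo_rule) %>%
  group_walk(~ print(head(.x)))

## OPTION 2 (hypothetical signature): an extra split_by parameter,
## NULL by default, or "dataset" since the input is a list of datasets
avant_apres <- function(dataset_list, split_by = NULL) {
  out <- bind_rows(dataset_list, .id = "dataset")  # placeholder for the real comparison
  if (is.null(split_by)) return(out)
  out %>% group_split(across(all_of(split_by)))
}
```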
In a nutshell, the tibble generated has column names that depend on the input provided. Additional columns are created after all of the variables are analysed. Hence, the output is different for direct_mapping-only variables than for case_when (involving more than one input variable). We must decide whether the output is different for each group (both designs are illustrated in a toy example after the two snippets below). The underlying function will be
`tbl %>% group_split(var) %>% lapply(function)`
## each tibble is (possibly) different, with dedicated columns. bind_rows() (possibly) does not work.
or, if it is the same, the underlying function will be
`tbl %>% function %>% group_split(var)`
## each tibble is the same, with (possibly) useless columns. bind_rows() works.
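A toy illustration of the two designs above; `tbl`, `var`, and the per-group `process()` function are placeholders standing in for the snippets, not the real implementation.

```r
library(dplyr)

# toy data standing in for `tbl`
tbl <- tibble(var = c("a", "a", "b"), value = c(1, NA, 3))

# placeholder for the per-variable analysis
process <- function(x) mutate(x, n_missing = sum(is.na(value)))

# Design 1: split first, then process each group independently;
# in the real case each tibble could carry its own dedicated columns.
out1 <- tbl %>% group_split(var) %>% lapply(process)

# Design 2: process the whole table once, then split;
# every tibble shares the same columns (some possibly unused).
out2 <- tbl %>% process() %>% group_split(var)
```

With design 2, `bind_rows(out2)` trivially reassembles the table; with design 1 the groups may not share columns, which is the concern raised above.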
Add label when available (or when it differs from the name, actually) -> OK
Then we have another question pending: does it make sense to compare datasets with this tool?
When we first talked about it, it was to compare datasets, not dossiers. But it is a good question. To discuss.
testing of a function:
Run tests until consensus. Change the life cycle if needed ('stable').