avant_apres_harmo() - GitHub issues

GuiFabre commented 1 year ago

testing of a function:

[ ] run the function :
- [x] with demo examples (common usage)
- [ ] with demo examples (bug scenarios: combine errors across each parameter)
- [ ] with real data
- [ ] try different combinations (common usage, other usages, deliberate errors, exception handling) (if possible)
[ ] validation of the function component
- [ ] validate/propose name of the function
- [ ] validate/propose parameters names (need more/less params?)
- [ ] validate/propose output generated
[ ] generate documentation (mandatory in bold)
- [ ] title
- [ ] description
- [ ] details
  - [ ] each parameter has its own detail
- [ ] param
- [ ] returns
- [ ] seealso
- [ ] examples
[ ] include in the package
- [ ] add life cycle information (if needed, 'experimental') and update the life cycles of others ('superseded', 'deprecated')
- [ ] test for CRAN
- [ ] add comment in NEWS.Rmd
- [ ] change website (if needed : references, vignette(s), main page...)
[ ] communicate function availability

Run tests until consensus. Change life cycle if needed ('stable')

a-trottier commented 11 months ago

Documents updated from the test for CaG on the Canpath server, in the folder #To test avant-apres#. It blocks on the same variable, I think.

GuiFabre commented 11 months ago

Line 83 of the DPE has an error: in input_variables, the variable SY09Bis_SQ01 does not exist, so the code fails.

Update: the errors are caught, so the code can keep processing. It does not say what the error is; it just passes over it.
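
To illustrate the point about errors being caught but not reported, here is a minimal sketch in base R of a wrapper that keeps processing while retaining the error message. The names `apply_rule` and `missing_var_rule` are hypothetical and are not part of the package; the toy dataset is only there to reproduce a missing input variable.

```r
# Hypothetical helper (not the package's code): run one harmonization step,
# and on failure keep the error message instead of silently skipping it.
apply_rule <- function(rule_fun, data) {
  tryCatch(
    list(status = "ok", value = rule_fun(data), message = NA_character_),
    error = function(e) {
      list(status = "error", value = NULL, message = conditionMessage(e))
    }
  )
}

# Toy input: the dataset lacks the variable the rule asks for.
dataset <- data.frame(SQ01 = c(1, 2, 1))
missing_var_rule <- function(d) {
  if (!"SY09Bis_SQ01" %in% names(d)) {
    stop("input variable 'SY09Bis_SQ01' does not exist")
  }
  d[["SY09Bis_SQ01"]]
}

res <- apply_rule(missing_var_rule, dataset)
res$status   # the step is flagged as failed
res$message  # the original error text, available for a report column
```

Storing the message next to the status would let a user filter on failed rules and read why each one failed, without stopping the whole run.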

a-trottier commented 11 months ago

(Following the discussion on variables that were "impossible" but that failed.) This is resolved after updating to last Friday's version. It works well right now.

a-trottier commented 7 months ago

batch 1 of improvements:

[ ] adding a column with the harmo_rule
[ ] separate by type of harmo_rule: the output is a list of tibbles (recode, case_when, operation, other, and all the others together), each created only when present, because there is no point having an empty tab for operation if there is no harmo_rule of this type
[ ] adding a parameter to add the label to the value (when available), ideally something like "1 = male" instead of just "1" or just "male"
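
The label request in the last item could look something like the sketch below. It is an assumption about the desired behaviour, not the package's implementation: `format_with_label` is a hypothetical helper, and the labels are assumed to arrive as a named character vector whose names are the coded values.

```r
# Hypothetical helper: show "value = label" when a label exists,
# and fall back to the bare value otherwise.
# `labels` is assumed to be a named character vector (names = coded values).
format_with_label <- function(values, labels) {
  values <- as.character(values)
  lab <- unname(labels[values])   # NA where no label is available
  ifelse(is.na(lab), values, paste(values, lab, sep = " = "))
}

sex_labels <- c("1" = "male", "2" = "female")
formatted <- format_with_label(c(1, 2, 9), sex_labels)
formatted
```

Here the value 9 has no label, so it is shown as is, which matches the "when available" condition.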

GuiFabre commented 7 months ago

@a-trottier

Adding the column -> OK

Separate by type of harmo_rule -> must be on the user's end. Some might separate by dataset, others by dataschema variable, others by rule, or by harmo status (to check the ones that have failed). If you want to separate them:

OPTION 1: use avant_apres_output %>% group_split(harmo_rule), or filter your data processing elements on a specific rule, or run lapply, sapply or walk across grouped data processing elements.

OPTION 2: add an additional parameter split_by, which can be either NULL or dataset by default (because the input is a list of datasets). That allows flexibility. With option 2, we must decide whether each of the tibbles has the same structure or not.
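
The two options can be sketched as follows. The toy tibble and its column names are assumptions standing in for the real output, and `split_output` is a hypothetical wrapper showing what a `split_by` argument might do; it is not an existing function.

```r
library(dplyr)

# Toy stand-in for the function output; column names are assumptions.
avant_apres_output <- tibble(
  dataset    = c("ds1", "ds1", "ds2", "ds2"),
  harmo_rule = c("recode", "case_when", "recode", "operation"),
  variable   = c("sex", "bmi_cat", "sex", "bmi")
)

# OPTION 1: the user splits the result after the fact.
by_rule <- avant_apres_output %>% group_split(harmo_rule)

# OPTION 2 (hypothetical): a wrapper exposing a `split_by` argument.
split_output <- function(tbl, split_by = NULL) {
  if (is.null(split_by)) return(tbl)   # NULL keeps a single tibble
  split(tbl, tbl[[split_by]])          # named list, one tibble per group
}

by_dataset <- split_output(avant_apres_output, split_by = "dataset")
names(by_dataset)
```

One difference worth noting: `group_split()` returns an unnamed list, whereas base `split()` names each element after its group, which may be more convenient if the pieces end up as separate tabs.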

In a nutshell, the tibble generated has column names which depend on the input provided. Additional columns are created after all of the variables are analysed. Hence, the output for direct_mapping variables only differs from the output for case_when (which involves more than one input variable). We must decide whether the output is different for each group, in which case the underlying function will be

tbl %>% group_split(var) %>% lapply(fun)
## each tibble is (possibly) different, with dedicated columns. bind_rows() (possibly) does not work.

or whether it is the same, in which case the underlying function will be

tbl %>% fun() %>% group_split(var)
## each tibble is the same, with (possibly) useless columns. bind_rows() works.
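
A small sketch of the two orderings, on made-up data. `drop_empty_cols` is a hypothetical per-group step that only stands in for "dedicated columns". As far as I know, `dplyr::bind_rows()` fills columns missing from a tibble with NA, so recombining should only fail when same-named columns have incompatible types.

```r
library(dplyr)

tbl <- tibble(
  harmo_rule = c("recode", "recode", "case_when"),
  input_1    = c("a", "b", "c"),
  input_2    = c(NA, NA, "d")
)

# Hypothetical per-group step: drop columns that are entirely NA.
drop_empty_cols <- function(x) x[, colSums(!is.na(x)) > 0, drop = FALSE]

# Split first, then process: each tibble keeps only its own columns.
split_then_process <- tbl %>%
  group_split(harmo_rule) %>%
  lapply(drop_empty_cols)
lapply(split_then_process, names)

# Process first, then split: every tibble has the same columns.
process_then_split <- tbl %>% group_split(harmo_rule)
lapply(process_then_split, names)

# Recombining tibbles with different columns pads the gaps with NA.
recombined <- bind_rows(split_then_process)
```

The trade-off is then mostly about readability: split-then-process gives leaner tibbles per group, while process-then-split guarantees one common structure.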

Add label when available (or different from name actually) -> OK

a-trottier commented 7 months ago

Then we have another question pending: does it make sense to compare datasets with this tool?

When we first talked about it, the idea was to compare datasets, not dossiers. But it is a good question. To discuss.

maelstrom-research / tests-functions-dev

avant_apres_harmo() #1