Closed nfultz closed 3 years ago
This is awesome. It looks like this is all working. @nfultz is WIP just about documentation? I can merge in and add docs if so.
Yes, just docs and testing - enough time has passed between 3.5 / 3.6 and now that I'm not entirely confident that this works as intended with the older versions of dplyr . The other part I'm not sure about is if anything breaks when tibbles get used in new places - I think there's tests for the data flow but not for estimates / estimands.
Seeing a number of errors in the tests. e.g.:
declare_population(N = 10, latent = seq(0, 1, length.out = N)) +
declare_measurement(observed = as.numeric(cut(latent, breaks = seq(0, 1, length.out = 6), include.lowest = TRUE)))
Error: Error in step 2 ():
Error in fabricate_mode_error(): You must do exactly one of:
1) One or more level calls, with or without existing data
2) Import existing data and add new variables
3) Provide an `N` without importing data or creating levels
Most seem to be related to using fabricatr either in declare_population or declare_measurement.
Others are about not throwing declare time errors for things like: declare_diagnosands(keep_defaults = FALSE)
I don't see these errors on master now.
I've finished removing the remains of the strict-implicit-data-arg validation that was from before the great plus sign migration.
the official rules for handler argument matching are now something like:
If there are free variables on the handler, and .data/data is not named as a dot argument, and no dot's RHS is the data
symbol, then:
add symbol(data) as an argument to the dots in first position. If the handler has a formal argument named .data/data then set the name of the new argument to that.
Then we record that if we did this on the object, rematch the dots to the handler and copy names over, such that all further argument matching is by name and not position.
When evaluating a step, if we had added an implicit data arg named data, but the step is invoked with data=NULL, remove it. This is what makes fabricate(N, ...) work correctly. This should naturally occur in the first step of a design, but steps may also be called directly. dplyr-as-handler should fail anyway when used in first position without an existing data set, and so the corresponding dplyr error message will be displayed.
amazing. thank you @nfultz. merging in now and prepping for CRAN.
Moves all dot-mangling into rename_dots, with better logic for supporting dplyr style functions as step handlers.
Note: unfortunately,
handler=subset
now matches arguments correctly but still doesn't work because it uses base NSE rather than rlang quosures, so the predicate is evaluated in the declaration environment instead of on the provided data frame. There's no way of knowing if a UDF handler uses base or dplyr nse internally. This should probably be documented somewhere, perhaps a custom handlers vignette.closes #444