huizezhang-sherry / cubble

A tidy structure for spatio-temporal vector data
https://huizezhang-sherry.github.io/cubble/
Other
55 stars 9 forks source link

Suggested improvements for add_missing_prct() #6

Closed mitchelloharawild closed 1 year ago

mitchelloharawild commented 2 years ago

https://github.com/huizezhang-sherry/cubble/blob/e97069c9eab34a80b9ae0ff9009140fecb659a9e/R/missing.R#L19

This code assumes that ... is of length 1 (since expr() only accepts one argument). Instead, you can introduce a new argument for the function cols which supports tidyselect (much like tidyr::pivot_longer() and others), then capture the inputted expression with rlang::enexpr(cols).

Alternatively, you could probably directly use tidyselect::eval_select({{cols}}, face_temporal(data)). Note that I've removed the pipe as it complicates the evaluation environment and is a slight slow down.

https://github.com/huizezhang-sherry/cubble/blob/e97069c9eab34a80b9ae0ff9009140fecb659a9e/R/missing.R#L31 Using quo() here is capturing the environment of your mapped function, and the contents of the add_missing_prct() environment. Since you just want to evaluate elements from your data, it is simpler to use expr() here.

https://github.com/huizezhang-sherry/cubble/blob/e97069c9eab34a80b9ae0ff9009140fecb659a9e/R/missing.R#L33 Since you have a list of expressions (or quosures) you can apply them across multiple function arguments with !!!, avoiding the conflict with variable name missing and the need to unnest_wider().

mitchelloharawild commented 2 years ago

Oh, just realised on re-reading this why you needed quo() for your expressions. Since your code is using .x inside, you need to evaluate it within the mapped function's environment.

Instead, it is neater to replace the symbol .x with it's value. You can do that with !!.x, like:

calls <- map(names(vars),  ~ expr(sum(is.na(ts[[!!.x]]))/length(ts[[!!.x]]))) 

Also you can use mean() instead of sum(x)/length(x) to simplify things a bit.

huizezhang-sherry commented 2 years ago

Thanks Mitch, valuable suggestions!

I realise I use tidyselect::eval_select() to allow selection by dt %>% add_missing_prct(var = start:end). But agree with the rest of the comments you made :)