Closed kdpsingh closed 1 year ago
All of the issues above have been addressed in 2e3b5cbb93943b98fd98da58fb1721be7d0e7280. The newly created parsing functions all follow a parse_* pattern to indicate that they take in expressions and return modified expressions.
The original intent of the @autovec macro was to handle the auto-vectorization of function calls based on specific heuristics. Currently, it does a lot more than that and needs to be split into multiple functions.
Rather than break this into multiple smaller issues, I'm going to list this as a large issue because the following things need to happen together.
Here are the functions I need to implement to break up @autovec:
[x] autovec: a function that walks the AST and vectorizes functions using the same heuristics as currently used. Note that operators are vectorized slightly differently than non-operators.
[x] across_helper: a function that walks the AST and implements the across() functionality when used inside of other macros.
[x] desc_helper: similar to the across_helper but for desc().
[x] parse_tidy: a function that parses the R-style tidyverse "non-standard evaluation" expressions and converts them into valid quoted Julia DataFrames.jl expressions.
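As a rough illustration of the autovec item above, here is a minimal sketch of an AST walk that dots function calls. The name autovec_sketch and the SKIP_VECTORIZE list are hypothetical and are not the package's actual heuristics. The sketch only shows why operators and non-operators need different handling: a + b becomes a call to the dotted operator .+, while f(a) becomes an Expr with head :. (that is, f.(a)).

```julia
# Hypothetical list of functions to leave un-vectorized (an assumption,
# not the real heuristic used by @autovec).
const SKIP_VECTORIZE = (:mean, :median, :sum, :minimum, :maximum, :length)

function autovec_sketch(ex)
    ex isa Expr || return ex                 # symbols and literals pass through
    args = map(autovec_sketch, ex.args)      # recurse into sub-expressions first
    if ex.head == :call
        fn = args[1]
        if fn isa Symbol && !(fn in SKIP_VECTORIZE)
            if startswith(string(fn), ".")
                # already a dotted operator such as .+ : leave as-is
                return Expr(:call, args...)
            elseif Base.isoperator(fn)
                # operators: a + b  becomes  a .+ b
                return Expr(:call, Symbol(".", fn), args[2:end]...)
            else
                # ordinary functions: f(a, b)  becomes  f.(a, b)
                return Expr(:., fn, Expr(:tuple, args[2:end]...))
            end
        end
    end
    return Expr(ex.head, args...)
end

autovec_sketch(:(sqrt(a) + mean(b)))
```

Keyword arguments and other call forms are ignored here; a real implementation would need to handle them.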
The last item, parse_tidy, is particularly important because the existing functionality is implemented using a mixed style of expression interpolation and string parsing. Some level of string parsing may continue to be useful but we should avoid it except where it improves readability. The string parsing currently causes some parsing errors for edge cases due to issues with regular expressions.
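To illustrate the kind of edge case where string handling is fragile, here is a small sketch comparing a naive string split with reading the same information off the parsed expression. The helper split_assignment is hypothetical, and the naive split stands in for the regex-based code, which is not shown in this issue.

```julia
# Naive string approach: splitting on "=" also splits the "==" comparison.
pieces = split("is_one = x == 1", "=")

# Expression approach (hypothetical helper): the parser has already
# separated the assignment target from the right-hand side.
function split_assignment(ex::Expr)
    ex.head == :(=) || error("expected an expression of the form `lhs = rhs`")
    return ex.args[1], ex.args[2]
end

lhs, rhs = split_assignment(:(is_one = x == 1))
```

Here pieces has four elements instead of two, while lhs is the symbol is_one and rhs is the intact expression x == 1.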
Another thing to consider when writing the parse_tidy function is that there is code that currently parses a vector of strings and evaluates them within the scope of the module. This is problematic because it means that functions have to be within the scope of the module for them to be usable. This is why we need to include using Statistics within the module. This needs to be replaced with expression interpolation so that no code evaluation takes place until the expression is returned by the macro into the calling environment.
Lastly, as a reminder to myself, both the new autovec() and parse_tidy() functions should operate on a single expression. They will then be vectorized using broadcasting.
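A minimal sketch of both points, under stated assumptions: the module SketchMacros, the macro @each_sketch, and the transform wrap_abs are all hypothetical stand-ins (wrap_abs plays the role of a single-expression function like parse_tidy() or autovec()). The transform is broadcast over the macro's arguments, and the result is returned with esc() so that nothing is evaluated inside the module and names are resolved in the calling environment.

```julia
module SketchMacros

export @each_sketch

# Hypothetical single-expression transform: wraps an expression in abs(...).
wrap_abs(ex) = Expr(:call, :abs, ex)

macro each_sketch(exprs...)
    # Broadcast the single-expression transform over all arguments.
    transformed = wrap_abs.(collect(exprs))
    # No eval here: the escaped expression is evaluated by the caller.
    return esc(Expr(:tuple, transformed...))
end

end # module

using .SketchMacros

# Defined only in the calling scope; SketchMacros never sees this function.
caller_only(x) = x - 10

result = @each_sketch(caller_only(3), 2 - 5)
```

Because the returned expression is escaped, caller_only does not need to exist inside SketchMacros, which is the same reason the module would no longer need using Statistics.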