Closed adknudson closed 2 months ago
I intentionally don't provide a c()
method in TidierData and agree that either Julia vectors or tuples are preferred.
I haven't looked too closely yet, but if we use c()
in arguments that are immutable, then my suggestion would be to move to tuples ()
rather than arrays because tuples are stack-allocated and thus faster.
I do provide a c()
method in TidierPlots, because I essentially want people to be able to copy and paste R ggplot
code and have it work (after changing aes
to @aes
, but there's no way around that). It's a pretty easy way to adhere to the stated goal 1 of the package, so I feel pretty strongly that it should be left un-deprecated.
There's nothing stopping anyone from using a vector or tuple instead though if they want, since c()
essentially just creates a vector of its arguments!
I would be fine with changing c()
to return a tuple, but I can't imagine the speedup is detectable for the applications it's used for (e.g. scale_color_manual(values = c("red", "blue", "white"))
)
It wouldn't be for speed so much as to make it closer to (Julia) style, since c()
is more of an R thing rather than a tidyverse thing.
As long as vectors/tuples work and are documented, I think that's fine (if you prefer to leave c()
as an option).
But in TidierData, I just use tuples because visually they look like R minus the c
letter.
It doesn't seem like c()
is used anywhere in the source code, just in docs and exported as a convenience method. I make the argument that providing this convenience method can cause more issues for end users than it solves. If only TidierPlots has c()
, then what happens when users try other TidierOrg packages and get confused when c()
isn't found? Either all Tidier packages must export c()
or none of them should.
It's a pretty easy way to adhere to the stated goal 1 of the package, so I feel pretty strongly that it should be left un-deprecated.
I don't think keeping c()
really aids in goal 1 too much. Users coming from R to Julia will already be needing to adjust to Julia syntax. Goal 1 feels more about keeping ggplot2
syntax and tidyverse
verbs than keeping R syntax.
As a compromise, if c()
is kept around then it should give a warning along the lines of:
"c()
is an R-specific method. The Julia equivalent is to create a vector ([1, 2, 3]
) or to use vcat
to join multiple vectors (vcat([1,2,3], [4,5,6])
), or join multiple vectors using a semicolon ([1:3; 4:6]
). "
Lastly if c()
is to be kept around, it should behave just like it does in R
The current implementation of c()
is incorrect.
In R:
> c(1, "2", 3, c(4, 5, 6))
[1] "1" "2" "3" "4" "5" "6"
In Julia:
julia> c(1, "2", 3, c(4, 5, 6))
4-element Vector{Any}:
1
"2"
3
[4, 5, 6]
I didn't mean for goal one to only apply to verbs or to specific parts of the code - I really meant what it says: "stick as closely to tidyverse syntax and behaviour as possible." To me, that means - especially if it is easy to implement, as it is in this case - that ggplot code written in R should run in TidierPlots.
I'm on board with adding the note about julia vector creation though, that seems in line with the goal of introducing R users to julia.
We can edit c
to support those use cases, but they are unlikely to come up in correctly specified ggplot
code so I haven't made it a priority
The c()
function can be simply changed to
function c(args...)
@warn "don't use 'c(1, 2, 3)', use [1, 2, 3]" maxlog=2
return reduce(vcat, args)
end
which will at least solve the second issue of correctness
c
has been updated - I will leave the question of whether c
should or should not be supported outside of TidierPlots as well to @kdpsingh
Thanks -- I'll think more on this but my inclination for now is to leave things as-is. I'm open to feedback though.
I do want to avoid a TidierBase package for now because ideally each package in the Tidier family should be useful as a standalone package (unless having TidierBase helps reduce re-compilation of re-exported macros like @chain
).
I understand that the point of Tidier.jl is to match the tidyverse in Julia, but should
c()
really be a method that is provided? It's not necessarily a tidyverse-specific method, and it feels superfluous when a user can just create a vector or tuple, or use one of the other methods likecollect
,cat
, andvcat
. At worst it feels anti-Julian, and at best it should be a common method provided by aTidierBase
package and re-exported by theTidier
packages.