Open Enchufa2 opened 5 years ago
ggplot2 is an elegant system for professional graphics. But it has a number of features that are at odds with the overall tidyverse philosophy (and Hadley has publicly acknowledged these). I'd suggest noting that ggplot2 takes tidy data as input (though lattice and base graphics do as well).
If one takes the definition of "tidy" to mean "row/column" data frames, then 99% of R is "tidy." The term then becomes meaningless. The ggplot2 package is no more "tidy" than is lm().
I find myself constantly tidying and untidying data from modelling to visualisation and back to modelling again, because many modelling functions need all the features in columns (the model matrix), but ggplot2 needs many of them folded in long format, in order to be assigned to a layer. That's especially true for factors. The lm interface is pretty tidy in that sense, yes, but many are not.
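To illustrate the round trip described above, here is a minimal sketch. It assumes the built-in iris data and tidyr's pivot_longer(); neither comes from the comment itself, and the choice of columns and the names "feature" and "value" are arbitrary.

```r
library(tidyr)    # pivot_longer(); assumes tidyr >= 1.0.0
library(ggplot2)

# Wide format: one column per feature, which is what the model matrix expects.
fit <- lm(Sepal.Length ~ Petal.Length + Petal.Width, data = iris)

# Long format: the two petal measurements are folded into a name/value pair,
# so the feature name can be mapped to an aesthetic or a facet.
# "feature" and "value" are illustrative names, not part of any API.
iris_long <- pivot_longer(
  iris,
  cols = c(Petal.Length, Petal.Width),
  names_to = "feature",
  values_to = "value"
)

# The same predictors, now drawn as one layer split by feature.
p <- ggplot(iris_long, aes(value, Sepal.Length, colour = feature)) +
  geom_point() +
  facet_wrap(~ feature, scales = "free_x")
```

The model is fitted on the wide table and the plot is built from the long one, so going back to modelling after plotting means either keeping both copies around or pivoting again.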
Thanks! Norm
@Enchufa2
Is there a need for tidying and untidying? The example below does the modelling and the plotting in one step, and the data format remains unchanged:
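library(data.table)
library(ggplot2)

dt = as.data.table(iris)
# One plot per smoothing method; .SD is the unmodified (wide) table.
lapply(
  list('loess', 'glm', 'lm'),
  function(i) {
    dt[, ggplot(.SD, aes(Petal.Length, Sepal.Length)) +
         geom_point() +
         geom_smooth(aes(color = Species), method = i)]
  }
)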
Just one comment about these statements about ggplot2: I don't think it's thematically unrelated; I do think it follows the philosophy. First of all, ggplot2 was designed to receive the input in (Hadley's) tidy form, even before it was called tidy. I believe this fact shaped the idea of tidy data, which culminated in Hadley's Tidy Data paper (JSS 2014), and that was in fact the seed for the Tidyverse.