TidierOrg / Tidier.jl

Meta-package for data analysis in Julia, modeled after the R tidyverse.
MIT License

Tidier.jl roadmap #81

Open kdpsingh opened 1 year ago

kdpsingh commented 1 year ago

I thought it would be useful to have a pinned issue that summarizes which functions/macros we are working on building out, and which tidyverse (or related) package they come from. This doesn’t represent the entire set of functions we need to capture, but it is intended to give a sense of direction for this project.

Note one key difference in function names below between Tidier.jl and tidyverse: Tidier.jl functions relating to data types are named after the Julia types and not after R types. This is because the data types aren't consistent across languages, and Julia allows for more granularity than R. For example, we plan to use as_string() in Tidier.jl rather than as_character() because strings are collections of characters in Julia.
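To illustrate the naming rationale, here is a minimal sketch using only base Julia. It does not call any Tidier.jl function (`as_string()` above is the planned name, not something this snippet relies on); it only shows that `Char` and `String` are distinct types and that a string is a sequence of characters.

```julia
# Base Julia only: a Char is a single character, a String is a sequence of them.
c = 'a'      # single quotes create a Char
s = "abc"    # double quotes create a String

println(typeof(c))
println(typeof(s))

# Iterating over a String yields its individual characters.
chars = collect(s)

# Base's string() converts other values to a String.
n = string(42)
```

R, by contrast, has a single `character` type for both cases, which is why an R-style name like `as_character()` would be misleading in Julia.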

dplyr

tidylog

forcats

We are using the categorical type from CategoricalArrays.jl, so functions will be prefixed with cat_* instead of fct_*.

ggplot2

lubridate

Developer resources

Strategic decisions to revisit

pdimens commented 1 year ago

The AOG/CairoMakie deps will be a very appreciable increase in TTFX. I think it makes sense to follow the metapackage format. A separate org would be useful for organization and democratization, which would have a lot of value to the growing number of contributors.

I like the idea of Tidier.jl reexporting TidyData.jl and TidyPlots.jl (and whatever else). Notably, it makes more sense to me, linguistically, for the sub-packages to be Tidy*.jl and not Tidier*.jl. My rationale is that tidydata and tidyplots would make your workflow tidier. And yes, I'd offer to make the logos for those ☺️
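For readers unfamiliar with the re-export pattern, a rough sketch in base Julia follows. `DataDemo`, `PlotsDemo`, `MetaDemo`, and their functions are hypothetical stand-ins, not the real packages; an actual meta-package would more likely rely on something like the `@reexport` macro from Reexport.jl rather than listing names by hand.

```julia
# Hypothetical stand-ins for two sub-packages.
module DataDemo
export double_it
double_it(x) = 2x
end

module PlotsDemo
export describe_plot
describe_plot(kind) = "a $kind plot"
end

# Hypothetical meta-package: loads both sub-packages and exports their names again.
module MetaDemo
using ..DataDemo
using ..PlotsDemo
export double_it, describe_plot
end

# Users load only the meta-package and get the functions from both.
using .MetaDemo

result = double_it(21)
label = describe_plot("scatter")
```

The trade-off raised above still applies: loading the meta-package pulls in every sub-package, so users who only need data wrangling may prefer to load that sub-package directly.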

kdpsingh commented 1 year ago

Thanks @pdimens. My rationale for TidierData instead of TidyData was that if we set up a Tidier org, then it makes more sense to me for the connected packages to have a Tidier prefix.

I was curious to see how Julia 1.9 will impact TTFX for plotting. I know including plotting packages will increase install time by quite a bit, but I'm hopeful that TTFX may be okay with Julia 1.9 since I think Makie uses SnoopPrecompile.

And yes, absolutely would love your magic touch on logos 🤩!

kdpsingh commented 1 year ago

As a side note, the name "Tidier" is already taken as a GitHub organization. I'm considering "TidierOrg" or "TidierJulia" as alternatives, or I may reach out to the current "Tidier" org to see if they are willing to give up the name.

ViralBShah commented 1 year ago

Just a note: Julia orgs on GitHub usually have Julia as the prefix. Not that you have to do that too, but JuliaTidier sounds better to my ear than TidierJulia.

kdpsingh commented 1 year ago

Thanks @ViralBShah. Appreciate the note! I was only thinking TidierJulia bc "tidier" is an adjective (as compared to JuliaData or JuliaPlots).

Am leaning towards just Tidier (if I can get permission from existing owner) or TidierOrg (similar to MakieOrg).

If this repo is transferred to an org, is there anything special I need to do when I register it?

No urgency and happy to cross that bridge when we get there.

ViralBShah commented 1 year ago

GitHub will forward the old URL, but we should change the URL in the General registry after the transition.

pdimens commented 1 year ago

@kdpsingh I should have time this week to draft up some icons for the new packages (Dates, Cats, Strings). The big question is, would you like the background color of the icons to follow the same blue, or are you interested in different colors?

kdpsingh commented 1 year ago

Thanks @pdimens! My thinking was that since TidierPlots already has the same shade of blue, we should just keep the same blue for all the logos so that they are instantly recognizable as being part of Tidier. I'm open to different colors as well. What do you think?

Also, 2 other things would be really helpful if you have the bandwidth.

  1. Can we create a new TidierData.jl logo that is identical to the current Tidier.jl logo except with the text reading as TidierData.jl?

  2. Since Tidier.jl will be transformed into a meta-package that will include plotting, can we add scatterplot points to the Tidier.jl logo so that it combines data analysis and plotting? (Kind of like the TidierOrg logo)

Appreciate your creativity and time, and totally open to your ideas!

pdimens commented 1 year ago

  1. Not a problem to update tidier/data.

  2. I can keep the blue, or I can retroactively recolor all the constituent packages and keep the TidierOrg as blue. I'll stick to the blue first, because I agree it's something of a trademark.

kdpsingh commented 1 year ago

Feel free to make it colorful if you'd like! As long as the main Tidier.jl one stays as the same blue.

It may be cool if we end up making stickers to have different-colored ones. Totally up to you!

pdimens commented 1 year ago

Some drafts I slapped together today: [four logo draft images]

kdpsingh commented 1 year ago

Love it! Only suggestion would be to make the background colors more different from one another so that it produces a more rainbow-like effect. Right now the pink and purple are kind of close to one another, as are the two greenish ones.

kdpsingh commented 1 year ago

Also tagging @drizk1 to take a look.

drizk1 commented 1 year ago

@pdimens these logos look great, thank you! I do agree with Karandeep that a little more contrast between them (perhaps shifting to a navy for one?) might be nice to help them really pop when they're in print one day. Thanks for the great work tho!

kdpsingh commented 1 year ago

Also, last thought: feel free to change the color on the TidierPlots logo too. The only one that should remain the same color is Tidier.jl. Thank you so, so much! So excited.

pdimens commented 1 year ago

How about these: [five logo draft images]

kdpsingh commented 1 year ago

Looks great to me! Only thing I'd consider is making TidierPlots background darker so that the white lettering is more readable. The grey background is actually kind of funny since it's the default background theme in ggplot.

@drizk1 and @rdboyes what do you think? If everyone agrees, feel free to tweak and share files with me or add to PRs.

kdpsingh commented 1 year ago

Looking at it again, I can definitely read TidierPlots, so feel free to leave it as-is or just darken the outer portion slightly. I trust your judgment!

pdimens commented 1 year ago

The grey background was a whimsical nod to ggplot's default theme, yeah. Haha. I'll make it a bit darker.

drizk1 commented 1 year ago

These look great! The softness of the colors is really nice.

pdimens commented 1 year ago

Here are 3 variants. My favorite is number 2

option 1

image

option 2

image

option 3

image

rdboyes commented 1 year ago

I like number 1, but they all look good!

kdpsingh commented 1 year ago

I like number 3 but I'm also good with number 2. @rdboyes, thoughts?

pdimens commented 1 year ago

Lol fight amongst yourselves.

kdpsingh commented 1 year ago

This package is @rdboyes's baby so I would go with number 1.

pdimens commented 1 year ago

aiight. PRs incoming

kdpsingh commented 1 year ago

Yay, awesome. Question: Can we also update the Tidier.jl logo to include plotted dots (to indicate that it’s a meta-package that will enable data analysis and plotting)? It’s essentially going to re-export all of the other Tidier* packages.

pdimens commented 1 year ago

@kdpsingh like dis? image

kdpsingh commented 1 year ago

Purrrrerfect (as the TidierCats would say). Let's do it!

jdiaz97 commented 1 year ago

Not part of the tidyverse, but the "rio" package from R is quite nice. Does Julia have something like that? If not, do you guys think an implementation would be good?

kdpsingh commented 1 year ago

It does!

FileIO already handles multiple different formats with a single function for reading and another for writing.

https://juliaio.github.io/FileIO.jl/stable/
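As a minimal sketch of that single-function style, assuming the FileIO.jl, CSVFiles.jl, and DataFrames.jl packages are installed (CSVFiles.jl is the backend FileIO uses for `.csv` files; the file name and the data below are made up for illustration):

```julia
using FileIO, CSVFiles, DataFrames

# Made-up example data.
df = DataFrame(name = ["a", "b"], value = [1, 2])

# Write to a temporary location; the format is inferred from the .csv extension.
path = joinpath(mktempdir(), "demo.csv")
save(path, df)

# Read it back with the matching load() call and materialize it as a DataFrame.
df2 = DataFrame(load(path))
```

Other formats follow the same `load`/`save` pattern, provided the corresponding backend package is installed.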

kdpsingh commented 1 year ago

Here are all the supported formats: https://juliaio.github.io/FileIO.jl/stable/registry/#Registry-table

frankiethull commented 1 month ago

Hi!

I am building with some tables in Julia and was thinking about you all... Is a TidierTables.jl part of the roadmap? Was thinking along the lines of the gt package in R.

kdpsingh commented 1 month ago

There's a package PrettyTables.jl that is currently undergoing a remake inspired by the {gt} R package: https://discourse.julialang.org/t/current-state-and-the-future-of-prettytables-jl/118455

I would check in on that package.