queryverse / QuickVega.jl


Contributing #4

Open elalaouifaris opened 4 years ago

elalaouifaris commented 4 years ago

Hi,

I just saw the JuliaCon video on the Vega ecosystem. I guess this package is a good place to start contributing? What should I look at in the Queryverse ecosystem to start working on this? I'm a newbie in Julia development, but I worked with the Tidyverse in the R ecosystem. Switching to Julia now!

Thanks Faris

davidanthoff commented 4 years ago

Yes, that would be awesome!

My general idea for this package is that it just exports a bunch of simple functions that all return either a Vega.jl or VegaLite.jl spec. The idea generally is that the API in this package here is not grammar of graphics style, but just one-function-one-plot, super simple.

The prototype of that would probably be something like a histogram function:

using VegaLite

function histogram(data)
    # Bin the values along x and plot the number of values per bin on y.
    return @vlplot(:bar, x={data, bin=true}, y="count()")
end

or something like that. It doesn't save you a lot of code, but I, for example, can never remember the details of this, so just having a short and sweet function for something as common as this seems really helpful.

Broadly speaking, I think there are two categories of potential functions:

1) Simple functions for very simple plots that are currently a bit too verbose when you create them with @vlplot. The histogram case is an example of this. I should say that for some of these, my hope is to eventually make them simpler in the VegaLite.jl repo itself, or in vega-lite, so anything we add here might be more of a short- to medium-term fix. To stick with the histogram example: I think the vega-lite team might add a new histogram mark at some point, and then creating a histogram with the @vlplot macro would look like @vlplot(:histogram, data). At that point that is concise enough that we wouldn't need a histogram function in this package anymore. But that is future talk, and in the meantime some of these shorter functions would really help.

2) Really complicated plots that one can create with one simple call. Seaborn has a lot of examples for that. Take for example the pair plot from here. I don't think creating something like that will ever be concise in plain VegaLite.jl, so having "shortcut" functions here that do this would be great.
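To make the second category concrete, here is a rough sketch of what a one-call pair plot could look like on top of the Vega-Lite repeat operator. The name `pairplot` and its signature are hypothetical (nothing like it is confirmed to exist in QuickVega.jl), and the sketch assumes `table` is any iterable table whose listed columns are all numeric.

```julia
using VegaLite

# Hypothetical helper, not an existing QuickVega.jl function:
# draws a scatter-plot matrix with one panel per pair of `columns`.
function pairplot(table, columns::Vector{Symbol})
    return table |>
        # The outer spec repeats the inner spec over every row/column pair.
        @vlplot(repeat={row=columns, column=columns}) +
        # The inner spec picks up the repeated field names for x and y.
        @vlplot(
            :point,
            x={field={repeat=:column}, type=:quantitative},
            y={field={repeat=:row}, type=:quantitative}
        )
end

# Usage with a small inline table (a vector of named tuples).
rows = [(a = 1.0, b = 2.0, c = 0.5), (a = 2.0, b = 1.5, c = 1.0), (a = 3.0, b = 0.5, c = 2.0)]
p = pairplot(rows, [:a, :b, :c])
```

Written out by hand, the same plot needs the repeat block plus the inner spec every time, which is the verbosity a shortcut function would hide.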

There was also an older version of the Vega.jl package that had an API that was more in this spirit, the documentation is still up here, and there might be ideas there as well.

I think one nice thing about this package is that one can get started very piecemeal, by just adding say one new function. No need to integrate with some pre-existing complicated architecture, every function here on its own helps :)

hsm207 commented 4 years ago

@davidanthoff Is it okay to dump all functions in QuickVega.jl for now, or is there already some kind of design for what the source code should look like?

elalaouifaris commented 4 years ago

@davidanthoff Thank you for the info. Maybe we can start by opening a few issues that each describe an example function, and then gradually open pull requests for them? It's a really good case for first contributions: it gives new contributors (like me) the opportunity to get a feel for the GitHub workflow.

davidanthoff commented 4 years ago

I think we could just start by dumping things into the repo and then discuss based on that? I really don't have any strong ideas for this beyond what I wrote above, so I think this would be very much a collaborative trial and error process for those that want to participate :)

I think a good strategy might also be to say that QuickVega.jl is in an experimental state for the next six months or something like that, and that during that time we might change everything at any time, i.e. we would make zero backwards compat promises. That might allow us to really experiment without complicated backwards compat constraints.

So I think just PRs like https://github.com/queryverse/QuickVega.jl/pull/5 are great: we can just start, see what works and take it from there.

Really excited to see that you all are interested in contributing!!

davibarreira commented 4 years ago

Hey guys, are you still working on this package? I didn't know that this initiative existed, so I was starting to implement it myself, with a package called Aquarius.jl.

hsm207 commented 4 years ago

@davibarreira I'm still interested in contributing but I just don't have any idea what visualizations to start with. If you have ideas, I'd be happy to help implement them here.

davibarreira commented 4 years ago

I created a README.md file and a logo for the project (personal hobby). I also added myself in the Project.toml (never contributed to open source, so I'm sorry if I wasn't supposed to do that).

hsm207 commented 4 years ago

@davibarreira I'd consider this package as active since davidanthoff is still around in the Julia community. He's probably busy with teaching and/or other projects.

Why don't we work on a fork of this project instead? I suggest this because I think it benefits this project in terms of discoverability to be part of the queryverse organization.

davibarreira commented 4 years ago

Yeah! David actually responded to me on Slack. The package is active, which is great. Working on the fork seems like a good alternative.

hsm207 commented 4 years ago

@davibarreira I see you've already forked the project. I have a local copy of that fork.

What should we get started with?

davibarreira commented 4 years ago

@hsm207, I haven't planned much, but at the moment I guess we could start by implementing some of the basic plots, as two of the pull requests have already done. For example, there is already an implementation of histogram and lineplot. Each implementation has one version that takes the data as a DataFrame and another that takes Arrays, which allows the package to be independent of DataFrames.jl. So I guess we could continue along this line, implementing some other basic plots, such as barplots, trailplots, distplots, etc. There are many other things to think about, such as how to pass kwargs to the functions, so we can set attributes such as color, field types, markers, labels, etc. Later I will try to write down the list of things that I think would be cool to have in the package, and I'll share it in my fork, so we can have some sort of road map to start with.
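As an illustration of the table-versus-arrays split and of kwargs for attributes, here is a minimal sketch using multiple dispatch. It is not the code from the pull requests mentioned above: the `lineplot` signatures, the `color` keyword, and its default are assumptions made for this example. The table method accepts any iterable table, so the sketch itself does not depend on DataFrames.jl.

```julia
using VegaLite

# Table method (assumed signature): `table` is any iterable table,
# `xcol` and `ycol` name the columns to plot.
function lineplot(table, xcol::Symbol, ycol::Symbol; color="steelblue")
    return table |> @vlplot(
        mark={:line, color=color},
        x={field=xcol, type=:quantitative},
        y={field=ycol, type=:quantitative}
    )
end

# Array method: wrap the two vectors in a table of named tuples and
# forward everything, including keyword arguments, to the table method.
function lineplot(x::AbstractVector, y::AbstractVector; kwargs...)
    length(x) == length(y) || throw(ArgumentError("x and y must have the same length"))
    rows = [(x = xi, y = yi) for (xi, yi) in zip(x, y)]
    return lineplot(rows, :x, :y; kwargs...)
end

# Usage: the same function name works for plain vectors and for tables.
p1 = lineplot([1, 2, 3], [2.0, 4.0, 8.0]; color="firebrick")
p2 = lineplot([(t = 1, v = 0.5), (t = 2, v = 0.7)], :t, :v)
```

Forwarding `kwargs...` from the array method keeps the styling options in one place, so a new attribute only has to be added to the table method.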

davibarreira commented 3 years ago

Hey @davidanthoff, I've finally restarted working on QuickVega, and will shortly send a pull request with several contributions. I know that you didn't want to make the package dependent on DataFrames.jl, but I've encountered some issues (I'll clarify in the pull request), so for now some of my code does depend on DataFrames. I've decided to keep going like this for now, but the code can easily be refactored later to make it independent of DataFrames. Then again, now that DataFrames is at version 1.0, perhaps it's stable enough to become a dependency. What do you think? Cheers!

davidanthoff commented 3 years ago

Great to hear this and looking forward to the PR!

I think at this stage I'm primarily excited about any contribution to this package, and we can figure things like DataFrames.jl dependencies out as we go along.

davibarreira commented 3 years ago

The code is still largely undocumented, so I'll do some due diligence first and then submit the pull request.