Open cigrainger opened 9 months ago
I'd love to see cheatsheets too! Personally, I think it would've saved me some documentation hunting.
For example, early on I saw there was a DataFrame.filter/2 but I couldn't find a corresponding Series.filter/2. I eventually found Series.mask, but it wasn't obvious. Something like a task-oriented cheatsheet would've made that much easier to find.
I think the diagrams are really powerful and got myself stuck trying to figure out how to replicate them in vega lite.
I also feel like I've lost a good bit of time trying to get Vega-Lite to output specific visualizations. It's great for a lot of cases, but once you start doing non-data-driven things like annotating your visualization, it gets tricky.
Were you thinking of doing a .cheatmd? Or a .pdf?
Glad to hear it would be helpful! I was thinking of doing a .cheatmd so we can easily put it into the docs. I feel like it should be possible to get close with heatmaps based on specific categorical values. And I also agree that it's confusing we have Series.mask/2 instead of Series.filter/2.
@philss or @josevalim I'm sure there was a convo about this but I can't remember what the reasoning was for this anymore. https://github.com/elixir-explorer/explorer/pull/326#issue-1352162455
We changed the implementation and renamed at the same time but I am fine with reverting the name back to filter. :) It should be a quick change and we can add:
@deprecated "Use Explorer.Series.filter/2 instead"
def mask(s1, s2), do: filter(s1, s2)
It will certainly be much easier to find.
Oh I didn't mean to pick up a stray issue! I was just using it as an example.
We changed the implementation and renamed at the same time but I am fine with reverting the name back to filter.
It may be worth having both since mask/2 and filter/2 accept different datatypes: mask takes a boolean series while filter/2/filter_with/2 take a query/function. If you have the boolean series on hand, you'd want mask/2. But if you're finding that you need to build the boolean series e.g. with transform/2 only to pass it right into mask/2, filter/2 would be convenient.
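To make the "boolean series on hand" case concrete, here is a minimal sketch. It assumes an Explorer version where Series.mask/2 is available (as discussed in this thread), and the values and variable names are made up for illustration. The boolean series is built with a vectorised comparison instead of a per-element function.

```elixir
# Standalone script / Livebook setup; skip this line inside a Mix project
# that already depends on :explorer.
Mix.install([:explorer])

alias Explorer.Series

# Illustrative data, not taken from the thread.
s = Series.from_list([1, 2, 3, 4])

# Boolean series: true where the element is greater than 2.
keep = Series.greater(s, 2)

# mask/2 keeps only the elements whose mask value is true.
kept = Series.mask(s, keep)

IO.inspect(Series.to_list(kept))
# prints [3, 4]
```

If the boolean series would only ever be built to feed straight into mask/2, that is the situation where a query-style filter/2 would save a step.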
I feel like it should be possible to get close with heatmaps based on specific categorical values.
I agree! I was more worried about the arrows:
Though if you could embed the diagrams in a table, you could probably achieve a similar effect w/o the need for the arrow annotations.
It may be worth having both since mask/2 and filter/2 accept different datatypes: mask takes a boolean series while filter/2/filter_with/2 take a query/function.
The issue is that doing it with a function is horribly expensive and should be generally avoided.
Hey I wrote this a little after the earlier discussion. If it's not helpful just ignore me :)
https://vega.github.io/editor/#/gist/e0675e1408ba1944deb1a747f03a060d/spec.json
[Side-by-side comparison table with columns "DPLYR" and "VegaLite"; the images in its rows were not preserved.]
Note that it does appear to be possible to add margins to the rectangles:
https://vega.github.io/vega-lite/examples/rect_mosaic_labelled_with_offset.html
But my cursory reading of that example makes it seem a bit complex:
{
"calculate": "datum.y + (datum.rank_Cylinders - 1) * datum.distinct_Cylinders * 0.01 / 3",
"as": "ny"
},
Super helpful! Thank you @billylanchantin! I also don't think the margins are too important :).
I'm excited about cheatsheets and something like this would beat "Ten minutes to Explorer", especially for those coming from dplyr or pandas who just need an easy reference.
dplyr: https://nyu-cdsc.github.io/learningr/assets/data-transformation.pdf
also dplyr: https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf
pandas: https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf
I started playing over the weekend but I think the diagrams are really powerful and got myself stuck trying to figure out how to replicate them in vega lite.