Open ericpan64 opened 3 weeks ago
Another idea (grouping it here, though it might be worth splitting out into another issue): why not have everything be encapsulated in the `select` DSL? E.g. `"(a, b).count()"` to capture the idea of groupby, plus a set of supported operations. So basically, get more creative with the DSL string.
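To make the `"(a, b).count()"` idea a bit more concrete, here is a rough sketch of how such a string could be parsed and evaluated. Everything in it is an assumption for illustration: the grammar, the names `GroupSpec`, `parse_group_expr` and `group_count`, and the use of plain dict rows as a stand-in for a DataFrame. It is not the library's actual DSL.

```python
import re
from collections import Counter
from typing import NamedTuple


class GroupSpec(NamedTuple):
    """Parsed form of a hypothetical group expression."""
    keys: tuple[str, ...]
    op: str


# Hypothetical grammar: "(key1, key2).op()". Not the library's real syntax.
_GROUP_RE = re.compile(r"^\(\s*([\w\s,]+?)\s*\)\.(\w+)\(\)$")


def parse_group_expr(expr: str) -> GroupSpec:
    """Split a string such as "(a, b).count()" into group keys and an operation."""
    match = _GROUP_RE.match(expr.strip())
    if match is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    keys = tuple(k.strip() for k in match.group(1).split(",") if k.strip())
    if not keys:
        raise ValueError("at least one group key is required")
    return GroupSpec(keys=keys, op=match.group(2))


def group_count(rows: list[dict], expr: str) -> dict[tuple, int]:
    """Evaluate a "count" expression over plain dict rows (DataFrame stand-in)."""
    spec = parse_group_expr(expr)
    if spec.op != "count":
        raise NotImplementedError(f"only count() is sketched here, got {spec.op!r}")
    # Count how many rows share each combination of key values.
    return dict(Counter(tuple(row[k] for k in spec.keys) for row in rows))
```

Keeping the parse step separate from evaluation means the "set of supported operations" could grow by dispatching on `spec.op` without touching the grammar.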
Some other ideas:
- `X = select(large_df, "*[:-1] - someOtherCol")` -- slicing or by name
- `plot_var = plot(some_df, cols=("x_colname", "y_colname"))  # and some nice ways to handle color, subplots, etc...`
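For the `"*[:-1] - someOtherCol"` idea, one possible reading is "take all columns, apply the slice, then drop the named columns". The sketch below resolves such a string against a list of column names. The grammar, the order of operations, and the helper name `resolve_columns` are all assumptions, not existing behavior.

```python
import re

# Hypothetical grammar: "*" + optional "[start:stop]" + zero or more " - colname".
_SELECT_RE = re.compile(r"^\*(?:\[(-?\d+)?:(-?\d+)?\])?((?:\s*-\s*\w+)*)$")


def resolve_columns(columns: list[str], expr: str) -> list[str]:
    """Return the column names selected by a string like "*[:-1] - someOtherCol"."""
    match = _SELECT_RE.match(expr.strip())
    if match is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    start_s, stop_s, exclusions = match.groups()
    start = int(start_s) if start_s else None
    stop = int(stop_s) if stop_s else None
    # Assumption: the positional slice is applied first, exclusions by name second.
    selected = columns[start:stop]
    excluded = set(re.findall(r"-\s*(\w+)", exclusions))
    return [c for c in selected if c not in excluded]
```

With columns `["id", "someOtherCol", "value", "label"]`, the expression from the example would drop the last column via the slice and then remove `someOtherCol` by name.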
(original title: Think of nicer syntax for `join` and `union` DataFrame operations)

Problem
Right now, `select` and `group_by` might be useful, but `join` and `union` don't really do much (i.e. they're just wrappers around the base API). So it feels weird that they don't also provide some ergonomic string syntax.

Requested feature
For `join`: maybe can do something like:

For `union`: maybe can make it easier to append rows formatted in different ways (e.g. as a labeled dict, as tuples, with default values in some sparse way, etc.)

Alternatives considered
-
Additional context
-
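As a rough illustration of the `union` request above, the sketch below normalizes rows given as labeled dicts or as (possibly short) tuples into uniform records, filling gaps from sparse defaults. The function name `normalize_rows`, its parameters, and the rule that short tuples fill the leading columns are assumptions made for this example only.

```python
from typing import Any, Mapping, Optional, Sequence, Union

Row = Union[Mapping[str, Any], Sequence[Any]]


def normalize_rows(
    columns: Sequence[str],
    rows: Sequence[Row],
    defaults: Optional[Mapping[str, Any]] = None,
) -> list[dict[str, Any]]:
    """Turn dict or tuple rows into one dict per row, keyed by every column."""
    fill = dict(defaults or {})
    records = []
    for row in rows:
        if isinstance(row, Mapping):
            unknown = set(row) - set(columns)
            if unknown:
                raise KeyError(f"unknown columns: {sorted(unknown)}")
            # Missing keys fall back to the default, or None if there is none.
            record = {c: row.get(c, fill.get(c)) for c in columns}
        else:
            if len(row) > len(columns):
                raise ValueError(f"row has too many values: {row!r}")
            # Assumption: positional values fill the leading columns in order.
            record = {c: fill.get(c) for c in columns}
            record.update(zip(columns, row))
        records.append(record)
    return records
```

The resulting list of dicts has the same keys for every row, so it could then be handed to whatever row-append call the base API already offers.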