Open tcovert opened 6 years ago
I'll fix the iterable tables integration story for the new style API here. Then this
load("file.csv") |> table
# or
table(load("file.csv"))
should work.
I've also been sitting on a small extension to TableTraits.jl that will remove the perf overhead of going through the iterable tables system, i.e. at that point it should give the raw performance of TextParse (which is used by the FileIO story under the hood).
Ha, I just realized that this new NextTable is already an iterable table source, without any integration code, simply because it iterates named tuples! Very nice. So all of the following already works:
t = table(...)
# File IO
save("file.csv", t)
save("file.feather", t)
t |> save("file.csv")
t |> save("file.feather")
# Convert to other table structure
df = t |> DataFrames.DataFrame # This pipe syntax should work for all constructors of table types
df = DataFrames.DataFrame(t)
tt = TypedTables.TypedTable(t)
Pandas.DataFrame(t)
TimeSeries.TimeArray(t)
Temporal.TS(t)
# Run a regression
lm(@formula(Children~Age),t)
# Plot with Gadfly
plot(t, x=:Age, y=:Children, Geom.line)
# Plot with StatPlots
@df t plot(:Age, :Children)
# Plot with VegaLite
# (forgot the syntax ;)
And no, I haven't tried, but it really should just work. I hope those aren't famous last words.
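To illustrate why iterating named tuples is enough for a sink to consume a table, here is a minimal sketch in plain Julia with no packages. The helper `columns_from_rows` is hypothetical and is not part of IterableTables, TableTraits, or IndexedTables; it only shows that a consumer can recover column names and column vectors from nothing but the row iterator. It assumes every row has the same fields and field types as the first row.

```julia
# Rows are plain named tuples; any iterator that yields them can act as a table source.
rows = [(Age = 30, Children = 2), (Age = 41, Children = 3), (Age = 25, Children = 0)]

# Hypothetical helper (illustration only): build one vector per field
# by walking the row iterator once.
function columns_from_rows(itr)
    first_row = first(itr)
    colnames = keys(first_row)                 # e.g. (:Age, :Children)
    cols = map(v -> [v], values(first_row))    # tuple of one-element vectors, one per field
    for row in Iterators.drop(itr, 1)
        for (col, v) in zip(cols, values(row))
            push!(col, v)                      # assumes v has the same type as in the first row
        end
    end
    return NamedTuple{colnames}(cols)
end

cols = columns_from_rows(rows)
```

Here `cols.Age` and `cols.Children` are ordinary vectors, which is roughly what a sink such as a data frame constructor needs to build its own storage.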
There was some discussion about this on the data channel on Slack. If we ever move Columns out of this package, then I'll make it so that TextParse.csvread returns Columns -- then this should just work. For now, IterableTables interface sounds fine.
The table traits extension I’ve mentioned will actually end up just passing the arrays from TextParse to tables directly without any iteration, so at that point it should add hardly any overhead at all. Should all be completely transparent in terms of user facing API, i.e. the code from above will be the same, just much faster. Probably also won’t require any code changes in IndexTables either.
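The idea of handing over whole arrays instead of iterating rows can be sketched roughly as follows. This is not the actual TableTraits extension, whose design is not shown in this thread; `ColumnSource`, `has_columns`, `get_columns`, and `to_columns` are all hypothetical names used for illustration. The sink checks whether the source can expose its columns directly and only falls back to row iteration when it cannot.

```julia
# Hypothetical source that already holds whole columns (e.g. what a CSV parser produces).
struct ColumnSource{T<:NamedTuple}
    columns::T
end

# The source still supports row iteration, yielding one named tuple per row.
function Base.iterate(src::ColumnSource, i = 1)
    i > length(src) && return nothing
    return (map(col -> col[i], src.columns), i + 1)
end
Base.length(src::ColumnSource) = length(first(src.columns))

# Hypothetical trait: does the source expose its columns directly?
has_columns(::Any) = false
has_columns(::ColumnSource) = true
get_columns(src::ColumnSource) = src.columns

# Sink: take the fast path when available, otherwise fall back to row iteration.
function to_columns(src)
    has_columns(src) && return get_columns(src)   # same arrays, no per-row work
    rows = collect(src)
    colnames = keys(first(rows))
    return NamedTuple{colnames}(map(n -> [getfield(r, n) for r in rows], colnames))
end

src  = ColumnSource((Age = [30, 41, 25], Children = [2, 3, 0]))
fast = to_columns(src)            # columns passed through directly
slow = to_columns(collect(src))   # a plain vector of rows has to be iterated
```

Both paths produce the same values, but `fast.Age` is the very same array the source holds, while `slow.Age` is a freshly built copy. The calling code is identical in both cases, which matches the point above that the user-facing API would not need to change.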
I couldn't find an example in the documentation for how one builds an IndexedTable, with column names, from the output of TextParse.csvread. This appears to work (took me a while...):

I would have thought something like table(csvread("file.csv")) or even table(Columns(csvread("file.csv"))) would work, but both give an error like:

I guess this isn't a bug, but if the goal is for the first syntax above to be the recommended syntax, it might be helpful for there to be an example in the docstrings...