JuliaData / IndexedTables.jl

Flexible tables with ordered indices
https://juliadb.org
MIT License

Improve error message when trying to extract index column #78

Closed andreasnoack closed 6 years ago

andreasnoack commented 7 years ago

I guess the rule is that you can only extract data columns and not index columns, but it would be great if this gave a better error than "ERROR: type TypeofBottom has no field parameters":

julia> hitemps = Table(Columns(City = [fill("New York",3); fill("Boston",3)],
                              Data = repmat(Date(2016,7,6):Date(2016,7,8), 2)),
                              Columns(Temperature = [91,89,91,95,83,76]))
City        Data       │ Temperature
───────────────────────┼────────────
"Boston"    2016-07-06 │ 95
"Boston"    2016-07-07 │ 83
"Boston"    2016-07-08 │ 76
"New York"  2016-07-06 │ 91
"New York"  2016-07-07 │ 89
"New York"  2016-07-08 │ 91

julia> map(i -> i.City, hitemps)
ERROR: type TypeofBottom has no field parameters
Stacktrace:
 [1] strip_unionall(::Type) at /home/andreasnoack/.julia/v0.6/IndexedTables/src/utils.jl:305
 [2] _map(::Function, ::IndexedTables.Columns{NamedTuples._NT_Temperature{Int64},NamedTuples._NT_Temperature{Array{Int64,1}}}) at /home/andreasnoack/.julia/v0.6/IndexedTables/src/IndexedTables.jl:377
 [3] map(::Function, ::IndexedTables.IndexedTable{NamedTuples._NT_Temperature{Int64},Tuple{String,Date},IndexedTables.Columns{NamedTuples._NT_City_Data{String,Date},NamedTuples._NT_City_Data{Array{String,1},Array{Date,1}}},IndexedTables.Columns{NamedTuples._NT_Temperature{Int64},NamedTuples._NT_Temperature{Array{Int64,1}}}}) at /home/andreasnoack/.julia/v0.6/IndexedTables/src/IndexedTables.jl:388
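
One way the requested friendlier error could look is sketched below in plain Julia 1.x. This is not the IndexedTables.jl implementation: ToyTable, datarow, nrows, and datamap are hypothetical names invented for illustration, and the table is modeled as two NamedTuples of column vectors. The idea is only to show that a map over data rows can catch the failure and report which columns are visible.

```julia
# Hypothetical toy stand-in for an indexed table (NOT the IndexedTables.jl types).
struct ToyTable{I<:NamedTuple,D<:NamedTuple}
    index::I   # NamedTuple of index column vectors
    data::D    # NamedTuple of data column vectors
end

# Row i of the data columns as a NamedTuple, e.g. (Temperature = 95,)
datarow(t::ToyTable, i) = map(col -> col[i], t.data)

# Number of rows, taken from the first data column
nrows(t::ToyTable) = length(first(t.data))

# Map f over the data rows only. If f fails, rethrow with a message that
# names the data columns and the index columns.
function datamap(f, t::ToyTable)
    n = nrows(t)
    n == 0 && return Any[]
    try
        return [f(datarow(t, i)) for i in 1:n]
    catch err
        throw(ArgumentError(string(
            "applying the function to a data row failed (",
            sprint(showerror, err), "). ",
            "map only sees the data columns ", keys(t.data),
            "; the index columns ", keys(t.index),
            " are not available here.")))
    end
end

t = ToyTable((City = ["Boston", "New York"],), (Temperature = [95, 91],))
temps = datamap(r -> r.Temperature, t)   # works: Temperature is a data column
# datamap(r -> r.City, t) would throw the ArgumentError built above
```

Note that this sketch wraps every exception thrown by the mapped function, including ones unrelated to column access, so a real fix would probably want a narrower check.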

shashi commented 7 years ago

btw, you can do map(i->i.City, rows(hitemps)) or map(i->i.City, keys(hitemps)). Equivalently, any of these works:

keys(hitemps, :City)
column(hitemps, :City)
columns(hitemps, :City)
rows(hitemps, :City)

All give the same result.
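
Why these routes agree can be sketched without the package, using plain NamedTuples of column vectors in Julia 1.x. The names index, data, rowat, all_rows, and key_rows are assumptions made for this illustration and are not IndexedTables.jl API. The values are the hitemps data from above, in sorted order.

```julia
using Dates

# Index and data columns of the example table, as plain NamedTuples of vectors.
days  = collect(Date(2016, 7, 6):Day(1):Date(2016, 7, 8))
index = (City = [fill("Boston", 3); fill("New York", 3)],
         Date = repeat(days, 2))
data  = (Temperature = [95, 83, 76, 91, 89, 91],)

# Hypothetical helper: row i of a NamedTuple of columns, as a NamedTuple.
rowat(cols, i) = map(c -> c[i], cols)

n = length(index.City)
all_rows = [rowat(merge(index, data), i) for i in 1:n]  # index + data fields
key_rows = [rowat(index, i) for i in 1:n]               # index fields only

# Mapping over full rows or over keys reaches the same index column
# that direct column access returns.
cities_from_rows = map(r -> r.City, all_rows)
cities_from_keys = map(r -> r.City, key_rows)
cities_direct    = index.City
```

All three results hold the same six city names. What fails in the original report is mapping over the data rows alone, which carry only the Temperature field.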

andreasnoack commented 7 years ago

Thanks. That's a lot of ways to do the same thing, but I can see how each of the functions can be useful.

davidanthoff commented 7 years ago

I can see that it is super useful to only iterate the data column when there is only one data column and it is scalar, but as soon as there are multiple data columns, it seems almost more useful to have the rows behavior be the default... Probably just a trade-off which use-case one wants to prioritize...

shashi commented 7 years ago

Makes sense to me too. The array representation is most useful when the data is a single vector of scalars... But having a different default behavior for Columns seems worse than the present situation.
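
The trade-off discussed in the last two comments can be made concrete with a small sketch in plain Julia 1.x. The helper datavalues is hypothetical, and the Humidity column is made-up illustrative data that does not appear in the example table.

```julia
# Hypothetical helper: "data-only" iteration. With a single data column it
# returns that column itself (array-like); with several it returns row tuples.
datavalues(data::NamedTuple) =
    length(data) == 1 ? first(data) :
    [map(c -> c[i], data) for i in eachindex(first(data))]

single = (Temperature = [91, 89, 91],)
multi  = (Temperature = [91, 89, 91], Humidity = [40, 55, 60])  # Humidity is invented

hottest   = maximum(datavalues(single))  # elements are scalars, so reductions work directly
first_row = datavalues(multi)[1]         # elements are NamedTuples, so fields must be picked
```

With one scalar data column the result behaves like an ordinary vector. With two columns each element is already a row-like tuple, which is the case where iterating full rows by default starts to look more natural.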