JuliaData / Tables.jl

An interface for tables in Julia
https://juliadata.github.io/Tables.jl/stable/
MIT License
300 stars 52 forks source link

Is `NamedTuple[(a=1,), (b=1,)]` a table? #150

Open tkf opened 4 years ago

tkf commented 4 years ago

Table 1.0 and 0.2 say yes:

julia> Tables.istable(NamedTuple[(a=1,), (b=1,)])
true

But if I understand the interface specification correctly, I think the desired behavior is:

julia> Tables.istable(Vector{NamedTuple})
false

julia> Tables.istable(Vector{NamedTuple{<:Any,Tuple{Int}}})
false

julia> Tables.istable(Vector{NamedTuple{(:a,)}})
true

julia> Tables.istable(Vector{NamedTuple{(:a,),Tuple{Int}}})
true

However, all return true in Tables 1.0 and 0.2.

quinnj commented 4 years ago

A PR to cleanup the const RowTable{T} = AbstractVector{T} where {T <: NamedTuple} definition would be welcome; I will point out however that Tables.istable is already not 100% bulletproof in that there are lots of valid tables for which Tables.istable returns false. It'd obviously be nice to avoid false positives in code that Tables.jl does control, but I'm just pointing out that people relying on Tables.istable should hopefully be aware of its qualifications.

tkf commented 4 years ago

I think I understand the design choice to allow false negative of Tables.istable(::Type) (as you'd want to decide it at run-time sometimes). But isn't it better to treat false positive of Tables.istable(::Type) as a bug? Is there a case where it's useful to define possibly false positive Tables.istable(::Type)?

(Though I guess defining the meaning of the correctness of "positive" is a bit hard. Presumably, throwing from iterate(Tables.rows(table)) is OK even when Tables.istable(typeof(table)) is true if table is used in a "wrong way" (e.g., DB connection is closed). But I think all table interface should work for an object table if Tables.istable(typeof(table)) and table is a "valid" object by its own criteria. I think it is kind of like you can create invalid SubArray by mutating the index array; the live object does not satisfy the array interface even though its type is AbstractArray.)

knuesel commented 3 years ago

It would be nice to have Tables.istable return false for something like

julia> [(a=1, b=2), (c=3, d=4)]
2-element Vector{NamedTuple{names, Tuple{Int64, Int64}} where names}:
 (a = 1, b = 2)
 (c = 3, d = 4)

but what about the following?

julia> [(a=1, b=2), (b=3, a=4)]
2-element Vector{NamedTuple{names, Tuple{Int64, Int64}} where names}:
 (a = 1, b = 2)
 (b = 3, a = 4)

Currently this works as a table (e.g. I can do DataFrame([(a=1, b=2), (b=3, a=4)])). And both examples have the exact same type...

bkamins commented 3 years ago

This is unrelated to DataFrames.jl but how Tables.jl works:

julia> Tables.rowtable([(a=1, b=2), (b=3, a=4)])
2-element Vector{NamedTuple{names, Tuple{Int64, Int64}} where names}:
 (a = 1, b = 2)
 (b = 3, a = 4)

julia> Tables.columntable([(a=1, b=2), (b=3, a=4)])
(a = [1, 4], b = [2, 3])
Jay-sanjay commented 1 year ago

hi all is the issue resolved , if not I can work on it :)

bkamins commented 1 year ago

Here you have an explanation of the current design https://bkamins.github.io/julialang/2023/09/01/tables.html

However, I am not sure what the design changes should be if we made them (see @quinnj comment above)