blaze / datashape

Language defining a data description protocol
BSD 2-Clause "Simplified" License

Implement kinds (`Fixed`, `Scalar`, etc.), as in DyND #203

Open jcrist opened 8 years ago

jcrist commented 8 years ago

In blaze/datashape, var is used as the dimension of a table with unknown length (e.g. "var * {a: int32, b: float64}"). In DyND this would be "Fixed * {a: int32, b: float64}", as var indicates variable-length elements. Since DyND and Datashape should ideally use the same type system, we should remedy this.

A good first step would be implementing the "Kinds", which are similar to, but different from, TypeVar. Basically, a kind matches on a property of the value it stands for, but repeated uses of the same kind need not match the same value. @Izaid will be able to clarify more here. This is best illustrated by example:

# A square array of int32
"N * N * int32"
# A 2d array with fixed dimensions, but not necessarily square
"Fixed * Fixed * int32"
# This would match
"10 * 3 * int32"
# But not
"10 * var * int32"
# which would be a "ragged array" in dynd.
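
The matching rules in the example above can be sketched in a few lines of Python. This is only a toy illustration, not the datashape or DyND API: `match_dims` is a hypothetical helper, it assumes dimensions are separated by `*` with no record types, and it assumes any other capitalized name (such as `N`) is a type variable that must bind to the same value everywhere it appears.

```python
def match_dims(pattern, shape):
    """Toy matcher: return True if `shape` satisfies `pattern`.

    Hypothetical helper for illustration only; it does not handle
    record types such as "{a: int32}".
    """
    pattern_dims = [p.strip() for p in pattern.split("*")]
    shape_dims = [s.strip() for s in shape.split("*")]
    if len(pattern_dims) != len(shape_dims):
        return False

    bindings = {}  # type variable name -> the value it was first matched to
    for pat, dim in zip(pattern_dims, shape_dims):
        if pat == "Fixed":
            # Kind: any fixed-size dimension matches; nothing is bound,
            # so two Fixed dimensions may differ.
            if not dim.isdigit():
                return False
        elif pat[:1].isupper():
            # Type variable (assumed): every use must match the same value.
            if bindings.setdefault(pat, dim) != dim:
                return False
        elif pat != dim:
            # Literal dimension ("10", "var") or dtype ("int32").
            return False
    return True


print(match_dims("N * N * int32", "10 * 3 * int32"))            # not square
print(match_dims("Fixed * Fixed * int32", "10 * 3 * int32"))    # fixed dims
print(match_dims("Fixed * Fixed * int32", "10 * var * int32"))  # ragged
```

The difference between the two branches is the point: `Fixed` only checks a property of each dimension, while `N` also records what it matched and requires later uses to agree.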

izaid commented 8 years ago

This is great, thanks @jcrist. I'm +1 to this, with the caveat that current Datashape should only get those kinds that are really necessary, of which Fixed is definitely an example.

kwmsmith commented 8 years ago

I'm in favor of Datashape and DyND alignment here, too.

Implementing the Kinds in Datashape will be an improvement and is not an issue.

Using var properly in Datashape and Blaze will be a significant breaking change. For that reason, we will have to consider the cost of this change, and will likely need a deprecation release before implementing it fully.