blaze / castra

Partitioned storage system based on blosc. **No longer actively maintained.**
BSD 3-Clause "New" or "Revised" License

Add new columns? #29

Open mrocklin opened 9 years ago

mrocklin commented 9 years ago

Apparently people often want to add new columns that are results from the old columns.

Do we want to support adding another column onto an existing castra? We would assume that the new column has the exact same partition structure and number of elements per partition.

esc commented 9 years ago

We can do that. Would we need to store which partitions contain which columns? How would we deal with slices that span both partitions that contain a given column and ones that don't? Should we return some kind of NotAvailable for the values in the missing column(s)?
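For reference, here is a small sketch of how plain pandas behaves in the mixed case, where one partition has a column and another does not. It uses only `pd.concat`, not castra itself; the frames `part_a` and `part_b` and the column names are made up for illustration. pandas fills the missing values with NaN, which is one possible form of the "NotAvailable" marker asked about above.

```python
import pandas as pd

# Two hypothetical partitions: only the first one has the extra column "y".
part_a = pd.DataFrame({"x": [1, 2], "y": [10.0, 20.0]}, index=[0, 1])
part_b = pd.DataFrame({"x": [3, 4]}, index=[2, 3])

# Row-wise concat takes the union of columns (outer join) by default,
# so rows coming from part_b get NaN in "y".
combined = pd.concat([part_a, part_b])
missing_mask = combined["y"].isna()

print(combined)
print(missing_mask.tolist())
```

Whether castra should return NaN, raise, or use some other sentinel for a slice like this is the open design question; the sketch only shows what the pandas default looks like.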

mrocklin commented 9 years ago

The easy case is that we require new columns to exactly match and completely fill the partition structure. This is the common case in pandas workflows but probably not the common case in other timeseries.

esc commented 9 years ago

Easy case first sounds good; it would be sort of a concat.
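A minimal sketch of the easy case, assuming each partition is already loaded as a pandas DataFrame and the new column arrives as one Series per partition. The helper `add_column` is hypothetical and is not part of castra's API; it only illustrates the "exact same partition structure" check followed by a column-wise concat.

```python
import pandas as pd


def add_column(partitions, new_parts, name):
    """Hypothetical helper: attach one new column to every partition.

    partitions: list of DataFrames, one per partition
    new_parts:  list of Series, one per partition, holding the new values
    name:       name of the new column
    """
    # The new column must completely fill the partition structure.
    if len(partitions) != len(new_parts):
        raise ValueError("new column must have one piece per partition")

    result = []
    for i, (df, col) in enumerate(zip(partitions, new_parts)):
        if name in df.columns:
            raise ValueError(f"column {name!r} already exists")
        # Same number of elements and same index within each partition.
        if len(df) != len(col) or not df.index.equals(col.index):
            raise ValueError(f"partition {i} does not match the new column")
        # Column-wise concat: the "sort of a concat" step.
        result.append(pd.concat([df, col.rename(name)], axis=1))
    return result


parts = [
    pd.DataFrame({"a": [1, 2]}, index=[0, 1]),
    pd.DataFrame({"a": [3, 4, 5]}, index=[2, 3, 4]),
]
# A new column derived from the old one, partition by partition.
doubled = [df["a"] * 2 for df in parts]
extended = add_column(parts, doubled, "a_doubled")
print(extended[1])
```

A real implementation would write the new column's data to disk next to the existing partitions instead of building new in-memory frames, but the validation step would look much the same.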