Currently we automatically create index (i.e. coord) arrays upon constructing a Dataset. This is done for consistency and is mostly fine, but it does cause problems in a few cases. For example, we automatically add a singleton chain dimension with index Base.OneTo(1). Now if we concatenate 4 chains, DimensionalData creates the new index array [1, 1, 1, 1], which is useless (and probably should throw an error).
It's cumbersome for users to change such dummy values to new ones before concatenating.
I propose that whenever the user has not provided a set of indices, we don't create any. DimensionalData then defaults to using the axes of the underlying array as the indices for each dimension. Only under very rare circumstances (e.g. the user is using OffsetArrays to represent draws with 0-based indexing) will this not be what the user wants, and in those cases they can always specify the indices explicitly.
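To make the difference concrete, here is a minimal sketch that uses only Base Julia. It does not call DimensionalData or construct a Dataset; the plain arrays and the variable names are assumptions made for illustration. It contrasts concatenating stored dummy indices with deriving the index from the axes of the concatenated array.

```julia
# Four single-chain arrays, each shaped (draws, chains) = (100, 1).
chains = [randn(100, 1) for _ in 1:4]

# Current behavior, mimicked by hand: every singleton chain dimension
# carries an automatically created index Base.OneTo(1).
chain_indices = [Base.OneTo(1) for _ in chains]

# Concatenating the stored indices just repeats the dummy value.
stored_index = reduce(vcat, chain_indices)

# Proposed behavior, mimicked by hand: store no index and read it off
# the axes of the concatenated array instead.
combined = reduce(hcat, chains)
derived_index = axes(combined, 2)

println(stored_index)            # the repeated dummy values
println(collect(derived_index))  # one distinct value per chain
```

With no stored index, the chain dimension of the concatenated array is labeled by its own axis, so each chain gets a distinct value without the user renaming anything first.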
As a bonus, this creates less visual clutter.
Similarly, when reading from netCDF, if the read indices are equivalent to the axes of the arrays, we could just discard them.
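One way the check for the netCDF case could look, as a rough sketch only: `is_default_index` is a hypothetical helper, not part of any existing package, and the arrays stand in for data and coordinates read from a file.

```julia
# Hypothetical helper: true when a stored index carries no information
# beyond the corresponding axis of the array itself.
is_default_index(index, data, dim) = index == axes(data, dim)

data = zeros(3, 2)

# Matches axes(data, 1), so the index could be discarded.
redundant = is_default_index([1, 2, 3], data, 1)

# 0-based values differ from the axis, so the index must be kept.
zero_based = is_default_index([0, 1, 2], data, 1)

# Non-integer labels never match the axis, so they are kept as well.
labeled = is_default_index(["a", "b"], data, 2)
```

Comparing by value rather than by type means an index stored as a plain vector is still recognized as redundant when its entries coincide with the axis.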