Open chuckwondo opened 1 year ago
After discussion with Abigail Barenblitt, we landed on using the Python slice syntax: `xvar[a:b]`. This gives users even greater flexibility: a user can specify a start index (`a`, 0-based) and a stop index (`b`, exclusive) in case they do not want all columns of `xvar`, for example. Further, to select all columns, the syntax would be `xvar[:]`, again, just like Python slice syntax.
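As a rough illustration of the slice syntax described above, here is a minimal sketch of how a `columns` entry such as `xvar[1:3]` or `xvar[:]` might be expanded into individual column names. The function name `expand_slice_spec`, the regular expression, and the `ncols` parameter are all hypothetical; in particular, it assumes the number of columns in the dataset is already known by some other means.

```python
import re

# Hypothetical pattern: a dataset name followed by a Python-style slice,
# e.g. "xvar[1:3]", "xvar[:2]", or "xvar[:]".
_SLICE_SPEC = re.compile(r"^(?P<name>\w+)\[(?P<start>\d*):(?P<stop>\d*)\]$")


def expand_slice_spec(spec: str, ncols: int) -> list[str]:
    """Expand "name[a:b]" into ["name<a>", ..., "name<b-1>"].

    Specs that do not use the slice syntax are returned unchanged.
    `ncols` is the (assumed known) number of columns in the 2D dataset.
    """
    match = _SLICE_SPEC.match(spec)
    if match is None:
        return [spec]

    # An omitted bound behaves like an omitted bound in a Python slice.
    start = int(match["start"]) if match["start"] else None
    stop = int(match["stop"]) if match["stop"] else None

    # slice.indices() clamps the bounds to the dataset's actual column count.
    indices = range(*slice(start, stop).indices(ncols))
    return [f"{match['name']}{i}" for i in indices]


print(expand_slice_spec("xvar[1:3]", 4))
print(expand_slice_spec("xvar[:]", 4))
```

Delegating to the built-in `slice` object keeps the semantics identical to Python's own (0-based start, exclusive stop), which is the behavior the proposal asks for.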
Currently, when a user wants columns from a 2D dataset, each column must be explicitly indexed in the `columns` input value. For example, given a 2D dataset named `xvar` with n columns, if the user wishes to have all n columns appear in the output file, the `columns` input value must include `xvarX` for every `X` in the range `0` through n - 1. For example, if `xvar` has 4 columns, and the user wants all of them in the output, then `xvar0,xvar1,xvar2,xvar3` must be included in the `columns` input. For a small number of columns, this is acceptable, but when the 2D dataset contains more than a handful of columns, this is tedious, error-prone, and inconvenient.

To make it easy for users to automatically get all columns of a 2D dataset in the output, we should support a shorthand notation within the `columns` input. I propose that we support the syntax `*VAR` as a column name, where `VAR` is the name of a 2D dataset. For example, continuing from above, if `*xvar` is part of the `columns` input, we should automatically "spread" this into `xvar0,xvar1,xvar2,xvar3`, just as if the user had included such an expanded form in the `columns` input value to begin with. This syntax is consistent with the Python syntax for iterable unpacking.

However, we don't know in advance how many columns a 2D dataset contains. This can be resolved by first implementing #47 with one of its proposed approaches, because both approaches specify a means to readily determine the number of columns in any supported 2D dataset.
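To make the `*VAR` spreading concrete, here is a minimal sketch of the expansion step. The function name `spread_columns` and the `ncols_by_name` mapping are hypothetical; the mapping simply stands in for whatever mechanism #47 ends up providing for looking up a dataset's column count. The `lat` column is likewise just an illustrative 1D column name.

```python
def spread_columns(columns: list[str], ncols_by_name: dict[str, int]) -> list[str]:
    """Expand every "*VAR" entry into "VAR0", "VAR1", ..., "VAR<n-1>".

    `ncols_by_name` maps each 2D dataset name to its number of columns
    (assumed to be obtainable once #47 is implemented).
    Entries without a leading "*" are passed through unchanged.
    """
    expanded: list[str] = []
    for column in columns:
        if column.startswith("*"):
            name = column[1:]
            if name not in ncols_by_name:
                raise ValueError(f"unknown 2D dataset: {name!r}")
            expanded.extend(f"{name}{i}" for i in range(ncols_by_name[name]))
        else:
            expanded.append(column)
    return expanded


# With a 4-column dataset named "xvar", "*xvar" spreads into four entries.
print(",".join(spread_columns(["lat", "*xvar"], {"xvar": 4})))
```

Expanding up front means the rest of the pipeline only ever sees explicit names such as `xvar0`, so the existing column handling would not need to change.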