Open ezmiller opened 3 years ago
When we added the index structure to tech.ml.dataset (see https://github.com/techascent/tech.ml.dataset/pull/214), we prevented a column from returning an index if there are missing values in the column here. So this issue may not be relevant any more. There should always be an item in the first position because the column should not have any missing values.
In our current implementation of `adjust-interval` (see #14), we use the first item in a column to determine the `target-datatype` of the time unit to which we are converting. More specifically, `adjust-interval` takes a `->new-time_converter` fn that the user supplies, calls that function on the first item in the targeted column to get the new unit, and then uses `tech.v3.datatype/elemwise-datatype` to determine the unit's keyword.
This is all fine, but @cnuernber raised a good point that we overlooked:
We should figure out a way to handle this case. It might also pay to generalize the process of determining the time datatype from the row if this is going to be a more common practice.
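One possible way to generalize this is sketched below. The exact case @cnuernber raised is not quoted here, so this assumes it is the one hinted at above: the first item is missing (`nil`) or the column is empty. `first-present`, `infer-target-type`, and `->year-month` are hypothetical names used only for illustration, and `class` stands in for `tech.v3.datatype/elemwise-datatype` so the sketch runs without dependencies.

```clojure
(import '[java.time LocalDate YearMonth])

;; Hypothetical helper: return the first non-missing (non-nil) value in a
;; column-like sequence, or nil when the column is empty or all missing.
(defn first-present
  [column]
  (first (filter some? column)))

;; Hypothetical helper: apply the user-supplied converter to the first
;; present value and return the class of the result. Returns nil when no
;; value is available, so the caller can decide how to handle that case.
;; Assumption: the real code would call tech.v3.datatype/elemwise-datatype
;; on the converted value instead of `class`.
(defn infer-target-type
  [column converter]
  (when-some [item (first-present column)]
    (class (converter item))))

;; Example converter for illustration only: LocalDate -> YearMonth.
(defn ->year-month
  [^LocalDate d]
  (YearMonth/from d))
```

With this shape, `(infer-target-type [nil (LocalDate/of 2021 3 14)] ->year-month)` skips the leading missing value, while an empty or all-missing column yields `nil` instead of passing `nil` to the converter.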