Closed henryennis closed 4 months ago
@henryennis The intent was for the indexing to behave like Python list indexing, where the end index is exclusive:
data = range(1000)
data[0:511] # len = 511
data[:511] # len = 511
data[0:512] # len = 512
data[:512] # len = 512
Yeah, the only way to find out whether it is implemented that way is to look at the source code. There's also some unexpected behaviour when passing None or 0 as either the start or end index.
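To illustrate the kind of None-versus-0 surprise described above, here is a minimal sketch in plain Python. Both functions are hypothetical and written only for this comment; they are not taken from the granite-tsfm source. The first uses truthiness checks, which cannot tell 0 apart from None, and the second passes the values straight to `slice`, which already treats None as "unbounded".

```python
def slice_truthy(rows, start_index=None, end_index=None):
    """Hypothetical helper: truthiness checks treat 0 the same as None."""
    if not start_index and not end_index:
        return rows
    if not end_index:
        return rows[start_index:]
    if not start_index:
        return rows[:end_index]
    return rows[start_index:end_index]


def slice_explicit(rows, start_index=None, end_index=None):
    """Hypothetical helper: slice() interprets None as 'no bound', so 0 stays 0."""
    return rows[slice(start_index, end_index)]


rows = list(range(600))

# end_index=0 should give an empty selection, but the truthy version
# cannot distinguish it from "no end index given".
print(len(slice_truthy(rows, 0, 0)))    # whole list
print(len(slice_explicit(rows, 0, 0)))  # empty list
```

Under these assumptions, comparing against None explicitly (or delegating to `slice`) keeps 0 meaningful as an index.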
I improved the documentation as part of https://github.com/ibm-granite/granite-tsfm/pull/95 to make the functionality clearer.
If I have a dataset of length 600 and I want to get the first 512 rows, I would set the start index to 0 and the end index to 511.
inference_data = select_by_index(
    data,
    id_columns=ID_COLUMNS,
    start_index=0,
    end_index=511,
)
The implementation of select_by_index will return a dataset of length 511 because of a truthiness check in the following code and because slicing excludes the end index ([:x])...
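The off-by-one itself can be reproduced with a plain list standing in for the 600-row dataset. This is only a sketch of the slicing arithmetic; `select_inclusive` is a hypothetical helper showing what an inclusive end index would require, and is not part of granite-tsfm.

```python
rows = list(range(600))  # stand-in for a dataset of length 600

# Python slices are half-open: the end index is excluded.
half_open_511 = rows[0:511]
half_open_512 = rows[0:512]


def select_inclusive(rows, start_index, end_index):
    """Hypothetical inclusive variant: add 1 so end_index itself is kept."""
    return rows[start_index:end_index + 1]


print(len(half_open_511))                      # 511
print(len(half_open_512))                      # 512
print(len(select_inclusive(rows, 0, 511)))     # 512
```

So with half-open semantics, the first 512 rows need end_index=512; end_index=511 only works if the end is treated as inclusive.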