The random.choice call also repeats some indices, which is what happens when sampling with replacement.
If this is not the intended behavior, I suggest switching to the following:
# create indices
chosen_idx = np.arange(0, len(intervals))
# shuffle the indices
np.random.shuffle(chosen_idx)
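# keep the first 80% of the shuffled indices for training
chosen_idx = chosen_idx[:int(len(intervals) * 0.8)]
# build a boolean mask so that ~ yields the true complement
chosen_idx_mask = np.zeros(len(intervals), dtype=bool)
chosen_idx_mask[chosen_idx] = True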
# select train and test intervals
train_intervals = intervals[chosen_idx_mask]
# take the rest as test intervals
test_intervals = intervals[~chosen_idx_mask]
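
As a sanity check of the mask-based split, here is a minimal sketch that wraps it in a helper and verifies that the two subsets are disjoint and together cover every interval. The helper name `split_intervals`, its `train_frac` and `seed` parameters, and the toy `intervals` array are my own assumptions for illustration and are not part of neuroformer.

```python
import numpy as np

def split_intervals(intervals, train_frac=0.8, seed=None):
    """Hypothetical helper: split intervals into disjoint train/test subsets."""
    intervals = np.asarray(intervals)
    rng = np.random.default_rng(seed)
    # permutation returns every index exactly once, so no index is repeated
    perm = rng.permutation(len(intervals))
    n_train = int(len(intervals) * train_frac)
    # boolean mask: True marks a train interval, False a test interval
    mask = np.zeros(len(intervals), dtype=bool)
    mask[perm[:n_train]] = True
    return intervals[mask], intervals[~mask]

# toy data (assumed): 20 distinct interval start times
intervals = np.arange(0, 100, 5)
train_intervals, test_intervals = split_intervals(intervals, seed=0)

# the two subsets never overlap and together contain every interval
overlap = set(train_intervals.tolist()) & set(test_intervals.tolist())
covered = set(train_intervals.tolist()) | set(test_intervals.tolist())
```

Passing a fixed `seed` makes the split reproducible between runs, which may or may not be desirable for the original training setup.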
In neuroformer/datasets.py the line
test_intervals = intervals[~chosen_idx]
does not create the complement of the train set. Applied to integers, the tilde operator inverts the bit representation, so ~i equals -i - 1 and the result is a set of negative indices rather than the unselected ones (see also https://stackoverflow.com/questions/8305199/the-tilde-operator-in-python).
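
To make the difference concrete, here is a small sketch on toy data (the array values are assumed for illustration and are not taken from neuroformer). It compares `~` applied to an integer index array with `~` applied to a boolean mask.

```python
import numpy as np

intervals = np.array([10, 20, 30, 40, 50])
chosen_idx = np.array([0, 1, 3])   # integer indices picked for training

# on integers, ~ flips the bits: ~i == -i - 1
flipped = ~chosen_idx              # array([-1, -2, -4])

# negative indices count from the end, so this is NOT the complement
wrong_test = intervals[flipped]    # array([50, 40, 20])

# on a boolean mask, ~ is a logical NOT and gives the true complement
mask = np.zeros(len(intervals), dtype=bool)
mask[chosen_idx] = True
right_test = intervals[~mask]      # array([30, 50])
```

Note that `wrong_test` contains 20 and 40, which are also in the train set, so the two sets overlap.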