Open arthurPignet opened 3 years ago
Is this still relevant, @arthurPignet?
Yes, it is. The split of the data between partners is label-agnostic, but that is not the case for the shuffling/corruption. Basically, the only type of label accepted is one-hot.
Currently, most of the functions assume that the labels are one-hot-encoded vectors.
Besides the fact that this is not flexible, we sometimes need to work with label indices (first label is indexed 1, second label 2, and so on). A solution could be to automatically add a dict of labels at dataset generation, where the keys would be integers and the values would be the corresponding vectors, for instance.
An example with MNIST:
    dataset.dic_label = {
        0: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        1: [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
        2: [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
        3: [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
        4: [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
        5: [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
        6: [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
        7: [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
        8: [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
        9: [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    }
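
For illustration, here is a minimal sketch of how such a dict could be generated automatically rather than written by hand, and how one-hot labels could be mapped back to integer indices. The helper names `build_label_dict` and `one_hot_to_index` are hypothetical and do not exist in the codebase; the sketch assumes NumPy is available and that labels are stored as rows of one-hot vectors.

```python
import numpy as np


def build_label_dict(num_classes):
    """Hypothetical helper: map each integer class index to its one-hot vector."""
    identity = np.eye(num_classes, dtype=int)
    return {i: identity[i].tolist() for i in range(num_classes)}


def one_hot_to_index(y):
    """Hypothetical helper: convert rows of one-hot vectors to integer class indices."""
    return np.argmax(np.asarray(y), axis=1)


# MNIST has 10 classes, so this reproduces the dict shown above.
dic_label = build_label_dict(10)

# Two one-hot labels (classes 3 and 7) converted back to indices.
y = np.array([dic_label[3], dic_label[7]])
indices = one_hot_to_index(y)
```

With both directions available, functions such as the shuffling/corruption could operate on integer indices and convert back to one-hot vectors only where the model needs them.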