This PR adds a new feature to calculate categorical cross-entropy on multi-hot sparse labels.
Inputs are softmax predictions and true labels.
It should return the same loss as categorical cross-entropy.
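A minimal NumPy sketch of that equivalence, with illustrative function names (not the PR's actual API): the loss computed from sparse class indices matches the dense multi-hot computation, because only the labeled positions contribute to the sum.

```python
import numpy as np

def dense_crossentropy(y_true, y_pred):
    # Standard categorical cross-entropy on dense multi-hot labels.
    return -np.sum(y_true * np.log(y_pred), axis=-1)

def sparse_crossentropy(sparse_labels, y_pred):
    # Same quantity computed from sparse class indices: only the
    # log-probabilities at the labeled positions contribute.
    return np.array([-np.log(y_pred[i, idx]).sum()
                     for i, idx in enumerate(sparse_labels)])

y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.6, 0.3]])
dense_labels = np.array([[1., 1., 0.],
                         [0., 0., 1.]])
sparse_labels = [[0, 1], [2]]

assert np.allclose(dense_crossentropy(dense_labels, y_pred),
                   sparse_crossentropy(sparse_labels, y_pred))
```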
Example:
Input:
Input data are random images of size (32, 32) in channels-first data format.
Labels:
Traditional labels have the shape (num_samples, num_classes), for example:
labels = [[0, 1, 1, ..., 0],
          [1, 1, 0, ..., 0],
          ...
          [0, 0, 0, ..., 1]]
where len(labels) = num_samples, len(labels[0]) = num_classes
However, when num_classes is very large and the labels are very sparse,
we can represent them more compactly. For example:
There are 1000 classes in total, so each label is an index in [0, 999].
Each image can have at most 5 labels at the same time.
labels = [[1, 2],
          [0, 1],
          ...
          [999]]
where labels is a list of lists of class indices.
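A short sketch of the conversion from dense multi-hot rows to this sparse form (plain NumPy, for illustration):

```python
import numpy as np

dense_labels = np.array([[0, 1, 1, 0],
                         [1, 1, 0, 0],
                         [0, 0, 0, 1]])

# Each row becomes the list of class indices that are set.
sparse_labels = [np.nonzero(row)[0].tolist() for row in dense_labels]
print(sparse_labels)  # [[1, 2], [0, 1], [3]]
```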
Special Note:
To deal with the different lengths of the sparse labels, we pad them with a negative value,
so we can distinguish padding values from real labels. It will become:
padded_labels = pad_sequences(labels, value=-1)
padded_labels = [[-1, -1, -1, 1, 2],
                 [-1, -1, -1, 0, 1],
                 ...
                 [-1, -1, -1, -1, 999]]
It will have shape (num_samples, 5), which still saves space compared to dense labels.
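This padding step uses Keras' existing pad_sequences, whose default padding='pre' puts the fill values on the left, exactly as in the example above (maxlen=5 is passed explicitly here only because this toy list has just three rows):

```python
from keras.preprocessing.sequence import pad_sequences

labels = [[1, 2], [0, 1], [999]]

# value=-1 marks padding; the default padding='pre' fills on the left.
padded_labels = pad_sequences(labels, maxlen=5, value=-1)
print(padded_labels)
# [[ -1  -1  -1   1   2]
#  [ -1  -1  -1   0   1]
#  [ -1  -1  -1  -1 999]]
```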
Changes
- Add implementation for the loss and metric
- Add an example at examples/multi_hot_sparse_categorical_crossentropy.py (see the usage sketch after this list)
- Add unit tests for the loss and metric
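A hypothetical end-to-end usage sketch. The string loss name below is an assumption inferred from the example file's name; check examples/multi_hot_sparse_categorical_crossentropy.py for the actual import path and spelling:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense
from keras.preprocessing.sequence import pad_sequences

num_classes = 3000

model = Sequential([
    Conv2D(16, 3, input_shape=(3, 32, 32), data_format='channels_first'),
    Flatten(),
    Dense(num_classes, activation='softmax'),
])

# Assumed loss name, taken from the example file's name.
model.compile(optimizer='sgd',
              loss='multi_hot_sparse_categorical_crossentropy')

x = np.random.random((8, 3, 32, 32))
y = pad_sequences([[1, 2], [0, 1]] * 4, maxlen=5, value=-1)
model.fit(x, y, epochs=1)
```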
Performance:
~3x faster for computing the loss with 3000 classes and multi-label sparse labels (at most 5 labels per sample).
('categorical crossentropy loss time per epoch:', 0.7335219383239746)
('multi hot sparse categorical crossentropy loss time per epoch:', 0.23781204223632812)
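The timings above are wall-clock seconds per epoch (printed as Python 2 tuples). A minimal sketch of how such a per-epoch comparison could be taken; dense_model, sparse_model, and the label arrays are placeholders assumed to be set up as in the earlier examples:

```python
import time

def time_one_epoch(model, x, y):
    # Wall-clock time for a single training epoch.
    start = time.time()
    model.fit(x, y, epochs=1, verbose=0)
    return time.time() - start

# Hypothetical comparison: standard loss on dense labels vs. the new
# loss on padded sparse labels.
# print('categorical crossentropy loss time per epoch:',
#       time_one_epoch(dense_model, x, dense_labels))
# print('multi hot sparse categorical crossentropy loss time per epoch:',
#       time_one_epoch(sparse_model, x, padded_labels))
```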
PR Overview
[x] This PR requires new unit tests (tests are included)
[x] This PR requires updates to the documentation (docs are up-to-date)