We would like to use these issues to gauge user interest.
Sparse cross entropy computes the cross-entropy loss directly from integer class indices, without one-hot encoding the target classes. This is useful for language modeling, where the targets span the entire vocabulary; one-hot encoding such a large class space would not be memory efficient.
It is possible to make a custom implementation of sparse cross entropy computation with dlarray.
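One way such a custom implementation could look, as a minimal sketch: the function name `sparseCrossEntropy` and its interface are hypothetical, and it assumes the predictions are an unformatted C-by-N `dlarray` of probabilities (e.g. softmax output) with targets given as integer class indices.

```matlab
function loss = sparseCrossEntropy(Y, targets)
% Hypothetical sketch of sparse cross entropy with dlarray.
% Y       : C-by-N dlarray of class probabilities (columns sum to 1)
% targets : 1-by-N row vector of integer class indices in 1..C
N = size(Y, 2);
% Linear indices of the target entry in each column, so the target
% probabilities can be gathered without building a one-hot matrix.
idx = sub2ind(size(Y), targets, 1:N);
% Mean negative log-likelihood over the batch.
loss = -mean(log(Y(idx)));
end
```

Because the loss is assembled from indexing, `log`, and `mean`, it should remain traceable by `dlgradient` inside a `dlfeval` model-loss function, while only ever materializing N target probabilities rather than a C-by-N one-hot array.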