matlab-deep-learning / transformer-models

Deep Learning Transformer models in MATLAB

Provide sparse cross entropy implementation #17

Open bwdGitHub opened 2 years ago

bwdGitHub commented 2 years ago

We would like to use these issues to gauge user interest.

Sparse cross entropy computes the cross-entropy loss directly from integer class indices, without one-hot encoding the target classes. This is useful for language modeling, where the target classes span the entire vocabulary: one-hot encoding such a large space is not memory efficient.
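To make the equivalence concrete, here is the standard relationship between the two forms. The symbols are introduced only for illustration: $N$ observations, $V$ classes, predicted probabilities $p_{i,c}$, one-hot targets $y_{i,c}$, and integer target indices $t_i$.

$$
L_{\text{one-hot}} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{V} y_{i,c}\,\log p_{i,c},
\qquad
L_{\text{sparse}} = -\frac{1}{N}\sum_{i=1}^{N} \log p_{i,t_i}
$$

Since $y_{i,c} = 1$ only when $c = t_i$ and is $0$ otherwise, the inner sum collapses to a single term and the two losses are equal. The one-hot form stores $N \times V$ target entries, while the sparse form stores only $N$ indices.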

A custom implementation of sparse cross entropy can be written with dlarray.
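As a rough illustration of what such a custom implementation might look like, here is a minimal sketch. It is not code from this repository: the function name `sparseCrossEntropy`, the argument names, and the layout (an unformatted `V`-by-`N` `dlarray` of logits with a plain numeric vector of class indices) are all assumptions made for the example.

```matlab
function loss = sparseCrossEntropy(logits, targets)
% sparseCrossEntropy  Hypothetical helper, not part of this repository.
%   logits  - V-by-N unformatted dlarray (or numeric array) of unnormalized
%             scores, V = number of classes, N = number of observations.
%   targets - vector of N integer class indices in 1..V (plain numeric,
%             not a dlarray, and not one-hot encoded).

[V, N] = size(logits);

% Numerically stable log-softmax over the class dimension.
shifted  = logits - max(logits, [], 1);
logProbs = shifted - log(sum(exp(shifted), 1));

% Select the log-probability of the target class in each column
% using linear indices, so no V-by-N one-hot matrix is built.
idx = sub2ind([V, N], reshape(targets, 1, N), 1:N);

% Mean negative log-likelihood over the N observations.
loss = -mean(logProbs(idx));
end
```

Under these assumptions, a call such as `loss = sparseCrossEntropy(dlarray(randn(5,3)), [2 5 1])` returns a scalar loss. A formatted `dlarray` (one with dimension labels) would probably need its labels handled or stripped before indexing this way.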