songeater closed this issue 5 years ago
The attention concept can be used for real-valued targets as well, although the implementation in this repo does not support that. If you want to adapt it, just changing the softmax to another activation may not be enough; look carefully at every place where `yt`, `ytm`, and `y0` are used (most likely in `step` and `get_initial_state`) and change them appropriately. In the current implementation there is an argmax, and then the embeddings of the one-hot vectors are used; for real-valued targets you would feed the previous output back directly, as sketched below.
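For illustration, the core of the change might look something like this (a minimal sketch, not code from this repo; `embeddings` and all shapes are assumptions for the example):

```python
import tensorflow as tf

def categorical_feedback(ytm, embeddings):
    """Current scheme: softmax output -> argmax -> embedding of the one-hot token."""
    idx = tf.argmax(ytm, axis=-1)      # most likely token id
    return tf.gather(embeddings, idx)  # its embedding vector becomes the next input

def real_valued_feedback(ytm):
    """Real-valued scheme: the previous output is fed back as-is."""
    return ytm

vocab, emb_dim = 10, 4
embeddings = tf.random.normal([vocab, emb_dim])
ytm_cat = tf.nn.softmax(tf.random.normal([2, vocab]))  # batch of 2 softmax outputs
print(categorical_feedback(ytm_cat, embeddings).shape)  # (2, 4)

ytm_real = tf.random.normal([2, 1])            # e.g. one audio sample per step
print(real_valued_feedback(ytm_real).shape)    # (2, 1)
```

The same presumably applies in `get_initial_state`: `y0` would become a real-valued zero vector of the target dimension rather than the embedding of a start token.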
Thank you - yes, the argmax/one-hot embeddings definitely have to be handled. Will post my modifications - if I am able to make them!
Hi - it seems that the original paper and this implementation address a target output that is one-hot encoded. To have this work with targets/y-values that are real numbers (I use LSTMs to experiment with non-quantized audio), would I just have to change the softmax activation in the `yt` calculation of the `step()` function, e.g. change the activation to sigmoid or tanh, as in the sketch below?
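To be concrete, here is roughly the change I'm imagining (a sketch only; `W_o` and `b_o` are placeholder names for the output projection, not necessarily what this repo uses):

```python
from tensorflow.keras import backend as K

st = K.random_normal((2, 64))   # stand-in for the decoder state at one step
W_o = K.random_normal((64, 1))  # projecting down to a single real value
b_o = K.zeros((1,))

logits = K.dot(st, W_o) + b_o
yt = K.tanh(logits)        # targets normalized to [-1, 1]
# yt = K.sigmoid(logits)   # targets in [0, 1]
# yt = logits              # unbounded targets, paired with an MSE loss
```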
Or does the attention concept, as described in the paper, not work with real-valued targets/outputs? I know that LSTM models typically function best with quantized data / one-hot vectors squashed with a softmax function... but real output is what I am playing with. This is neat work... thanks!