tensorflow / swift-models

Models and examples built with Swift for TensorFlow
Apache License 2.0

[WordSeg] Fix dropout issues #506

Closed: sgugger closed this issue 4 years ago

sgugger commented 4 years ago

Currently, dropout is not applied to the embeddings in the encoder and the decoder because it breaks AD (automatic differentiation), or at least that's what the comments say. See here and there.

sgugger commented 4 years ago

So in both cases, the dropout should be applied to the matrix of hidden states (I think S4TF returns a list of hidden states rather than a single matrix of shape batchSize x sequenceLength x hiddenDimension).
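To make the suggestion concrete, here is a minimal sketch of inverted dropout applied to a list of hidden states, one per time step. It is written in plain Swift with nested arrays and deliberately does not use the S4TF `Tensor` or `Dropout` APIs. The names `LCG` and `dropout(_:probability:using:)` are hypothetical and are not taken from swift-models or from the fix in the linked PR.

```swift
// Hypothetical helper, not part of Swift for TensorFlow or swift-models.
// A tiny linear congruential generator so runs are reproducible.
struct LCG: RandomNumberGenerator {
    var state: UInt64
    mutating func next() -> UInt64 {
        state = state &* 6364136223846793005 &+ 1442695040888963407
        return state
    }
}

/// Applies inverted dropout to every element of a list of hidden states.
/// `states[t][i]` is unit `i` of the hidden state at time step `t`.
/// Each element is zeroed with probability `p`; survivors are scaled by
/// 1 / (1 - p) so the expected value of each element is unchanged.
func dropout<G: RandomNumberGenerator>(
    _ states: [[Float]], probability p: Float, using rng: inout G
) -> [[Float]] {
    precondition(p >= 0 && p < 1, "probability must be in [0, 1)")
    let scale = 1 / (1 - p)
    var result = states
    for t in result.indices {
        for i in result[t].indices {
            let keep = Float.random(in: 0..<1, using: &rng) >= p
            result[t][i] = keep ? result[t][i] * scale : 0
        }
    }
    return result
}

// Usage: 3 time steps, 4 hidden units each (assumed toy sizes).
var rng = LCG(state: 42)
let hiddenStates: [[Float]] = Array(repeating: [1, 1, 1, 1], count: 3)
let dropped = dropout(hiddenStates, probability: 0.5, using: &rng)
print(dropped)
```

Because the mask is sampled independently per element, it makes no difference here whether the hidden states are stored as a list or stacked into one tensor. With real tensors, stacking first would simply let a single dropout call cover all time steps.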

dan-zheng commented 4 years ago

Dropout added in https://github.com/tensorflow/swift-models/pull/550.