tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0
15.5k stars 3.49k forks

a question about the code of universal transformer #1004

Closed. yinboc closed this issue 6 years ago

yinboc commented 6 years ago

https://github.com/tensorflow/tensor2tensor/blob/2f8423a7daf39c549fa4f87d369d3ff95e719e6c/tensor2tensor/models/research/universal_transformer_util.py#L1207

Is this supposed to be "(previous_state * (1 - update_weights))" instead of "(previous_state * 1 - update_weights)"?

Thanks.

akikaaa commented 6 years ago

I think it makes sense for it to be new_state = transformed_state * update_weights + previous_state, given that update_weights is just p_t^n (in some sense) and previous_state is just the sum of the previous p_t^n * s_t^n terms.
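
To make the difference between the three candidate expressions concrete, here is a minimal NumPy sketch. The toy values are made up for illustration and are not taken from the tensor2tensor code; only the variable names follow the thread. It assumes the line in question combines transformed_state * update_weights with the previous_state term, as the reply suggests.

```python
import numpy as np

# Hypothetical toy values, chosen only to illustrate the arithmetic.
previous_state = np.array([0.5, 2.0])
transformed_state = np.array([1.0, 4.0])
update_weights = np.array([0.25, 0.75])

# Unparenthesized form from the question: `*` binds tighter than `-`,
# so this evaluates as (previous_state * 1) - update_weights.
as_written = (transformed_state * update_weights
              + (previous_state * 1 - update_weights))

# Parenthesized form from the question: interpolates between the two states.
interpolated = (transformed_state * update_weights
                + previous_state * (1 - update_weights))

# Form suggested in the reply: previous_state is treated as a running
# weighted sum, so the new weighted term is simply added to it.
accumulated = transformed_state * update_weights + previous_state

print("as written:  ", as_written)
print("interpolated:", interpolated)
print("accumulated: ", accumulated)
```

With these values the three results all differ, so the missing parentheses change the computation rather than just the readability. Which of the last two forms is intended depends on whether previous_state already holds a weighted running sum, as the reply argues.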