How can simple_gru share all of its parameters? How do I specify whether it returns a sequence?

PaddlePaddle / Paddle

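PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the core framework of 『飞桨』 (PaddlePaddle): high-performance single-machine and distributed training, plus cross-platform deployment, for deep learning and machine learning)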

http://www.paddlepaddle.org/

Apache License 2.0


How can simple_gru share all of its parameters? How do I specify whether it returns a sequence? #10443

Closed: yttbgf closed this issue 6 years ago

yttbgf commented 6 years ago

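I need to implement a shared sub-network that is applied separately to different data slices with the same structure. The sub-network contains a GRU layer that returns a sequence, followed by a GRU layer that returns a non-sequence output. How can I implement this?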

Yancey1989 commented 6 years ago

You can share the parameters by setting gru_param_attr and gru_bias_attr. For reference, see: https://github.com/PaddlePaddle/Paddle/blob/bb3247e33973ca02d900421e7f823214f4b0a067/python/paddle/trainer_config_helpers/tests/configs/shared_gru.py#L24-L33

As for the question about returning sequences, I will have to ask @qingqing01.
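To make the idea of sharing concrete, here is a minimal pure-Python sketch of what it means for two data slices to pass through the same GRU parameters. It is not PaddlePaddle code: every helper name (`make_gru_params`, `gru_forward`, `shared_subnet`) and the `return_seq` flag on a unidirectional GRU are assumptions made up for illustration, and the gate equations follow one common GRU convention. In an actual config, the sharing would come from passing the same gru_param_attr and gru_bias_attr to each simple_gru call, as in the linked test file.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def make_gru_params(input_size, hidden_size, seed):
    """Create one set of GRU weights (hypothetical helper, not a Paddle API)."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        gate: {
            "W": rand_matrix(hidden_size, input_size),   # input weights
            "U": rand_matrix(hidden_size, hidden_size),  # recurrent weights
            "b": [0.0] * hidden_size,                    # bias
        }
        for gate in ("update", "reset", "candidate")
    }


def gate_input(p, x, h):
    """Compute W*x + U*h + b for one gate."""
    return [a + b + c for a, b, c in zip(matvec(p["W"], x), matvec(p["U"], h), p["b"])]


def gru_forward(params, xs, return_seq):
    """Run a GRU over xs; return every hidden state or only the last one."""
    h = [0.0] * len(params["update"]["b"])
    outputs = []
    for x in xs:
        z = [sigmoid(v) for v in gate_input(params["update"], x, h)]
        r = [sigmoid(v) for v in gate_input(params["reset"], x, h)]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        c = [math.tanh(v) for v in gate_input(params["candidate"], x, reset_h)]
        h = [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]
        outputs.append(h)
    return outputs if return_seq else outputs[-1]


def shared_subnet(gru1_params, gru2_params, xs):
    """Sequence-returning GRU followed by a GRU that keeps only its last step."""
    seq = gru_forward(gru1_params, xs, return_seq=True)
    return gru_forward(gru2_params, seq, return_seq=False)


# One parameter set per layer, created once and reused for every slice.
gru1_params = make_gru_params(input_size=3, hidden_size=4, seed=1)
gru2_params = make_gru_params(input_size=4, hidden_size=4, seed=2)

slice_a = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
slice_b = [[0.9, 0.8, 0.7], [0.6, 0.5, 0.4], [0.3, 0.2, 0.1]]

feature_a = shared_subnet(gru1_params, gru2_params, slice_a)
feature_b = shared_subnet(gru1_params, gru2_params, slice_b)
```

Both slices produce a fixed-size feature vector regardless of their length, and a gradient update to `gru1_params` or `gru2_params` would affect both branches, which is the point of sharing.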

qingqing01 commented 6 years ago

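@yttbgf The GRU layer itself already returns a sequence (for example, if the input is N sentences, the output is also N sentences). My understanding is that a "non-sequence" GRU means adding something like first/last instance or sequence_pooling (max/avg) on top of it.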
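As a rough illustration of the reductions mentioned above, the sketch below shows how a sequence of GRU outputs (one vector per time step) can be collapsed into a single vector. The function names are hypothetical and only mimic what first/last instance and max/avg sequence pooling compute; they are not PaddlePaddle layer names.

```python
def first_step(seq):
    """Keep only the first time step."""
    return seq[0]


def last_step(seq):
    """Keep only the last time step."""
    return seq[-1]


def max_pool(seq):
    """Element-wise maximum over time."""
    return [max(column) for column in zip(*seq)]


def avg_pool(seq):
    """Element-wise mean over time."""
    return [sum(column) / len(seq) for column in zip(*seq)]


# Toy GRU output: 3 time steps, hidden size 2 (values are made up).
gru_out = [[1.0, 4.0], [3.0, 2.0], [5.0, 0.0]]

print(first_step(gru_out))  # [1.0, 4.0]
print(last_step(gru_out))   # [5.0, 0.0]
print(max_pool(gru_out))    # [5.0, 4.0]
print(avg_pool(gru_out))    # [3.0, 2.0]
```

Each reduction turns a variable-length sequence into one vector of the hidden size, which is what a following non-recurrent layer expects.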

yttbgf commented 6 years ago

@qingqing01 Then why does bidirectional_gru have a return_seq parameter to control whether a sequence is returned?

qingqing01 commented 6 years ago

The documentation at http://www.paddlepaddle.org/docs/develop/api/en/config/networks.html#bidirectional-gru explains it:

return_seq (bool) – If set False, the last time step of output are concatenated and returned. If set True, the entire output sequences in forward and backward directions are concatenated and returned.

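Returning a sequence: you can keep stacking RNN layers on top of it. A multi-layer bidirectional RNN, for example, needs the sequence to be returned.

Returning a non-sequence: here the last time step of forward_gru and the first time step of backward_gru are concatenated and returned. This is an aggregation (or pooling) operation over the sentence that turns the output into a non-sequence. The result can, for example, represent the features learned for the whole sentence and be fed directly into a classifier.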
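The two return_seq behaviours described above can be sketched in a few lines of plain Python. This is an illustration of the described semantics, not the library implementation: `bidirectional_output` is a hypothetical helper, and it assumes the backward outputs are indexed by input position, so index 0 holds the step the backward GRU computed last.

```python
def bidirectional_output(fwd_seq, bwd_seq, return_seq):
    """Combine forward and backward GRU outputs (lists of per-step vectors)."""
    if return_seq:
        # Concatenate the two directions at every time step.
        return [f + b for f, b in zip(fwd_seq, bwd_seq)]
    # Final state of each direction: last forward step, first backward step.
    return fwd_seq[-1] + bwd_seq[0]


# Toy outputs for 3 time steps with hidden size 1 (values are made up).
fwd = [[1.0], [2.0], [3.0]]
bwd = [[30.0], [20.0], [10.0]]

print(bidirectional_output(fwd, bwd, return_seq=True))
# [[1.0, 30.0], [2.0, 20.0], [3.0, 10.0]]
print(bidirectional_output(fwd, bwd, return_seq=False))
# [3.0, 30.0]
```

With return_seq=False the result has no time dimension, so it can go straight into a fully connected or classification layer.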

Yancey1989 commented 6 years ago

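Since there has been no update for a long time, I am closing this issue for now. If you have further feedback, feel free to reopen it at any time. Thanks!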