yikangshen / Ordered-Neurons

Code for the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks"
https://arxiv.org/pdf/1810.09536.pdf
BSD 3-Clause "New" or "Revised" License

about chunk_size, #21

Open zwd13122889 opened 4 years ago

zwd13122889 commented 4 years ago

What does chunk_size mean?

shawntan commented 4 years ago

From the paper:

As the master gates only focus on coarse-grained control, modeling them with the same dimensions as the hidden states is computationally expensive and unnecessary. In practice, we set \tilde{f}_t and \tilde{i}_t to be D/C dimensional vectors, where D is the dimension of hidden state, and C is a chunk size factor. We repeat each dimension C times, before the element-wise multiplication with f_t and i_t. The downsizing significantly reduces the number of extra parameters that we need to add to the LSTM. Therefore, every neuron within each C-sized chunk shares the same master gates.
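
The "repeat each dimension C times" step in the quoted passage can be illustrated with a small sketch. This is a simplified pure-Python illustration, not the repository's code: the function names `expand_master_gate` and `apply_master_gate` and the example values are made up for this example, and the real ON-LSTM update combines the master gates with the standard gates in a more involved way than a single product.

```python
def expand_master_gate(master_gate, chunk_size):
    """Repeat each entry of a D/C-dimensional master gate chunk_size times.

    The result has D = len(master_gate) * chunk_size entries, so every
    neuron inside one chunk sees the same master gate value.
    """
    return [g for g in master_gate for _ in range(chunk_size)]


def apply_master_gate(master_gate, gate, chunk_size):
    """Element-wise product of the expanded master gate with a D-dimensional gate.

    Hypothetical helper for illustration only; it shows the shape handling,
    not the full ON-LSTM cell update.
    """
    expanded = expand_master_gate(master_gate, chunk_size)
    if len(expanded) != len(gate):
        raise ValueError("len(gate) must equal len(master_gate) * chunk_size")
    return [m * g for m, g in zip(expanded, gate)]


hidden_size = 6   # D, the hidden state dimension (assumed value)
chunk_size = 3    # C, the chunk size factor (assumed value)

# The master gate only has D / C = 2 entries.
master_forget = [0.0, 1.0]
# A standard gate has one entry per hidden unit, so D = 6 entries.
forget_gate = [0.5, 0.5, 0.5, 0.25, 0.25, 0.25]

expanded = expand_master_gate(master_forget, chunk_size)
combined = apply_master_gate(master_forget, forget_gate, chunk_size)

print(expanded)  # [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
print(combined)  # [0.0, 0.0, 0.0, 0.25, 0.25, 0.25]
```

With D = 6 and C = 3, the master gate is computed for only 2 values instead of 6, which is where the parameter saving described in the paper comes from. A larger chunk_size means fewer master gate values and coarser control, since more neurons share each value.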