NLP-1217 closed 4 years ago
Apologies for the late response. The authors do experiment with different levels of connectivity (i.e., to previous nodes) in their paper. However, this repository currently contains only a single version of the model they proposed, namely the one with the history parameter set to (0, 0), so no previous nodes are used.
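For illustration only, here is a minimal sketch of what a nonzero history could look like when building the inputs to a layer. It is not code from this repository or from the paper. The helper names (`history_window`, `layer_inputs`), the use of concatenation, and the zero padding at the start of a dialogue are all assumptions made for the example.

```python
def history_window(items, i, d, pad):
    """Return items[i-d .. i], using `pad` for positions before the start."""
    return [items[j] if j >= 0 else pad for j in range(i - d, i + 1)]


def layer_inputs(vectors, d):
    """Build one input per position by concatenating the current vector
    with its d predecessors (assumption: zero vectors pad the start)."""
    pad = [0.0] * len(vectors[0])
    inputs = []
    for i in range(len(vectors)):
        window = history_window(vectors, i, d, pad)
        # Flatten the window into a single concatenated vector.
        inputs.append([x for vec in window for x in vec])
    return inputs


# Hypothetical utterance representations, one 2-dimensional vector each.
utterances = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# History 0: each position sees only itself, as in the (0, 0) setting.
no_history = layer_inputs(utterances, 0)

# History 1: each position also sees the previous vector.
with_history = layer_inputs(utterances, 1)
```

Under these assumptions, a (1, 1) setting would apply `layer_inputs(..., 1)` twice: once to the utterance vectors before the first fully connected layer, and once to that layer's outputs before the second. With (0, 0), both calls return their input unchanged, which matches the per-utterance behavior described above.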
Hi, I have some questions about this model. I read in the paper that the author uses two fully connected layers to predict the DA, and that he chose (1, 1) as the history distance for prediction. Could you please tell me where this is implemented in your code? As far as I can tell, the code does not take this part into account.