LorrinWWW / two-are-better-than-one

Code associated with the paper **Two are Better Than One: Joint Entity and Relation Extraction with Table-Sequence Encoders**, at EMNLP 2020

Custom dataset - change from MD-RNN to CNN #23

Open zoezyn opened 2 years ago

zoezyn commented 2 years ago

Hello! I am trying your model on my tiny dataset. Would it be possible for you to publish or send me a version of the model that uses a CNN instead of the MD-RNN? Since a CNN is enough for my small dataset, I would like to reduce the complexity of the model.

LorrinWWW commented 2 years ago

If you want to reduce memory usage and improve training speed, I would recommend reducing the hidden size "hidden_dim" (e.g. to 100) and the number of layers "num_layers" (e.g. to 2).
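As a rough illustration of why these two settings matter, the sketch below counts the parameters of a standard stacked GRU for two configurations. This is only a proxy: the repository's MD-RNN is a custom cell whose exact parameter count will differ, the function name `gru_param_count` is hypothetical, and the 200/3 baseline is an arbitrary comparison point, not the repository's default.

```python
def gru_param_count(hidden_dim: int, num_layers: int) -> int:
    """Parameter count of a standard stacked GRU (PyTorch convention).

    Assumes the input size equals hidden_dim for every layer. Each layer has
    3 gates, each with an input weight (h x h), a recurrent weight (h x h),
    and two bias vectors (h each).
    """
    per_layer = 3 * (hidden_dim * hidden_dim + hidden_dim * hidden_dim + 2 * hidden_dim)
    return per_layer * num_layers


# Hypothetical larger configuration vs. the suggested smaller one.
large = gru_param_count(hidden_dim=200, num_layers=3)
small = gru_param_count(hidden_dim=100, num_layers=2)
print(f"large: {large}, small: {small}, ratio: {large / small:.1f}x")
```

The count grows quadratically with the hidden size and linearly with the number of layers, so halving "hidden_dim" usually saves more than removing a layer.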

Moreover, the current code uses a bidirectional MDRNN by default, as shown in "./models/joint_models.py": https://github.com/LorrinWWW/two-are-better-than-one/blob/b09a0f9fb1dff1ef6d4f4ade005099488e3b8111/models/joint_models.py#L183 To accelerate training, you can change direction="B" to direction="" to use a unidirectional RNN ("direction" can be "", "B", or "Q", which select a unidirectional, bidirectional, or quad-directional RNN, respectively).

If you still want to use a CNN instead of the MDRNN, you can write your own table encoder and replace the MDRNN-based one with it: https://github.com/LorrinWWW/two-are-better-than-one/blob/b09a0f9fb1dff1ef6d4f4ade005099488e3b8111/models/joint_models.py#L160-L161 If you are not in a hurry, I will look into this in the near future. Thank you for your understanding!
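A minimal sketch of what such a replacement table encoder could look like in PyTorch is given below. It is not code from the repository: the class name `CNNTableEncoder`, its constructor arguments, the optional `mask` argument, and the assumed table layout `(batch, N, N, hidden_dim)` are all assumptions, so the interface would have to be adapted to whatever the existing MDRNN-based encoder actually receives and returns.

```python
import torch
import torch.nn as nn


class CNNTableEncoder(nn.Module):
    """Hypothetical CNN table encoder; input and output share the same shape."""

    def __init__(self, hidden_dim: int, num_layers: int = 2, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2  # keeps the N x N size for odd kernel sizes
        self.convs = nn.ModuleList(
            [nn.Conv2d(hidden_dim, hidden_dim, kernel_size, padding=padding)
             for _ in range(num_layers)]
        )
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, table: torch.Tensor, mask: torch.Tensor = None) -> torch.Tensor:
        # table: (batch, N, N, hidden_dim) -- assumed layout
        # mask:  optional (batch, N, N), 1 for valid cells and 0 for padding
        x = table.permute(0, 3, 1, 2)  # Conv2d expects (batch, channels, H, W)
        for conv in self.convs:
            if mask is not None:
                x = x * mask.unsqueeze(1).to(x.dtype)  # zero out padded cells
            x = x + torch.relu(conv(x))  # residual connection around each conv
        x = x.permute(0, 2, 3, 1)  # back to (batch, N, N, hidden_dim)
        return self.norm(x)


encoder = CNNTableEncoder(hidden_dim=16, num_layers=2)
dummy_table = torch.randn(2, 5, 5, 16)
dummy_mask = torch.ones(2, 5, 5)
out = encoder(dummy_table, dummy_mask)
print(out.shape)
```

Keeping the output shape identical to the input shape is a deliberate choice here, so that the rest of the model could in principle consume the result unchanged. Unlike an MD-RNN, each convolution only sees a local neighbourhood of cells, so the receptive field grows with the kernel size and the number of layers.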

zoezyn commented 2 years ago

Thank you very much for the suggestions! I would still like to try the CNN. I am looking into the code. I think I only need to change the module in gru.py, right?