zoezyn opened this issue 2 years ago (status: Open)
If you want to reduce memory usage and improve training speed, I would recommend reducing the hidden size "hidden_dim" (e.g. to 100) and the number of layers "num_layers" (e.g. to 2).
Moreover, the current code uses a bidirectional MDRNN by default, as shown in "./models/joint_models" (https://github.com/LorrinWWW/two-are-better-than-one/blob/b09a0f9fb1dff1ef6d4f4ade005099488e3b8111/models/joint_models.py#L183). To accelerate training, you can change direction="B" to direction="" so that a unidirectional RNN is used. "direction" can be "", "B", or "Q", which select a unidirectional, bidirectional, or quad-directional RNN respectively.
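As a rough illustration of the settings mentioned above, here is a minimal sketch of a lighter configuration. The three "direction" values and the example sizes come from this reply; the dictionary, the helper name `num_scan_directions`, and the way the options are grouped are hypothetical and do not reflect how the repository actually passes its arguments.

```python
# Direction flags as described above:
# "" = unidirectional, "B" = bidirectional, "Q" = quad-directional.
DIRECTIONS = {"": 1, "B": 2, "Q": 4}


def num_scan_directions(direction: str) -> int:
    """Return how many scan directions a flag implies (hypothetical helper)."""
    if direction not in DIRECTIONS:
        raise ValueError(f'direction must be "", "B", or "Q", got {direction!r}')
    return DIRECTIONS[direction]


# Hypothetical grouping of the lighter settings suggested in this thread.
light_config = {"hidden_dim": 100, "num_layers": 2, "direction": ""}

print(num_scan_directions(light_config["direction"]))
```

Fewer scan directions mean fewer recurrent passes over the table, which is why switching from "B" to "" should speed up training.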
If you still want to use a CNN instead of the MDRNN, you can write your own table encoder and replace the MDRNN-based one with it (https://github.com/LorrinWWW/two-are-better-than-one/blob/b09a0f9fb1dff1ef6d4f4ade005099488e3b8111/models/joint_models.py#L160-L161). If you are not in a hurry, I will look into this in the near future. Thank you for your understanding!
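For anyone attempting the replacement in the meantime, the following is only a sketch of what a convolutional table encoder could look like in PyTorch. It is not code from this repository. It assumes the table representation is a tensor of shape (batch, length, length, hidden_dim) and that the encoder should return the same shape; the class name `CNNTableEncoder`, its arguments, and the optional `mask` input are all hypothetical, and the real MDRNN-based module may take different inputs (for example, recurrent states), so the call site would need to be adapted.

```python
import torch
import torch.nn as nn


class CNNTableEncoder(nn.Module):
    """Hypothetical CNN table encoder: (B, T, T, H) -> (B, T, T, H)."""

    def __init__(self, hidden_dim: int = 100, num_layers: int = 2, kernel_size: int = 3):
        super().__init__()
        # With an odd kernel size, this padding keeps the table size unchanged.
        padding = kernel_size // 2
        self.convs = nn.ModuleList(
            [nn.Conv2d(hidden_dim, hidden_dim, kernel_size, padding=padding)
             for _ in range(num_layers)]
        )
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, table, mask=None):
        # Conv2d expects channels first: (B, H, T, T).
        x = table.permute(0, 3, 1, 2)
        for conv in self.convs:
            # Residual connection around each convolution.
            x = x + torch.relu(conv(x))
        # Back to channels last: (B, T, T, H).
        x = self.norm(x.permute(0, 2, 3, 1))
        if mask is not None:
            # mask: (B, T, T), 1 for valid cells and 0 for padding.
            x = x * mask.unsqueeze(-1).to(x.dtype)
        return x


encoder = CNNTableEncoder(hidden_dim=100, num_layers=2)
out = encoder(torch.randn(4, 12, 12, 100))
print(out.shape)
```

Unlike the MDRNN, each cell here only sees a local neighbourhood whose size grows with the number of layers, so long-range interactions across the table may need more layers or larger kernels.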
Thank you very much for the suggestions! I would still like to try the CNN. I am looking into the code. I think I only need to change the module in gru.py, right?
Hello! I am trying your model on my tiny dataset. Would it be possible for you to publish or send me a version of the model with a CNN instead of the MD-RNN? Since a CNN is enough for my small dataset, I would like to reduce the complexity of the model.