Some questions about model hyperparameter

xuuuluuu / SynLSTM-for-NER

Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021.

Some questions about model hyperparameter #1

Closed: PengShi27 closed this issue 2 years ago

PengShi27 commented 3 years ago

Hi, thank you for sharing your code. The default value of num_lstm_layer in the code is 0, which seems a little unreasonable to me. What value of num_lstm_layer did you use in your experiments? I get an error when I set num_lstm_layer to 1.

xuuuluuu commented 3 years ago

Hi, thanks for the question. Our code is designed to run all the baselines and our model at the same time.

Setting "num_lstm_layer" above 0 means you are also running the standard LSTM. The error occurs because you configured the model to use both Bi-LSTM and Syn-LSTM at the same time.

I suggest using the default configs, which correspond exactly to our proposed structure. If you want to test other customized model structures, you will need to make sure the combination of components is reasonable.

PengShi27 commented 3 years ago

Does this mean that Syn-LSTM is used when num_lstm_layer is set to 0?

xuuuluuu commented 3 years ago

That is right. Note that the flag --dep_model=dggcn (the default) is what invokes both the GCN and our Syn-LSTM model. The flag --num_lstm_layer is intended for running the standard LSTM baselines, and should stay at 0 (the default) when running our model.
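
To make the flag interaction above concrete, here is a minimal sketch of a configuration check. It is not taken from the repository: only the flag names `--dep_model` and `--num_lstm_layer` and their defaults (`dggcn` and `0`) come from this thread, while `build_parser`, `check_config`, and the error message are hypothetical names introduced for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the two flag names and their defaults
    # (dggcn and 0) are taken from the discussion above.
    parser = argparse.ArgumentParser(description="Illustrative config check")
    parser.add_argument("--dep_model", type=str, default="dggcn")
    parser.add_argument("--num_lstm_layer", type=int, default=0)
    return parser


def check_config(args: argparse.Namespace) -> str:
    """Label the selected structure, or raise on the conflicting combination."""
    if args.dep_model == "dggcn" and args.num_lstm_layer > 0:
        # Assumption: this mirrors the conflict described in the thread
        # (standard Bi-LSTM and Syn-LSTM enabled at the same time).
        raise ValueError(
            "num_lstm_layer > 0 enables the standard LSTM; "
            "keep it at 0 when dep_model is dggcn"
        )
    if args.dep_model == "dggcn":
        return "GCN + Syn-LSTM"
    # Other dep_model values are not described in this thread.
    return "other structure"


if __name__ == "__main__":
    print(check_config(build_parser().parse_args()))
```

Run with no arguments, the sketch reports the default GCN + Syn-LSTM structure; passing `--num_lstm_layer 1` alongside the default `--dep_model` raises the error instead of failing later inside the model.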

PengShi27 commented 3 years ago

Thanks for your reply!