Closed hit-joseph closed 6 years ago
One more question: in the file named demo.train.config there are some hyperparameters, cnn_layer=4 and char_hidden_dim=30. I can't really understand what these two parameters mean. Would you mind explaining them? Thanks.
You may refer to the configuration explanation here: https://github.com/jiesutd/NCRFpp/blob/master/readme/Configuration.md
Thank you. I have already read this file, so I just want to make sure: if I use LSTM as the word-sequence layer, cnn_layer is useless and can be annotated, and the parameter char_hidden_dim=50 means the feature i extract from word and dim is 50, and then joint it after word2vec and pos_vec?
Do you mean "ignored" rather than "annotated"? Yes, if you choose the LSTM to encode the word sequence then the settings of CNN can be ignored.
About char_hidden_dim=50: "the feature i extract from word and dim is 50" is right. As for "joint it after word2vec and pos_vec": it is concatenated with the word embeddings (not word2vec) and the feature embeddings.
If I set char_emb_dim=100 (this parameter is in the I/O part) and char_hidden_dim=50 (this parameter is in the Hyperparameters part), does that mean I input pretrained character embeddings of dimension 100, and after the character CNN layer I get character feature embeddings of dimension 50, which can then be concatenated with the word embeddings?
Exactly.
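To make the dimension flow confirmed above concrete, here is a minimal pure-Python sketch, not NCRF++'s actual code. It assumes a single convolution of width 3 with zero padding followed by max-pooling over character positions; the function name `char_cnn_features`, the word and POS embedding sizes (50 and 20), the random weights, and the concatenation order are all hypothetical choices made for illustration.

```python
import random

CHAR_EMB_DIM = 100     # char_emb_dim from the question
CHAR_HIDDEN_DIM = 50   # char_hidden_dim from the question
WORD_EMB_DIM = 50      # hypothetical size, for illustration only
POS_EMB_DIM = 20       # hypothetical size, for illustration only

def char_cnn_features(chars, char_emb, filters, bias):
    """Width-3 convolution over character embeddings, then max-pool over positions.

    filters holds one kernel per output dimension; each kernel is 3 rows of
    emb_dim weights. The result has len(filters) values, whatever the word length.
    """
    emb_dim = len(next(iter(char_emb.values())))
    pad = [0.0] * emb_dim
    seq = [pad] + [char_emb[c] for c in chars] + [pad]  # zero padding of 1
    pooled = []
    for kernel, b in zip(filters, bias):
        best = float("-inf")
        for i in range(len(chars)):
            window = seq[i:i + 3]
            act = b + sum(
                w * x
                for k_row, x_row in zip(kernel, window)
                for w, x in zip(k_row, x_row)
            )
            best = max(best, act)
        pooled.append(best)
    return pooled

rng = random.Random(0)

def rand_vec(n):
    return [rng.uniform(-0.1, 0.1) for _ in range(n)]

word = "Friday"
char_emb = {c: rand_vec(CHAR_EMB_DIM) for c in set(word)}
filters = [[rand_vec(CHAR_EMB_DIM) for _ in range(3)] for _ in range(CHAR_HIDDEN_DIM)]
bias = [0.0] * CHAR_HIDDEN_DIM

char_feat = char_cnn_features(word, char_emb, filters, bias)  # 50 values
word_vec = rand_vec(WORD_EMB_DIM)                             # stand-in word embedding
pos_vec = rand_vec(POS_EMB_DIM)                               # stand-in feature embedding

# Concatenate word embedding, feature embedding, and character features.
token_repr = word_vec + pos_vec + char_feat
print(len(char_feat), len(token_repr))  # prints: 50 120
```

The point of the sketch is only the shapes: each character enters as a 100-dimensional vector, the pooled output per word is 50-dimensional regardless of word length, and that vector is appended to the other per-token embeddings.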
Thank you very much!
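First, my respects to the author. I would like to apply this model to a Chinese sequence labeling problem. The data includes POS tags: does this conflict with the CNN character features? In your project, can hand-annotated features and CNN character features coexist? Also, I looked at the format of the preprocessed data: Friday [Cap]1 [POS]NNP O. Since I only use the POS feature, should the line be written as Friday [POS]NNP O? Is [POS] required, or do you just use it as a marker? I would appreciate your guidance.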
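For reference, the sample line quoted in this question can be read as a token, zero or more bracket-prefixed feature columns, and a final label. The sketch below shows that reading in Python. The names `parse_line` and `FEATURE_RE` are hypothetical, and this is not NCRF++'s own data loader, so it does not by itself settle whether the project requires the `[POS]` prefix.

```python
import re

# A feature column is assumed to look like "[Name]value", e.g. "[POS]NNP".
FEATURE_RE = re.compile(r"^\[([^\]]+)\](.*)$")

def parse_line(line):
    """Split one whitespace-separated line into (token, features, label)."""
    columns = line.split()
    if len(columns) < 2:
        raise ValueError("expected at least a token and a label: %r" % line)
    token, label = columns[0], columns[-1]
    features = {}
    for col in columns[1:-1]:
        match = FEATURE_RE.match(col)
        if match is None:
            raise ValueError("feature column lacks a [Name] prefix: %r" % col)
        features[match.group(1)] = match.group(2)
    return token, features, label

print(parse_line("Friday [Cap]1 [POS]NNP O"))
# ('Friday', {'Cap': '1', 'POS': 'NNP'}, 'O')
print(parse_line("Friday [POS]NNP O"))
# ('Friday', {'POS': 'NNP'}, 'O')
```

Under this reading the bracketed prefix is what names each feature column, so dropping the `[Cap]` column leaves a line with a single `POS` feature.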