You can compare the performance without the additional features, e.g. POS and NER; I think that is a fair comparison. This code now supports PyTorch 0.4.1. I didn't add any tricks beyond what DrQA does. If your settings and data preprocessing are unchanged, I suspect the performance drop is hidden somewhere in your code.
Thanks for your reply! I have compared the performance, and the additional features improve F1 by about 1% on validation. I will check my code again. Thanks very much for your help!
So sorry to bother you again... I adapted your code to read the CoQA dataset and got an F1 score of 62.8, which I think is quite good compared with BiDAF with self-attention. Since your code is written in PyTorch 0.3, I wanted to rewrite it on top of AllenNLP, but I only reach an F1 of 60.4 and can't improve it any further.

I have read the Mnemonic Reader config in your code and set the number of hops to 2. Also, I think the StackedRNN is equivalent to a plain PyTorch LSTM when num_layers=1, so I just use a plain LSTM. For embeddings, I use GloVe for words without character embeddings, because in my experiments character-level GloVe performed about the same as a simple CNN. I embed the NER and POS tags of each word into vectors of length 20, and the lemma and exact-match features into vectors of length 5. Finally, I append the TF feature to the end of each word vector.

I have tried to keep my code as close to yours as possible, so I wanted to ask whether there are any tricks I might have missed. Sorry about that, and thank you so much!
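For reference, here is a rough sketch of how I build the input representation (simplified; the module names and exact dimensions, e.g. 300-d GloVe, are just illustrative and not my actual code):

```python
import torch
import torch.nn as nn


class FeatureAugmentedEmbedder(nn.Module):
    """Sketch of the embedding setup described above: GloVe word vectors,
    POS/NER embeddings of size 20, lemma/exact-match embeddings of size 5,
    and the scalar term-frequency (TF) feature appended at the end."""

    def __init__(self, word_vectors, num_pos_tags, num_ner_tags):
        super().__init__()
        # Frozen pretrained GloVe word embeddings (assumed 300-d here).
        self.word_emb = nn.Embedding.from_pretrained(word_vectors, freeze=True)
        # POS and NER tag embeddings, each of size 20.
        self.pos_emb = nn.Embedding(num_pos_tags, 20)
        self.ner_emb = nn.Embedding(num_ner_tags, 20)
        # Lemma-match and exact-match binary flags embedded with size 5.
        self.lemma_emb = nn.Embedding(2, 5)
        self.em_emb = nn.Embedding(2, 5)

    def forward(self, words, pos, ner, lemma_match, exact_match, tf):
        # words, pos, ner, lemma_match, exact_match: (batch, seq_len) index tensors
        # tf: (batch, seq_len) float tensor with the normalized TF feature
        parts = [
            self.word_emb(words),
            self.pos_emb(pos),
            self.ner_emb(ner),
            self.lemma_emb(lemma_match),
            self.em_emb(exact_match),
            tf.unsqueeze(-1),  # append the scalar TF feature last
        ]
        # Concatenate along the feature dimension:
        # 300 + 20 + 20 + 5 + 5 + 1 = 351 with 300-d GloVe.
        return torch.cat(parts, dim=-1)
```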