kamalkraj / Named-Entity-Recognition-with-Bidirectional-LSTM-CNNs

GNU General Public License v3.0
Different understanding about the original paper #4

Open lanzhuzhu opened 6 years ago

lanzhuzhu commented 6 years ago

In the README, you said: “The model produces a test F1_score of 90.9 % with ~70 epochs. The results produced in the paper for the given architecture is 91.14”. In fact, the paper says the 91.14 result was produced under the condition "All other hyper-parameters and features remain the same as our best model in Table 5", that is, the lex feature was used. Since you do not use that feature, this architecture cannot reach 91.14.

kamalkraj commented 6 years ago

Correct. I didn't use any lex feature, so the model can't reach 91.14.

davidsbatista commented 6 years ago

Did you use Viterbi to decode the best sequence? That might also explain the different results.

shuiyueche commented 5 years ago

> Did you use Viterbi to do the decoding of the best sequence ? That might also explain the different results

If I am not wrong, the code here does not include a transition matrix over the tags, so there is no need to apply Viterbi here. But that is also a big difference from the paper.
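To make the point about the transition matrix concrete, here is a minimal sketch in plain Python. It is not code from this repository: the function names, the three-tag set (O, B-PER, I-PER), and all scores are made up for illustration. It compares per-token argmax, which is what a softmax output layer without transitions amounts to, with Viterbi decoding over the same emission scores plus a tag-transition matrix.

```python
def greedy_decode(emissions):
    """Pick the highest-scoring tag independently at each token."""
    return [max(range(len(row)), key=row.__getitem__) for row in emissions]


def viterbi_decode(emissions, transitions):
    """Return the tag path maximising the sum of emission scores plus
    transitions[prev_tag][tag] between consecutive tokens."""
    n_tags = len(emissions[0])
    score = list(emissions[0])      # best path score ending in each tag
    backptr = []                    # backptr[t][j]: best previous tag for tag j
    for row in emissions[1:]:
        prev_best, new_score = [], []
        for j in range(n_tags):
            i_best = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            prev_best.append(i_best)
            new_score.append(score[i_best] + transitions[i_best][j] + row[j])
        backptr.append(prev_best)
        score = new_score
    # Follow the back-pointers from the best final tag.
    last = max(range(n_tags), key=score.__getitem__)
    path = [last]
    for prev_best in reversed(backptr):
        last = prev_best[last]
        path.append(last)
    path.reverse()
    return path


# Hypothetical tag ids: 0 = O, 1 = B-PER, 2 = I-PER.
emissions = [[2.0, 1.0, 0.0],   # token 1: O scores highest on its own
             [1.0, 0.0, 2.5]]   # token 2: I-PER scores highest on its own
# Assumed transition scores: O -> I-PER is heavily penalised, the rest are neutral.
transitions = [[0.0, 0.0, -10.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]

print(greedy_decode(emissions))                 # [0, 2]  i.e. O, I-PER
print(viterbi_decode(emissions, transitions))   # [1, 2]  i.e. B-PER, I-PER
```

With no transition scores (all zeros) the two decoders agree, which is why Viterbi adds nothing to a model that only has a per-token softmax. The difference shows up only once transitions are learned as part of the model.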