aneesh-joshi / LSTM_POS_Tagger

A simple POS tagger built with a bidirectional LSTM in Keras, trained on the Brown Corpus

Padding without masking #8


rrsayao commented 6 years ago

I noticed you're padding your sequences but never setting mask_zero=True in your embedding layer. Doesn't this mean your reported accuracy is largely based on correctly guessing where the padding is?

If I'm right, the model could predict the output for the sequence [0, 0, 0, ..., "no"] as [0, 0, 0, ..., "yes"] and still reach 99% accuracy, since almost every position is padding.
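To make this concrete, here's a toy sketch (the lengths and tag indices are invented, not taken from this repo) of how unmasked padding inflates token-level accuracy:

```python
import numpy as np

MAXLEN = 100     # assumed padded sequence length
REAL_TOKENS = 5  # actual sentence length

# True tags: 95 padding positions (index 0) followed by 5 real tags.
y_true = np.array([0] * (MAXLEN - REAL_TOKENS) + [3, 7, 2, 7, 1])

# A degenerate model that predicts "padding" at every position.
y_pred = np.zeros(MAXLEN, dtype=int)

# Every real tag is wrong, yet accuracy looks excellent.
accuracy = np.mean(y_true == y_pred)
print(f"accuracy: {accuracy:.0%}")  # prints "accuracy: 95%"
```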

aneesh-joshi commented 6 years ago

@cerulean331 Sorry for replying so late!

Unfortunately, yes. I wasn't aware of masking at the time of writing this. I know about it now. When I get time, I will make the changes and test the difference.

My understanding of the convention at the time was that the model would learn that 0 means "ignore this position," so the padding would effectively act as a no-op.

Feel free to set mask_zero=True on the embedding layer (I believe that's the right parameter; mask_value belongs to the separate Masking layer). I can't imagine it having any side effects. If you're able to test it and show an improvement, please make a PR. :)
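Something like this minimal sketch should be all that's needed (the vocabulary size, tag count, and dimensions are placeholders, not the repo's actual values):

```python
from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, TimeDistributed, Dense

VOCAB_SIZE = 10000  # assumed vocabulary size (0 reserved for padding)
EMBED_DIM = 100     # assumed embedding dimension
NUM_TAGS = 12       # assumed tag set size
MAXLEN = 100        # assumed padded sequence length

model = Sequential([
    # mask_zero=True tells downstream layers to skip timesteps whose
    # input index is 0, so padding no longer contributes to the loss
    # or to the reported accuracy.
    Embedding(VOCAB_SIZE, EMBED_DIM, input_length=MAXLEN, mask_zero=True),
    Bidirectional(LSTM(64, return_sequences=True)),
    TimeDistributed(Dense(NUM_TAGS, activation="softmax")),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```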

Also, please look at the other branch in the repo. I've made some changes since.