lspvic / CopyNet

CopyNet Implementation with Tensorflow and nmt

Question on parameters vocab_size and gen_vocab_size. #12

Open ybshen007 opened 5 years ago

ybshen007 commented 5 years ago

Thanks a lot for your work on CopyNet. I don't clearly understand the parameters vocab_size and gen_vocab_size. For example, suppose I have a vocabulary table containing 9,999 words plus a special token "UNK", so the size of the vocabulary table is 10,000. Now suppose I have a source sentence of 10 words, 5 of which are not in the vocabulary table. Does that mean vocab_size is 10005 (or 10010?) and gen_vocab_size is 10000? If so, when I use the CopyNetWrapper cell, should I compute the maximum length of the input sentences and use it to set vocab_size?
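To make the arithmetic in my question concrete, here is a minimal sketch of how I currently picture it. It assumes a per-sentence extended vocabulary in which each distinct out-of-vocabulary source word gets a temporary id placed after the regular vocabulary. I have not confirmed that this is what CopyNetWrapper expects. The helper `extend_ids` and the toy word names are hypothetical and are not part of this repository.

```python
def extend_ids(source_tokens, vocab):
    """Map tokens to ids; each distinct OOV word gets a temporary id
    appended after the regular vocabulary (an assumed scheme, not
    necessarily what CopyNetWrapper does)."""
    gen_vocab_size = len(vocab)
    oov_ids = {}
    ids = []
    for tok in source_tokens:
        if tok in vocab:
            ids.append(vocab[tok])
        else:
            if tok not in oov_ids:
                # The next free id past the end of the regular vocabulary.
                oov_ids[tok] = gen_vocab_size + len(oov_ids)
            ids.append(oov_ids[tok])
    extended_size = gen_vocab_size + len(oov_ids)
    return ids, gen_vocab_size, extended_size


# Toy vocabulary: "UNK" plus 9,999 placeholder words, 10,000 entries in total.
vocab = {"UNK": 0}
vocab.update({f"w{i}": i for i in range(1, 10000)})

# A 10-word source sentence in which 5 words are out of vocabulary.
source = ["w1", "w2", "w3", "w4", "w5",
          "oov_a", "oov_b", "oov_c", "oov_d", "oov_e"]

ids, gen_vocab_size, extended_size = extend_ids(source, vocab)
print(gen_vocab_size, extended_size)  # 10000 10005
print(ids[5:])                        # [10000, 10001, 10002, 10003, 10004]
```

Under this assumption the sentence above would need 10005 ids rather than 10010. If vocab_size has to be one fixed number for the whole dataset, my guess is that it would be gen_vocab_size plus the largest number of distinct out-of-vocabulary words in any single source sentence, which is at most the maximum sentence length.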

Thanks again Sam