Closed: datason closed this 4 years ago
Hi @datason! Sincere thanks for the nice and kind comment :) You're absolutely right, and I will fix the line you mentioned (or if you open a pull request, I will merge it). I think your comment makes this tutorial better.
Hi @lyeoni! You have written great tutorials, and I really appreciate your work. We can improve it a little with one neat line. Please take a look: here, we fill the first key-value items of stoi and itos with the special tokens. I suggest inserting this line before the loop:
special_tokens = filter(lambda x: x is not None, [self.unk_token, self.bos_token, self.eos_token, self.pad_token])
If we don't set a value for self.unk_token but do set one for self.bos_token, then the indices in the dictionary become wrong. So we need to filter out the None values first.
Input: vocab = Vocab(body, bos_token='<bos>'); vocab.build(); vocab.stoi
Wrong output: '<bos>': 1, ' ': 1, 'hi': 2, 'bear': 3, ...
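To make the index collision concrete, here is a minimal sketch that reproduces the wrong output above and shows the effect of filtering. It is not the tutorial's actual Vocab class: the function build_stoi, its drop_none flag, and the sample word list are hypothetical stand-ins, and the sketch assumes that special tokens are indexed by their position in the list while regular tokens continue from the current dictionary size.

```python
def build_stoi(words, special_tokens, drop_none):
    """Hypothetical stand-in for the tutorial's Vocab.build()."""
    if drop_none:
        # The suggested fix: remove unset (None) tokens before indexing.
        special_tokens = [t for t in special_tokens if t is not None]

    stoi = {}
    # Special tokens are indexed by their position in the list.
    for idx, token in enumerate(special_tokens):
        if token is not None:
            stoi[token] = idx

    # Regular tokens continue from the current dictionary size.
    for word in words:
        if word not in stoi:
            stoi[word] = len(stoi)
    return stoi


words = [' ', 'hi', 'bear']
# Order: unk, bos, eos, pad. Only bos_token is set.
specials = [None, '<bos>', None, None]

buggy = build_stoi(words, specials, drop_none=False)
fixed = build_stoi(words, specials, drop_none=True)

print(buggy)  # {'<bos>': 1, ' ': 1, 'hi': 2, 'bear': 3}
print(fixed)  # {'<bos>': 0, ' ': 1, 'hi': 2, 'bear': 3}
```

Without filtering, '<bos>' keeps position 1 from the unfiltered list while the first regular token also lands on index 1, so two keys share one index. With the None values removed first, every token gets a unique, contiguous index.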