graykode / nlp-tutorial

Natural Language Processing Tutorial for Deep Learning Researchers
https://www.reddit.com/r/MachineLearning/comments/amfinl/project_nlptutoral_repository_who_is_studying/
MIT License
14.03k stars, 3.9k forks

3-3.Bi-LSTM may have wrong padding #72

Open ETWBC opened 2 years ago

ETWBC commented 2 years ago

In line 16 you pad with input = input + [0] * (max_len - len(input)). The padding value is 0, which is also the index of the first word, 'Lorem', so the padded positions look like a real word to the model. I don't think that is the right choice. I think you can change it like this:

    # word_dict = {w: i for i, w in enumerate(list(set(sentence.split())))}
    # number_dict = {i: w for i, w in enumerate(list(set(sentence.split())))}
    word_dict = {w: i for i, w in enumerate(['PAD']+list(set(sentence.split())))}
    number_dict = {i: w for i, w in enumerate(['PAD']+list(set(sentence.split())))}
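
To show how the padding line would then use the reserved index, here is a minimal sketch. The sentence is a placeholder rather than the tutorial's actual text, and the helper pad_ids, the vocab list, and the use of sorted() are my own assumptions, not code from the repository. It builds the vocabulary once so that both dictionaries share one ordering, and it pads with the 'PAD' index instead of a literal 0.

```python
# Placeholder sentence (assumption); the tutorial uses its own text.
sentence = "Lorem ipsum dolor sit amet consectetur"

# Build the vocabulary once so word_dict and number_dict agree.
# sorted() is an assumption added here to make indices reproducible.
vocab = ['PAD'] + sorted(set(sentence.split()))
word_dict = {w: i for i, w in enumerate(vocab)}
number_dict = {i: w for i, w in enumerate(vocab)}


def pad_ids(words, max_len, pad_id=word_dict['PAD']):
    """Hypothetical helper: map words to ids and right-pad with pad_id."""
    ids = [word_dict[w] for w in words]
    return ids + [pad_id] * (max_len - len(ids))


# Pad a two-word input to length 5; the last three ids are the PAD index.
padded = pad_ids(sentence.split()[:2], max_len=5)
print(padded, [number_dict[i] for i in padded])
```

With this layout index 0 belongs only to 'PAD', so no real word shares an id with the padding.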