harvardnlp / seq2seq-attn

Sequence-to-sequence model with LSTM encoder/decoders and attention
http://nlp.seas.harvard.edu/code
MIT License

Tiny bug in preprocess.py #75

Closed helson73 closed 7 years ago

helson73 commented 7 years ago

In "preprocess.py", line 331 l_location.append(len(sources)) should be revised to l_location.append(len(sources)+1) otherwise last sentence would be missing. l_location's element always follows base index of 1 (lua), and the last element of l_location should point to len(sources), which is len(source) + 1 in lua.

yoonkim commented 7 years ago

thanks! this has been fixed

helson73 commented 7 years ago

@yoonkim BTW, in preprocess-shards.py, line 201 has the same problem :)

yoonkim commented 7 years ago

of course :). fixed