Open luhua-rain opened 5 years ago
You need to be familiar with TorchText to understand the syntax. In model/data.py, we build each batch so that every context is parsed into a sequence of words and its length is recorded alongside it (the same preprocessing applies to the sequence of bags of characters, c_char). This is specified in the definition of the word field (model/data.py, line 31):
self.WORD = data.Field(batch_first=True, tokenize=word_tokenize, lower=True, include_lengths=True)
Note that include_lengths is set to True, so the field returns the length of each input word sequence together with the padded batch.
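To make the effect of include_lengths=True concrete, here is a rough plain-Python sketch of the kind of output it produces: a padded batch plus the original length of each sequence. This is not TorchText's actual implementation. The helper name pad_with_lengths and the sample sentences are made up for illustration, and the real field returns tensors of vocabulary indices rather than lists of strings.

```python
def pad_with_lengths(batch, pad_token="<pad>"):
    """Pad token lists to a common length and return the original lengths.

    Hypothetical helper: it only mimics the (padded batch, lengths) pair
    that a field with include_lengths=True hands back.
    """
    # Record each sequence's true length before any padding is added.
    lengths = [len(tokens) for tokens in batch]
    max_len = max(lengths)
    # Right-pad every sequence up to the longest one in the batch.
    padded = [tokens + [pad_token] * (max_len - len(tokens)) for tokens in batch]
    return padded, lengths


# Two made-up, already tokenized and lowercased contexts.
batch = [
    ["the", "cat", "sat"],
    ["hello"],
]
padded, lengths = pad_with_lengths(batch)
```

With the lengths available, downstream code can tell real tokens from padding, for example when packing sequences for an LSTM or masking attention scores.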
Thanks for your answer! Your model trained well on my data. But when I predict with batch_size=1, I get this error: start (400) + length (1) exceeds dimension size (400). With batch_size > 1 it runs fine. How can I solve this? I need to predict with batch_size=1.
Thanks for your code! I would like to know what they mean.