galsang / BiDAF-pytorch

Re-implementation of BiDAF(Bidirectional Attention Flow for Machine Comprehension, Minjoon Seo et al., ICLR 2017) on PyTorch.

what are batch.c_word[1] and batch.c_word[0] ? #12

Open luhua-rain opened 5 years ago

luhua-rain commented 5 years ago

Thanks for your code! I'd like to know what they mean.

galsang commented 5 years ago

You need to be familiar with TorchText to understand the syntax. In model/data.py, we build each batch so that every context is parsed into a sequence of words and its length is counted (the same preprocessing applies to c_char, the sequence of bags of characters). This is specified in the definition of the word field (model/data.py, line 31):

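self.WORD = data.Field(batch_first=True, tokenize=word_tokenize, lower=True, include_lengths=True)

Note that include_lengths is set to True so that the length of each input word sequence is included. As a result, batch.c_word is a tuple: batch.c_word[0] is the padded tensor of word indices and batch.c_word[1] holds the length of each sequence.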
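To make the tuple layout concrete, here is a minimal pure-Python sketch of what a field with include_lengths=True hands back for a batch. The helper pad_with_lengths and the "<pad>" token are hypothetical stand-ins for illustration only; the real TorchText Field also maps tokens to vocabulary indices and returns tensors rather than lists.

```python
# Hypothetical stand-in for a TorchText Field with include_lengths=True.
# It only mimics the (padded batch, lengths) tuple; the real Field also
# numericalizes tokens and returns tensors instead of Python lists.
def pad_with_lengths(sequences, pad_token="<pad>"):
    # Record the true length of every sequence before padding.
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths)
    # Right-pad each sequence to the longest one in the batch.
    padded = [seq + [pad_token] * (max_len - len(seq)) for seq in sequences]
    return padded, lengths


contexts = [["the", "cat", "sat"], ["hello", "world"]]
c_word = pad_with_lengths(contexts)

print(c_word[0])  # padded word sequences, analogous to batch.c_word[0]
print(c_word[1])  # per-sequence lengths, analogous to batch.c_word[1]
```

The lengths are kept next to the padded batch so that later layers can tell real tokens from padding.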

luhua-rain commented 5 years ago

Thanks for your answer! Your model trained well on my data. But when I predict with batch_size=1, I get this error: start (400) + length (1) exceeds dimension size (400). With batch_size > 1 it runs fine. How can I solve this? I need to predict with batch_size=1.