bentrevett / pytorch-sentiment-analysis

Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
MIT License

invalid argument 5: kernel size should be greater than zero, but got kH: #36

Closed HeoYoon closed 5 years ago

HeoYoon commented 5 years ago

Hi, I'm new to PyTorch. I followed your code and tried to build a Korean version of the sentiment analysis, but I got this error.

RuntimeError Traceback (most recent call last)

in
      8
      9     train_loss, train_acc = train(model, train_iterator, optimizer, criterion,)
---> 10     valid_loss, valid_acc = evaluate(model, valid_iterator, criterion)
     11
     12     end_time = time.time()

in evaluate(model, iterator, criterion)
     10         for batch in iterator:
     11
---> 12             predictions = model(batch.text).squeeze(1)
     13
     14             loss = criterion(predictions, batch.label)

/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

in forward(self, text)
     21         #embedded = [batch size, sent len, emb dim]
     22
---> 23         pooled = F.avg_pool2d(embedded, (embedded.shape[1], 1)).squeeze(1)
     24         #pooled = [batch size, embedding_dim]
     25

RuntimeError: invalid argument 5: kernel size should be greater than zero, but got kH: 0 kW: 1 at /Users/soumith/mc3build/conda-bld/pytorch_1549312653646/work/aten/src/THNN/generic/SpatialAveragePooling.c:14

I changed the code a little to make it work with Korean text, like this:

    from soynlp.tokenizer import MaxScoreTokenizer
    from soynlp.normalizer import *
    from konlpy.tag import Okt
    import re

    def tokenizer(text): # create a tokenizer function
        okt = Okt()
        review_text = re.sub("[^가-힣ㄱ-ㅎㅏ-ㅣ\\s]", "", text)
        x = okt.morphs(review_text , stem= True)
        return x

    def generate_bigrams(x):
        n_grams = set(zip(*[x[i:] for i in range(2)]))
        for n_gram in n_grams:
            x.append(' '.join(n_gram))
        return x

    TEXT = data.Field(tokenize = tokenizer, preprocessing = generate_bigrams, stop_words = stop_words)
    LABEL = data.LabelField(dtype = torch.float)

I think the problem is in validation, because when I commented out the validation code (put # in front of it) and ran it again, it worked well.

bentrevett commented 5 years ago

I don't think I have seen this error before, but it might be caused by the tensor going into the average pooling layer having an unexpected shape.

Can you add print(embedded.shape) before the pooled = F.avg_pool2d(embedded, (embedded.shape[1], 1)).squeeze(1) line and make sure the tensor is the correct size? It should be [batch size, sentence length, embedding dim].
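
In the traceback above, the kernel height kH is embedded.shape[1], the padded sentence length of the batch, so "kH: 0" means a batch reached the model with no tokens at all. One possible way this can happen with the posted tokenizer, offered as an assumption rather than a confirmed diagnosis, is that the regex removes every character of a review that contains no Hangul. The sketch below uses only Python's re module with the same pattern; stand_in_tokenizer and padded_sentence_length are hypothetical helpers, and whitespace splitting stands in for Okt().morphs.

```python
import re

# Same character filter as in the posted tokenizer: keep Hangul and whitespace only.
KOREAN_ONLY = re.compile("[^가-힣ㄱ-ㅎㅏ-ㅣ\\s]")

def stand_in_tokenizer(text):
    """Hypothetical stand-in: whitespace split replaces Okt().morphs."""
    return KOREAN_ONLY.sub("", text).split()

def padded_sentence_length(batch_tokens):
    """Length a batch would be padded to: its longest token list (0 if all are empty)."""
    return max((len(tokens) for tokens in batch_tokens), default=0)

reviews = ["정말 재미있는 영화", "10/10 !!!", "GOOD"]
tokens = [stand_in_tokenizer(review) for review in reviews]

for review, toks in zip(reviews, tokens):
    print(repr(review), "->", toks)

# A batch made only of the last two reviews has sentence length 0,
# which is the value that would be passed to avg_pool2d as kH.
empty_batch_len = padded_sentence_length(tokens[1:])
print("padded length of all-empty batch:", empty_batch_len)
```

If print(embedded.shape) shows a 0 in the sentence-length position for some validation batch, empty token lists like these would be one thing to check for.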

HeoYoon commented 5 years ago

Thank you!!!!!!! I solved it^^
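
The thread does not record what the fix was. If the cause was reviews that tokenize to nothing (an assumption, not confirmed above), one common guard is to drop such examples before building the iterators. A minimal sketch with plain Python tuples and a hypothetical drop_empty_examples helper, not the poster's actual solution:

```python
def drop_empty_examples(examples):
    """Keep only (tokens, label) pairs whose token list is non-empty."""
    return [(tokens, label) for tokens, label in examples if len(tokens) > 0]

# Illustrative data only: the second review lost all its tokens during cleaning.
examples = [
    (["정말", "재미있는", "영화"], 1.0),
    ([], 0.0),
    (["별로"], 0.0),
]

kept = drop_empty_examples(examples)
print(len(examples), "->", len(kept))
```

The legacy torchtext dataset classes accept a filter_pred argument that can apply this kind of predicate at load time, though whether that fits depends on the torchtext version in use.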