Hi! Your code has been a great help to me. Thank you so much.

While experimenting with the CNN test data from your repo, I compared the results against a TF-IDF + LR baseline I wrote and found that not all of the test data is used during testing: batches whose size does not match `batch_size` are skipped. I wrote a version that keeps them:

```python
import torch

# model, val_iter, batch_size and device are defined elsewhere in the script.
with torch.no_grad():
    for idx, batch in enumerate(val_iter):
        text = batch.Text[0]
        # The batch size is passed into the model, so this variable
        # keeps the actual data size for this iteration.
        pass_batch_size = None
        # Compare by value (!=); `is not` tests identity and is unreliable for ints.
        if text.size()[0] != batch_size:
            pass_batch_size = len(text)  # actual size of this batch
        target = batch.Label
        target = torch.autograd.Variable(target).long()
        if torch.cuda.is_available():
            text = text.cuda()
            # target = target.cuda()
        text = text.to(device)
        target = target.to(device)
        prediction = model(text, pass_batch_size)  # pass the batch size into the model
```

In LSTM.py, the model initializes `h_0` and `c_0` using the batch size. I think that's why you defined the parameter `batch_size=None` in `forward()`.
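To make the effect concrete, here is a minimal sketch in plain Python (no PyTorch) of how many samples reach the model when batches smaller than the batch size are skipped and when they are kept. The helper `count_evaluated` and the sizes used are hypothetical, chosen only for illustration. They are not taken from the repo.

```python
def count_evaluated(num_samples, batch_size, skip_partial):
    """Count samples that reach the model when iterating in fixed-size batches.

    Hypothetical helper for illustration; not part of the repo.
    """
    evaluated = 0
    for start in range(0, num_samples, batch_size):
        current = min(batch_size, num_samples - start)  # size of this batch
        if skip_partial and current != batch_size:
            continue  # a batch smaller than batch_size is dropped
        evaluated += current
    return evaluated


# Hypothetical sizes: 1000 test samples, batch size 32.
skipped = count_evaluated(1000, 32, skip_partial=True)
kept = count_evaluated(1000, 32, skip_partial=False)
print(skipped, kept)
```

With these assumed sizes the last batch holds only 8 samples, so skipping it means the reported accuracy is computed on 992 of the 1000 samples. When the dataset size is an exact multiple of the batch size, the two counts agree.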

prakashpandey9 / Text-Classification-Pytorch

Data not matching the batch size are discarded #17
