`THIndexTensor_(size)(target, 0) == batch_size' failed. at d:\projects\pytorch\torch\lib\thnn\generic/ClassNLLCriterion.c:54

Rajat-Mehta commented 7 years ago

I am trying to train my neural network on a dog breeds dataset. After the forward pass, the loss computation throws this error: `THIndexTensor_(size)(target, 0) == batch_size' failed. at d:\projects\pytorch\torch\lib\thnn\generic/ClassNLLCriterion.c:54

    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(net.parameters(), lr=0.001)

    for epoch in range(10):  # loop over the dataset multiple times
        running_loss = 0.0
        print(len(trainloader))
        for i, data in enumerate(trainloader, 0):
            # get the inputs
            inputs, labels = data

            # wrap them in Variable
            inputs, labels = Variable(inputs).float(), Variable(labels).float().type(torch.LongTensor)

            # zero the parameter gradients
            optimizer.zero_grad()

            # forward + backward + optimize
            outputs = net(inputs)

            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            # print statistics
            running_loss += loss.data[0]
            if i % 2000 == 1999:    # print every 2000 mini-batches
                print('[%d, %5d] loss: %.3f' %
                      (epoch + 1, i + 1, running_loss / 2000))
                running_loss = 0.0

The error is raised on this line:

    loss = criterion(outputs, labels)

What's the issue?

akurniawan commented 6 years ago

Could you provide the sizes of outputs and labels? Based on the error, it looks like the first dimension of the target is not equal to the first dimension of the prediction. Below is the code that raises the error you're experiencing:

    int batch_size = THTensor_(size)(input, 0);
    THAssert(THIndexTensor_(size)(target, 0) == batch_size);
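To find out which side is off before the loss is computed, the same check can be done in Python. This is only a minimal sketch: `check_batch_dims` is a hypothetical helper, not part of PyTorch, and the shapes used here are made up for illustration.

```python
def check_batch_dims(output_shape, target_shape):
    """Mirror the C assertion: the target's first dimension must equal
    the first dimension (batch size) of the model output."""
    batch_size = output_shape[0]
    if target_shape[0] != batch_size:
        raise ValueError(
            "batch size mismatch: outputs have %d rows, targets have %d"
            % (batch_size, target_shape[0])
        )
    return batch_size


# Hypothetical shapes: 32 samples with 120 class scores each.
print(check_batch_dims((32, 120), (32,)))  # first dimensions match

try:
    check_batch_dims((32, 120), (16,))  # first dimensions differ
except ValueError as err:
    print(err)
```

In the training loop from the question, this would be called as `check_batch_dims(outputs.shape, labels.shape)` right before `criterion(outputs, labels)`, since a tensor's `shape` behaves like a tuple.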

mttk commented 6 years ago

@jekbradbury @akurniawan is this possibly due to the fact that the iterator's last batch returns a non-full batch size by default (examples % batch_size != 0)?

I suspect this is the problem because of this quote from #175:

I know that there are 1026 data points in the validation dataset, and if I manually set the batch size as 6, this error doesn't pop up.

Since 1026 % 6 = 0

@Rajat-Mehta can you try setting your batch size accordingly and see if you can reproduce the error?
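To see which batch sizes leave a smaller final batch, the arithmetic can be sketched in plain Python. `batch_sizes` is a hypothetical helper written for this comment. Its `drop_last` argument is modelled on the flag of the same name in `torch.utils.data.DataLoader`; whether the iterator in the original code offers an equivalent is an assumption that would need checking.

```python
def batch_sizes(num_examples, batch_size, drop_last=False):
    """Return the size of every batch an iterator would yield."""
    full, remainder = divmod(num_examples, batch_size)
    sizes = [batch_size] * full
    # A non-zero remainder becomes a smaller final batch unless dropped.
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


# 1026 examples split evenly into batches of 6 ...
print(set(batch_sizes(1026, 6)))
# ... but not into batches of 8: the last batch is smaller.
print(batch_sizes(1026, 8)[-1])
# Dropping the incomplete batch leaves only full batches.
print(set(batch_sizes(1026, 8, drop_last=True)))
```

If the error only appears on the final iteration of an epoch, that would support this hypothesis; if it appears on the first batch, the cause is more likely a shape problem in the model output or the labels.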

rian-dolphin commented 6 years ago

I had the same problem when trying to do simple regression in PyTorch, and found that the error went away when I switched my criterion from nn.CrossEntropyLoss() to nn.MSELoss().

The reason was that my network had only one output, i.e. the model output had size torch.Size([n, 1]), while the targets/labels I was comparing against had size torch.Size([n]). nn.CrossEntropyLoss() does not support this for regression: as the assertion in the error below shows, every target value must be a valid class index, at least 0 and strictly less than the number of class outputs from the model (it can't be equal to it).

The error code I got was: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /Users/soumith/minicondabuild3/conda-bld/pytorch_1524590658547/work/aten/src/THNN/generic/ClassNLLCriterion.c:97
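The condition in that assertion can be checked on the labels before they reach the criterion. The following is a rough sketch only; `invalid_targets` is a hypothetical helper, and the label values are invented for illustration.

```python
def invalid_targets(targets, n_classes):
    """Return the targets that violate `0 <= target < n_classes`,
    the condition asserted in ClassNLLCriterion.c."""
    return [t for t in targets if not (0 <= t < n_classes)]


# With three class outputs, labels 0, 1 and 2 are all valid.
print(invalid_targets([0, 1, 2], n_classes=3))
# With a single output, only label 0 passes; anything else fails.
print(invalid_targets([0, 1, 0, 2], n_classes=1))
```

With a tensor of integer labels, something like `invalid_targets(labels.tolist(), outputs.shape[1])` should give the same information, assuming `outputs` has one column per class.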

cpuhrsch commented 5 years ago

@Rajat-Mehta - was this resolved, or is it still relevant?

pytorch / text

`THIndexTensor_(size)(target, 0) == batch_size' failed. at d:\projects\pytorch\torch\lib\thnn\generic/ClassNLLCriterion.c:54 #186