zoogzog / chexnet

Implementation of the CheXNet network (PyTorch)

CUDA Memory Error #19

Open yan-michael opened 5 years ago

yan-michael commented 5 years ago

I am getting an error: RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 1; 11.17 GiB total capacity; 10.68 GiB already allocated; 64.00 KiB free; 188.49 MiB cached)

The error occurs in the epochVal method. I have tried changing the batch size, but that did not solve it. Please help.

dgrechka commented 5 years ago

I have the same problem.

dgrechka commented 5 years ago

I suspect you are getting the error because you are running the code on a PyTorch version newer than 0.3.1. In that case you need to update the code to wrap the validation pass in a "with torch.no_grad():" block instead of using the volatile=True flags. That solved the issue for me.
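A minimal sketch of what such a validation loop can look like on PyTorch 0.4 or newer. This is not the repository's code: the names epoch_val, data_loader and loss_fn, the toy model, and the random batches are all hypothetical and only there to make the example runnable. It falls back to the CPU when no GPU is available.

```python
import torch
import torch.nn as nn


def epoch_val(model, data_loader, loss_fn, device):
    """Return the mean validation loss as a Python float (hypothetical helper)."""
    model.eval()
    loss_sum = 0.0
    batch_count = 0

    # torch.no_grad() takes over the role of the old volatile=True flag:
    # no autograd graph is recorded, so activations are not kept for backward.
    with torch.no_grad():
        for inputs, targets in data_loader:
            inputs = inputs.to(device)
            targets = targets.to(device)

            outputs = model(inputs)

            # .item() converts the 0-dim loss tensor to a plain float,
            # so the running sum holds no tensor references.
            loss_sum += loss_fn(outputs, targets).item()
            batch_count += 1

    return loss_sum / batch_count


# Toy stand-ins (assumptions): a tiny model with 14 outputs and random batches.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Sequential(nn.Linear(8, 14), nn.Sigmoid()).to(device)
batches = [(torch.rand(4, 8), torch.rand(4, 14)) for _ in range(3)]

mean_loss = epoch_val(model, batches, nn.BCELoss(), device)
print(mean_loss)
```

With the forward pass inside the no_grad block, the intermediate activations are released as soon as each layer finishes, which is usually what brings validation memory back down.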

searobbersduck commented 4 years ago

I am getting an error: RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 1; 11.17 GiB total capacity; 10.68 GiB already allocated; 64.00 KiB free; 188.49 MiB cached)

The error occurs in the epochVal method. I have tried changing the batch size, but that did not solve it. Please help.

I edited the epochVal function like this:

def epochVal(model, dataLoader, optimizer, scheduler, epochMax, classCount, loss):

    model.eval()

    lossVal = 0
    lossValNorm = 0

    losstensorMean = 0

    for i, (input, target) in enumerate(dataLoader):
        # target = target.cuda(async=True)

        # varInput = torch.autograd.Variable(input, volatile=True)
        # varTarget = torch.autograd.Variable(target, volatile=True)

        varOutput = model(torch.autograd.Variable(input.cuda()))
        varTarget = torch.autograd.Variable(target.cuda())

        losstensor = loss(varOutput, varTarget)
        # print(losstensor)
        # losstensorMean += losstensor
        losstensorMean += losstensor.data

        # print(losstensor.data)
        # lossVal += losstensor.data[0]
        lossVal += losstensor.data
        lossValNorm += 1

    outLoss = lossVal / lossValNorm
    losstensorMean = losstensorMean / lossValNorm

    return outLoss, losstensorMean
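As a small illustration of why the function accumulates losstensor.data rather than losstensor itself: a running sum built from the raw loss stays attached to the autograd graph, while .data (or .item(), which newer PyTorch versions offer for 0-dim tensors) does not. The sketch below runs on the CPU; the tiny model and the variable names are hypothetical and not taken from the repository.

```python
import torch
import torch.nn as nn

# Toy stand-ins (assumptions): a tiny linear model and one random batch.
model = nn.Linear(4, 1)
loss_fn = nn.MSELoss()
x, y = torch.rand(2, 4), torch.rand(2, 1)

loss = loss_fn(model(x), y)  # 0-dim tensor attached to the autograd graph

running_tensor = 0 + loss          # still tracks gradients, keeps the graph alive
running_data = 0 + loss.data       # detached tensor, no graph reference
running_float = 0.0 + loss.item()  # plain Python float

print(running_tensor.requires_grad)  # True
print(running_data.requires_grad)    # False
print(type(running_float))
```

If the caller only needs a number, .item() is likely the simpler choice, since the old .data[0] indexing is no longer valid for 0-dim tensors on recent PyTorch versions.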

Hope this helps!

I am currently organizing datasets and solutions related to chest X-ray radiographs, and I hope to build some product-level applications from them. My project will be open-sourced at searobbersduck/XRaySolution