pyaf / DenseNet-MURA-PyTorch

Implementation of a DenseNet model on Stanford's MURA dataset using PyTorch
https://medium.com/@pyaf/implementing-densenet-on-mura-using-pytorch-f39e92566815
MIT License

Batch_size problem #8

Closed Zzmonica closed 6 years ago

Zzmonica commented 6 years ago

When I change the batch_size to a number bigger than 1, the program raises an error at the enumerate call in train.py:train_model: RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3 and 2 in dimension 1

Microrpatorgui commented 6 years ago

Hello, I have an issue when I change the batch_size too:

```
... 1638 1639 1640 1641 1642 1643
Traceback (most recent call last):
  File "/export/livia/home/vision/descampsg/.pycharm_helpers/pydev/pydev_run_in_console.py", line 52, in run_file
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/export/livia/home/vision/descampsg/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/export/livia/home/vision/descampsg/foldertodeploy/main.py", line 51, in <module>
    model = train_model(model, criterion, optimizer, dataloaders, scheduler, dataset_sizes, num_epochs=5)
  File "/export/livia/home/vision/descampsg/foldertodeploy/train.py", line 30, in train_model
    for i, data in enumerate(dataloaders[phase]):
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 336, in __next__
    return self._process_next_batch(batch)
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
OSError: Traceback (most recent call last):
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/PIL/ImageFile.py", line 219, in load
    s = read(self.decodermaxblock)
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/PIL/PngImagePlugin.py", line 620, in load_read
    cid, pos, length = self.png.read()
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/PIL/PngImagePlugin.py", line 115, in read
    length = i32(s)
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/PIL/_binary.py", line 77, in i32be
    return unpack_from(">I", c, o)[0]
struct.error: unpack_from requires a buffer of at least 4 bytes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/export/livia/home/vision/descampsg/foldertodeploy/pipeline.py", line 54, in __getitem__
    image = pil_loader(study_path + 'image%s.png' % (i+1))
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/datasets/folder.py", line 130, in pil_loader
    return img.convert('RGB')
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/PIL/Image.py", line 892, in convert
    self.load()
  File "/export/livia/home/vision/descampsg/anaconda3/envs/py36densenet169/lib/python3.6/site-packages/PIL/ImageFile.py", line 224, in load
    raise IOError("image file is truncated")
OSError: image file is truncated
```

The batch_size was 10. I have the same problem with 50 or 500. Thanks for the help.
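Note that the final `OSError: image file is truncated` in this traceback is a PIL error, not a batching error: one of the PNG files on disk is incomplete. A common workaround (a sketch; re-downloading the corrupt files is the safer fix, since truncated images may decode with garbage pixels) is to tell PIL to tolerate truncated files:

```python
from PIL import ImageFile

# Let PIL load images whose files are truncated instead of raising
# "OSError: image file is truncated". Missing bytes are filled with grey,
# so the decoded image may be partially corrupt.
ImageFile.LOAD_TRUNCATED_IMAGES = True
```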

Microrpatorgui commented 6 years ago

Zzmonica did you find a solution for your case ?

pyaf commented 6 years ago

Hi guys, sorry for the late reply. As of now, the dataloader takes one study at a time. Since each study may have a varying number of images, we can't use a batch size > 1.
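This is why the default collate fails: it tries to `torch.stack` per-study tensors of shape `(n_images, 3, H, W)` whose first dimensions differ. One way to keep `batch_size > 1` anyway would be a custom `collate_fn` that returns the studies as a list instead of stacking them. A minimal sketch (the `"images"`/`"label"` dict keys are assumptions about what the pipeline's `__getitem__` returns):

```python
import torch

def study_collate(batch):
    """Collate studies with varying image counts.

    Each item is assumed to be {"images": (n_i, 3, H, W) tensor, "label": int}.
    Instead of stacking (which needs equal n_i), keep the per-study tensors
    in a Python list and only stack the scalar labels.
    """
    images = [item["images"] for item in batch]          # list of variable-length tensors
    labels = torch.tensor([item["label"] for item in batch])
    return images, labels
```

The training loop would then iterate over the list and aggregate per-study predictions itself, e.g. by averaging image-level probabilities within each study.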

Microrpatorgui commented 6 years ago

Thank you for your answer, I thought that was why I couldn't reach your results. Currently I get an accuracy of 54% on the wrist validation set instead of 75%. I don't know why I have this issue. Any idea?

Zzmonica commented 6 years ago

So do you ensemble 7 trained models for evaluation, given that each training run uses just one study type? If not, how can you evaluate the validation set with only one model?

pyaf commented 6 years ago

Yeah, an ensemble is one option. Otherwise you can modify the data pipeline to do image-level training (instead of study-level).

Zzmonica commented 6 years ago

Yeah, I'm going to do that ~

Zzmonica commented 6 years ago

Microrpatorgui, have you improved the accuracy?

Microrpatorgui commented 6 years ago

Yes, by taking images one by one instead of study by study, and using a batch size of 8.

I also trained the network on the entire database.

Zzmonica commented 6 years ago

I did that too. So what's your accuracy on the validation set? Did you change anything else?

Microrpatorgui commented 6 years ago

Yes, 85% on the wrist validation set.

Zzmonica commented 6 years ago

Wow, so good! So you only validate on the wrist set? Did you change anything? My performance was awful..

Microrpatorgui commented 6 years ago

Yes, just for the wrist. I used BCELoss.

Zzmonica commented 6 years ago

Did you weight the BCELoss by the corresponding study type's normal and abnormal proportions?
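For context, the MURA paper proposes exactly this: a class-weighted binary cross-entropy where abnormal (positive) examples are weighted by the fraction of normal studies and vice versa, to counter class imbalance. A minimal sketch (the function name and the epsilon for numerical stability are my additions):

```python
import torch

def weighted_bce(output, target, n_normal, n_abnormal):
    """Class-weighted binary cross-entropy in the style of the MURA paper.

    output: sigmoid probabilities in (0, 1); target: 0/1 labels.
    Positives are weighted by |normal| / total, negatives by |abnormal| / total.
    """
    total = n_normal + n_abnormal
    w_pos = n_normal / total
    w_neg = n_abnormal / total
    eps = 1e-7  # avoid log(0)
    loss = -(w_pos * target * torch.log(output + eps)
             + w_neg * (1 - target) * torch.log(1 - output + eps))
    return loss.mean()
```

With a balanced dataset (`n_normal == n_abnormal`) both weights are 0.5 and this reduces to half the ordinary BCE.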

Zzmonica commented 6 years ago

@Microrpatorgui Got it! Thanks!